Stay on the call. Wait for the green light.
TL;DR
DTMF masking intercepts the keypad tones a customer presses when entering card details over the phone, replacing them with a flat sound before they reach your agent or your call recording. Card data goes straight from the customer's handset to Paytia and on to your payment gateway, so it never touches your network. PCI DSS scope drops from SAQ D (329 controls) to SAQ A (22), and most of our customers go live in under a week.
Last updated: 27 May 2026
[Diagram: the customer keys the card on their own phone, generating real DTMF tones. Paytia captures the real tones, suppresses them, and sends the card to your gateway; card data stays here and never reaches you. The agent stays on the line, hears flat tones, and picks up afterwards, with no access to card data.]

Every keypress on a phone produces a pair of audio frequencies, one low and one high. That's where the “dual-tone multi-frequency” name comes from. Press “5” and the handset emits 770 Hz alongside 1336 Hz. The low group runs 697, 770, 852 and 941 Hz; the high group runs 1209, 1336, 1477 and 1633 Hz. Every key on the pad maps to a unique pair, and the two-tone design exists because a single tone could be imitated by a human voice or background noise, whereas a precise simultaneous pair is far harder to produce by accident. Telephone networks have decoded those tones into digits since the 1960s, which is why a card number can travel over a phone call at all. It's also why the agent can hear it, why the call recorder captures it, and why a sticky note left by the desk three hours later is still a card-data leak waiting to be found at audit.
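The frequency grid described above can be written down directly. The following is a minimal Python sketch, not anything taken from Paytia's platform: it builds the key-to-pair table from the two frequency groups and synthesises the waveform for one keypress. The function name `dtmf_samples`, the 8 kHz sample rate, the 50 ms duration and the 0.5 amplitudes are all illustrative assumptions.

```python
import math

LOW_HZ = (697, 770, 852, 941)        # row frequencies
HIGH_HZ = (1209, 1336, 1477, 1633)   # column frequencies
KEYPAD = ("123A", "456B", "789C", "*0#D")  # the 1633 Hz column carries A-D

# Map each key to its unique (low, high) frequency pair.
DTMF_PAIRS = {
    key: (LOW_HZ[row], HIGH_HZ[col])
    for row, keys in enumerate(KEYPAD)
    for col, key in enumerate(keys)
}


def dtmf_samples(key, duration_s=0.05, sample_rate=8000):
    """Return float samples in [-1, 1] for one keypress (assumed parameters)."""
    low, high = DTMF_PAIRS[key]
    count = round(duration_s * sample_rate)
    return [
        0.5 * math.sin(2 * math.pi * low * n / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * n / sample_rate)
        for n in range(count)
    ]


print(DTMF_PAIRS["5"])  # the pair quoted in the text for the "5" key
```

Because every key is just the sum of two sine waves at known frequencies, anything that can hear or record the audio can in principle recover the digit, which is the exposure the rest of this page is about.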
We sit in the audio path between the customer and your agent. When the agent decides it's time to take payment, they enter the amount in our agent terminal and press one key — usually 729 — to start the capture. The customer hears a short voice prompt asking them to enter their card on their handset. From that moment, every DTMF tone the customer generates gets caught by our platform, decoded into the actual digit, and sent over an encrypted channel to your payment gateway. The agent's leg of the call gets a flat replacement tone instead of the real one — a soft, identical-sounding chirp that tells them “something was pressed” but reveals nothing about which digit. The recording layer captures that same flat tone. So does any QA platform that's pulling audio from the trunk.
The voice channel stays open the whole time. If the customer asks “is this the right card?” or “what was the total again?” the agent answers normally — they just can't hear the digits being entered. The agent screen shows a masked indicator — **** **** **** 1234 once the full PAN, expiry and CVV are in — so they can see the capture is progressing without seeing the numbers themselves. When the gateway returns an authorisation, the agent sees the result on screen (paid, declined, or retry) and picks the conversation back up: “That's gone through, thanks — I'll email your receipt now.” The whole capture takes 20 to 30 seconds for most cards. There's no transfer to an IVR, no “please hold while I connect you to our payment system” — the call doesn't go anywhere.
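As a rough illustration of the masked indicator, the sketch below models the only state an agent screen would need: which fields are complete, plus the last four digits once capture has finished. The class name `CaptureStatus`, its field names and the "Capturing" wording are hypothetical; the one detail taken from the description above is the `**** **** **** 1234` format shown once the PAN, expiry and CVV are all in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaptureStatus:
    """Progress flags visible to the agent screen (illustrative field names)."""
    pan_done: bool = False
    expiry_done: bool = False
    cvv_done: bool = False
    # Assumed to be supplied by the capture platform; the full PAN is never held here.
    last_four: Optional[str] = None


def render_indicator(status: CaptureStatus) -> str:
    """Return the text shown to the agent without exposing any entered digit."""
    steps = [
        ("card number", status.pan_done),
        ("expiry", status.expiry_done),
        ("CVV", status.cvv_done),
    ]
    pending = [name for name, done in steps if not done]
    if pending:
        return "Capturing: waiting for " + ", ".join(pending)
    if status.last_four is None:
        return "Capturing: finalising"
    return f"**** **** **** {status.last_four}"


print(render_indicator(CaptureStatus(True, False, False)))
print(render_indicator(CaptureStatus(True, True, True, "1234")))
```

The design point is that the object passed to the agent side contains nothing sensitive in the first place, so there is nothing for the screen, a screenshot or a screen recorder to leak.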
The technical detail that matters here is where the interception happens. A naive implementation that detects tones and then replaces them can let a fraction of the first tone (a “bleed”) escape into the agent path before the replacement kicks in. The PCI Security Standards Council's telephone-payments guidance flags DTMF bleed specifically. Our implementation clamps the audio stream at both ends so that no tone can leak under any timing condition, which is the difference between a control that passes on audit day and one that holds up when a QSA actually starts pulling spectrograms.
A couple of details people ask about. The card number never sits in our systems beyond the few hundred milliseconds it takes to forward it to your gateway: we're a transmission point, not a vault. If you want to take repeat payments from the same customer later, the gateway tokenises the card and gives you a token to keep; the original PAN still never lives in your environment. And because we're intercepting at the audio layer rather than relying on agent behaviour, there's no “the agent forgot to start it” failure mode. If the agent skipped the capture and asked the customer to read the digits aloud the old way, you'd hear that on your recording. But the moment they press 729, masking is on, and there's nothing for the agent to remember to do correctly.
PCI DSS scope is determined by where cardholder data flows. Anywhere card data is stored, processed, or transmitted is in scope, and every system in scope has to meet the standard's controls. Without DTMF masking, a typical contact-centre payment puts a lot of systems in the data path: the agent's headset and workstation, the agent's desktop applications, the call recording server, the QA platform that ingests recordings, the WFM tools that touch them, the network segments connecting all of those, the firewalls in front, and the directory services authenticating the agents. That's the SAQ D scenario — 329 controls covering everything from network segmentation to vulnerability scans to formal incident-response plans, audited annually.
With masking on, card data never reaches any of those systems. The customer's handset talks to our PCI DSS Level 1 platform, and we talk to your gateway. That's the cardholder data environment, and you're not in it. The recording stops being in scope because there are no card digits in the audio. The agent workstation stops being in scope because no card data reaches it. Your CRM stops being in scope because the agent never has digits to paste. Most of your network drops out for the same reason. You move to SAQ A, which is 22 controls, most of them about picking a certified service provider, validating their attestation, and keeping your own organisational policies in order. It's the same SAQ used by businesses that take payment only via a hosted iFrame on their website. The PCI Security Standards Council's information supplement Protecting Telephone-Based Payment Card Data recognises DTMF masking as a method that takes the telephony environment, the agent environment, and the CRM out of scope, and that's the entire commercial logic of the control.
There's a quieter version of the same risk that organisations underweight: the rogue agent. Most contact-centre fraud doesn't come from external attackers — it comes from someone on the inside who can hear or see card digits and chooses to write them down. Once the data is in the agent's environment, no audit log will tell you a number went into a notebook. The only reliable defence is making sure the data never reaches the agent in the first place, which is exactly what masking does. You're not asking your staff to behave well around card data — you're making it impossible for them to misbehave because the data isn't there.
A few things stay in scope and it's worth being honest about which. You're still responsible for the systems that handle the result of the payment: order numbers, refund records, the customer record where you log “paid £49.99 on 12 March.” That's not card data, but if you store the auth code, last 4 digits, or anything else returned from the gateway, those records still need basic protection. You're also still responsible for vetting Paytia: pulling our Attestation of Compliance every year, checking we're still on the PCI Council's registered service-provider list, and including the validation in your own SAQ A submission. That's real work but it's an afternoon, not a quarter.
The numbers people care about: most of our customers report PCI compliance costs falling 80–90% in the first year, the audit itself going from a multi-week QSA engagement to a self-assessment, and recording retention becoming a non-issue. There's also a category of cost that's harder to quantify — the agent training, the “don't write that down” posters, the periodic spot-checks of the contact-centre floor — that simply stops being needed. If there's no card data for an agent to mishandle, you don't need a programme to stop them mishandling it.
PCI DSS 4.0.1 has been mandatory since 31 March 2025. Two of the changes hit telephone payments directly. The first is the treatment of call recording: any recording that captures sensitive authentication data after authorisation completes is now an explicit control failure. Pause-and-resume was already a weak control under earlier versions — under 4.0.1, an environment that depends on it has very little rope when an auditor finds the inevitable handful of recordings where the pause didn't fire. The reliable answer is to ensure the data isn't in the audio at all, which is what DTMF masking does at an architectural level rather than a behavioural one.
The second is segmentation. 4.0.1 expects clearer documented separation between in-scope and out-of-scope systems, and a properly descoped contact centre is structurally easier to defend in this respect — card data never crosses the boundary, so the boundary is straightforward to evidence. UK contact centres also have to square this with the FCA's SYSC 10A taping rules, which require recording of regulated client calls. The two obligations collide every time a customer reads digits aloud or presses them on an unmasked keypad. Masking is the architecture that lets you satisfy both — your recording stays complete and continuous, with the digits simply absent from the audio.
The compliance argument is the one that wins budget. The customer-experience argument is the one that turns a payment line from a friction point into a revenue channel. Both matter, and one of them is harder to measure than the other.
Customers know when an agent can hear them entering card details. They feel it. We've had customers tell us they used to go outside, close their office door, or wait until they got home before making a payment call — because reading their card aloud or hearing the tones echo across the line in a quiet office felt like undressing in public. Some would avoid the phone payment entirely and ask for an invoice, adding days or weeks to the cycle while you chased a payment that should have closed on the call. That friction translates directly into lost or delayed revenue, and it's invisible if you only look at conversion at the call level — the customer didn't abandon, they just asked to pay later, and a third of them never come back to it.
Open-plan offices are where this bites hardest. A finance manager paying a supplier from a shared workspace doesn't want everyone within three desks to hear the keypad tones — even if the actual digits can't be reconstructed by ear, the discomfort is enough to make them stop the call and pay another way. Public spaces present the same problem, sharper. A customer on a train platform, in a hospital waiting room, or sitting in a coffee shop doesn't have the option of finding a private room. With masking on, they enter their card silently while the agent stays on the line to confirm the amount and answer questions — and the awkwardness simply isn't in the call any more.
Remote and hybrid working sharpened the same edge from the agent's side. Since 2020, a meaningful proportion of contact-centre agents take payments from spare bedrooms and kitchen tables, with partners, children and housemates in earshot. Without masking, you're trusting every one of those agents to maintain physical security in a home they don't control. With masking, card data never reaches the agent's location regardless of where they happen to be sitting — the privacy protection is consistent across head-office desks, home studies and co-working spaces. Agents tell us they find this easier too: they're no longer worrying about whether they accidentally wrote down a digit, whether the recording captured something it shouldn't have, or whether they'll be blamed if a customer's card is later compromised. That reduction in anxiety has a real effect on staff retention in a sector where turnover is already painful.
If you want a single metric to track, look at completion rate at the point of payment: of the calls where payment is offered, the percentage where it's actually captured before the customer hangs up. Most customers we work with see this lift after masking goes live. The other useful number is time-to-payment: the days between a service being agreed and the cash arriving. If that drops from weeks to zero because the payment closes on the call, the privacy advantage is doing what it's supposed to.
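Both metrics are simple arithmetic, and a short Python sketch makes the definitions unambiguous. The function names and the sample figures are made up for illustration; they are not Paytia benchmarks.

```python
from datetime import date


def completion_rate(payments_offered: int, payments_captured: int) -> float:
    """Share of calls where payment was offered that ended with a captured payment."""
    if payments_offered == 0:
        return 0.0
    return payments_captured / payments_offered


def time_to_payment_days(agreed_on: date, paid_on: date) -> int:
    """Whole days between the service being agreed and the cash arriving."""
    return (paid_on - agreed_on).days


# Illustrative figures only.
print(f"{completion_rate(200, 150):.0%}")
print(time_to_payment_days(date(2026, 3, 1), date(2026, 3, 15)))
```

Tracking both before and after go-live, on the same call types, is what lets you attribute any change to masking rather than to seasonality or a script change.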
We don't rip and replace. You keep your CCaaS, your gateway, your CRM, your QA platform, your headsets. We slot into the audio path and change one thing — what reaches the agent during the seconds the customer is keying digits. Everything else carries on as before.
On modern CCaaS — Genesys, Five9, Amazon Connect, NICE CXone, 8x8, RingCentral, Talkdesk, 3CX — we connect via API or SIP and the integration is a few hours of work on our side and a sign-off on yours. Traditional PBXes (Mitel, Avaya, Cisco) and plain SIP trunks are the same shape with slightly more co-ordination on routing. Your gateway can be Stripe, Worldpay, Adyen, Opayo, NMI, Trust Payments, Braintree, Cybersource or any of the long tail — if it has an API, we talk to it. We support seven integration patterns in total: network-level divert, PBX-level divert, IVR menu, agent transfer or conference, PSTN outbound via Paytia, Paytia WebRTC, and direct SIP trunk integration. Between them, that covers almost every contact-centre telephony stack we've walked into.
For the agent, the behaviour change is genuinely small: enter the amount, press 729, watch a progress indicator until you see the green light, carry on the conversation. A 15-minute team huddle covers it for most teams. The bigger piece of work is usually social: deciding how the agent introduces the capture (“I'm going to take your card details now. When you're ready, please enter your long card number on your phone keypad and I'll see when you're done”), making sure the script feels natural, and getting team leaders comfortable with the new audit trail. Edge cases that occasionally surface: old softphones with off-frequency DTMF (a five-minute job to widen the detection tolerance on our side), poor mobile signal degrading detection (the retry prompt handles it cleanly), a SIP carrier converting tones to RFC 2833 events (different handler, same outcome), or a recorder forking the stream at the SBC layer rather than the application layer (we just need to confirm masking sits upstream). None of these are projects.
If you want to see how this lands in your specific environment, book a 15-minute demo and we'll map it against your stack. If you'd rather see it side-by-side against the alternative architecture, channel separation takes a different route to the same compliance outcome — the comparison page walks through where each one lands in practice.
The bit that decides whether masking works in practice is the audio interception layer. A handset emits the real DTMF pair the moment a key is pressed; somewhere between that handset and the agent's earpiece, the tones have to be detected, decoded, and replaced with a flat sound, and that has to happen fast enough that no fragment of the original tone slips through to the agent. The audio engineering matters because the failure mode isn't “a tone got through and the agent heard a clean beep” — it's “ten or twenty milliseconds of the original tone bled through before the replacement kicked in, and a determined attacker with a recording and an FFT can reconstruct the digit.” A PCI assessor pulling spectrograms is doing exactly that test.
We sit inline on the media stream. The customer audio arrives as RTP packets; we run continuous detection against the live waveform rather than waiting for a digit-completed event from the carrier, and we replace the sample buffer in the same packet before it's forwarded to the agent leg. The replacement waveform is a soft chirp at a fixed frequency, deliberately chosen to fall outside the DTMF grid so it can't be mistaken for any keypad digit by a recorder, an analytics tool, or a person re-listening on a phone speaker. The agent hears the same chirp for every key pressed, which carries no information beyond “a key was pressed.”
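To make the detect-and-replace step concrete, here is a minimal Python sketch that uses the Goertzel algorithm, a standard way to measure signal power at a handful of known frequencies. It is an illustration under stated assumptions, not Paytia's implementation: the 8 kHz sample rate, the 20 ms frame in the usage note, the 0.01 threshold, the 1000 Hz `CHIRP_HZ` value and all function names are hypothetical. It also works on whole frames only, so it does not show the bleed protection discussed above, and a production detector would add checks this sketch omits (for example rejecting speech that happens to contain energy at both frequencies).

```python
import math

SAMPLE_RATE = 8000                   # Hz, assumed narrowband telephony rate
LOW_HZ = (697, 770, 852, 941)
HIGH_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")
CHIRP_HZ = 1000                      # hypothetical replacement tone, off the DTMF grid


def goertzel_power(frame, freq_hz):
    """Power of frame at freq_hz, scaled so a unit-amplitude sine gives about 1.0."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / SAMPLE_RATE)
    s1 = s2 = 0.0
    for sample in frame:
        s0 = sample + coeff * s1 - s2
        s2, s1 = s1, s0
    raw = s1 * s1 + s2 * s2 - coeff * s1 * s2
    return raw / (len(frame) / 2.0) ** 2


def detect_key(frame, threshold=0.01):
    """Return the key whose low/high pair dominates the frame, or None."""
    if not frame:
        return None
    low_powers = [goertzel_power(frame, f) for f in LOW_HZ]
    high_powers = [goertzel_power(frame, f) for f in HIGH_HZ]
    row = low_powers.index(max(low_powers))
    col = high_powers.index(max(high_powers))
    # Both groups must carry real energy; one strong tone alone is not a keypress.
    if low_powers[row] < threshold or high_powers[col] < threshold:
        return None
    return KEYPAD[row][col]


def mask_frame(frame):
    """Return (frame_for_agent_leg, decoded_key); DTMF frames become a fixed chirp."""
    key = detect_key(frame)
    if key is None:
        return list(frame), None     # ordinary speech or silence passes through
    chirp = [
        0.3 * math.sin(2.0 * math.pi * CHIRP_HZ * n / SAMPLE_RATE)
        for n in range(len(frame))
    ]
    return chirp, key
```

Fed a 160-sample frame (20 ms at 8 kHz) containing the 770 Hz and 1336 Hz pair, `mask_frame` returns the key `"5"` for forwarding to the gateway side and a chirp frame for the agent leg. The chirp is the same for every key, so the forwarded audio carries no information about which digit was pressed.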
Latency matters because the customer is still talking to the agent through the same call. If the audio path picks up noticeable added delay, the conversation starts to feel out of step and agents complain. Our interception typically adds under 20 milliseconds of one-way latency, well below the 150-millisecond ITU-T G.114 threshold for “essentially transparent” voice quality, and not audible against the normal jitter on a wide-area call. The customer keeps talking, the agent keeps responding, and the only behavioural change either of them notices is the flat chirp during the digit-entry window.
Carriers occasionally signal DTMF as out-of-band events rather than in-band audio tones: the RFC 2833 (now RFC 4733) telephone-event payload type carries each keypress in dedicated RTP packets rather than in the audio itself. We handle both. In-band tones get the audio-substitution treatment described above. Out-of-band events get dropped from the forwarded stream and decoded directly to the digit on our side. From a compliance standpoint the outcome is identical (no card data reaches the agent leg), and from an engineering standpoint it means we don't care whether your carrier is on the old analogue model or a modern SIP trunking setup that prefers RFC 4733.
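For the out-of-band case, the telephone-event payload defined in RFC 4733 is four bytes: an event code, a flags byte holding the end bit and a 6-bit volume, and a 16-bit duration in timestamp units. The Python sketch below decodes that payload. The payload layout and the event codes 0 to 15 come from the RFC; the names `TelephoneEvent` and `parse_telephone_event` are illustrative, and the sketch does not cover RTP header parsing or the negotiation of the dynamic payload type.

```python
import struct
from typing import NamedTuple

# RFC 4733 event codes 0-15: digits 0-9, then *, #, A, B, C, D.
EVENT_KEYS = "0123456789*#ABCD"


class TelephoneEvent(NamedTuple):
    key: str        # decoded keypad character
    end: bool       # E bit: set on the packet(s) that close the event
    volume: int     # tone power level as 0-63, meaning 0 to -63 dBm0
    duration: int   # event length so far, in RTP timestamp units


def parse_telephone_event(payload: bytes) -> TelephoneEvent:
    """Decode the 4-byte telephone-event payload of one RTP packet."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload must be at least 4 bytes")
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    if event >= len(EVENT_KEYS):
        raise ValueError(f"event code {event} is not a DTMF keypress")
    return TelephoneEvent(
        key=EVENT_KEYS[event],
        end=bool(flags & 0x80),   # top bit is E; the next bit is reserved
        volume=flags & 0x3F,
        duration=duration,
    )


# Event 5, end bit set, volume 10, duration 800 units (100 ms at an 8 kHz clock).
print(parse_telephone_event(bytes([5, 0x8A, 0x03, 0x20])))
```

Because the digit arrives as data rather than sound, masking in this mode amounts to not forwarding these packets to the agent leg, which is the behaviour described in the paragraph above.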
Masking and channel separation are two routes to the same compliance destination: both keep card data out of your business, both drop you to SAQ A, both work with the same telephony stacks. They differ on what happens during the 20 to 30 seconds the customer is actually entering digits. With masking the agent stays on the live audio throughout, hears the flat replacement chirp, and is free to talk the customer through the capture (“take your time, no rush” / “you should see a confirmation on your screen too”). With channel separation the agent's audio path goes off-line for the duration; voice prompts on our platform drive the customer through the entry; and the agent comes back on the line once the gateway returns an authorisation.
Pick masking when the call needs the agent present through the payment step. Outbound retention calls where the conversation matters more than the transaction. Insurance renewals where the customer often pauses mid-card-entry to ask about cover. Charity fundraising where the agent is the relationship and a silent stretch is a missed opportunity. Anything where the customer relationship runs through the same agent on the same call.
Pick channel separation when audit posture is the priority and conversational engagement isn't. Some compliance teams want a hard physical separation they can point at on an architecture diagram — no possibility of agent involvement during capture, by design. High-volume transactional contact centres where calls are short and the agent has no role beyond confirming the amount often run channel separation because it removes the agent step entirely. It's also the choice when your agents simply don't want to be in the loop — there's nothing to remember, nothing to press, no “did I do it right” anxiety; the platform takes over and hands the call back when it's done.
Two practical notes. First, you can run both for different call types on the same Paytia account — masking on the outbound sales line, channel separation on the inbound bill-payment line, same back-end, same gateway connection. Second, switching between them later is a configuration change, not a re-implementation, so the choice you make on day one isn't locked in for life. The full side-by-side comparison walks through agent-experience, audit-posture and rollout differences in more detail.
The shape of the integration depends on what your telephony looks like today. We've walked into a lot of contact-centre stacks since 2016, and almost every one falls into one of four patterns. Knowing which one you're in tells you what we'll need from your side and roughly how long the implementation runs.
Modern cloud CCaaS — Genesys Cloud, Five9, Amazon Connect, NICE CXone, 8x8, RingCentral, Talkdesk. These platforms expose APIs for call routing, agent state, and screen-pop, and we plug into those. From your side you'll need to enable our integration in your CCaaS admin console, create the routing rule that hands the payment leg over to us when the agent triggers it, and configure the screen-pop URL so our agent terminal opens on top of your existing softphone. Typical implementation runs two to four working days, mostly waiting for change-window approvals on your side. The Aircall integration walk-through is one worked example; the same architecture applies across the others.
On-prem PBX: Mitel, Avaya, Cisco UCM, Asterisk, FreeSWITCH, 3CX on dedicated hardware. The integration here is SIP-based and a touch more co-ordination is needed on call routing, but the outcome is identical. We'll need access to your SIP trunk configuration (or a willingness to add a new trunk pointing at us), a routing rule on the PBX that diverts the call leg to us when the agent dials the payment short code, and visibility into how your recorder is hooked in so we can confirm masking sits upstream of it. Implementation typically runs five to ten working days, again mostly bounded by change windows on your side rather than work on ours.
Plain SIP trunk — no CCaaS, no PBX, just a SIP carrier feeding agent softphones directly. Probably the simplest case: we provision a SIP endpoint on our side, you point the relevant DID or trunk leg at us during the payment window, and we hand the call back when the gateway returns. Implementation can be live in two to three days for this pattern.
Hybrid or legacy — analogue lines, an old PBX behind a session border controller (SBC) feeding a cloud CCaaS, or anything that involves a media gateway converting between protocols. These take a discovery call to map out, but they're almost always solvable — usually with one of our seven integration patterns (network-level divert, PBX-level divert, IVR menu, agent transfer or conference, PSTN outbound via Paytia, WebRTC, or direct SIP trunk integration). Implementation runs one to three weeks depending on how many systems are in the chain.
On the recording side we need to know one thing: where in the audio path does the recorder fork the stream? If it's at the application layer (recorder sits on the agent workstation, captures whatever the agent hears), masking is automatically clean because the agent never hears the digits. If it's at the network layer or the SBC, we just need to confirm our interception sits upstream of the fork — which it does in every pattern we deploy. If you're not sure where your recorder forks, your CCaaS vendor or your network engineer will know in five minutes.
On the gateway side we need API credentials and a sandbox account to test against before going live. We connect to every major UK and US acquirer — Stripe (where we're a partner), Worldpay, Adyen, Barclaycard, Tyl by NatWest, Ryft, Elavon — plus the long tail of regional and specialist gateways. If your gateway has an API we've almost certainly seen it; if it's genuinely new to us we'll build the connector and the additional engineering time is usually a week or two.
What you don't need: new hardware on the agent desk, a separate dial-out line per seat, a procurement cycle for replacement headsets, or any change to your CRM. We don't install software on agent workstations — our terminal runs in a browser tab. We don't touch your acquirer relationship — your merchant ID stays where it is, the money still settles to your bank account, the only difference is the card data takes a different route to the gateway.
The basic shape of a phone payment doesn't change much across industries — a customer reads out an amount, the agent triggers the capture, the customer keys their card, the gateway returns an authorisation — but the surrounding pressures vary, and they shape how masking gets used.
Insurance. Premium collection on renewal, mid-term adjustments, instalment catch-ups, excess payments at claim notification. The renewal call is the single biggest case: the customer rings to ask about cover changes, the agent walks them through options, agrees a price, and needs to take the card on the same call without losing the conversation. Masking keeps the agent in dialogue (“you'll see this come out on the 28th alongside your direct debit”) through the capture. Several insurance customers tell us their save-the-sale rate at renewal went up materially once they stopped having to break the conversation for a transfer to an IVR. FCA-aligned data handling matters here too: with card data never on the contact centre's systems, there is nothing to evidence under SYSC 10A for card capture, and recording stays continuous as required.
Utilities. Bill payments, arrears recovery, recurring direct debit set-up, instalment plans, smart-meter top-ups. High volume, short calls, and the agent often has to confirm meter readings or the schedule for ongoing payments alongside the transaction. Masking suits this because the agent does the confirmation conversation, presses 729 for capture, then confirms the recurring set-up while the gateway is still processing. The single-channel design means no second leg to orchestrate, which keeps average handle time down on calls that already run to tight SLAs.
Financial services and banking. Loan repayments, credit-card payments, overdraft clearance, mortgage instalments. The compliance bar is higher because of FCA Consumer Duty and the firms' own internal audit requirements, but the call shape is similar to utilities: confirm identity, confirm amount, take the card, send the receipt. The harder requirement in financial services is usually around evidencing that the customer authorised the specific amount on the specific date; the recording covers that automatically, with the digits absent from the audio but the agreement to pay fully captured.
Healthcare and clinical services. Private prescriptions, consultation fees, treatment plans, deposit payments on procedures. Patients are often distressed when they ring, so taking the card needs to feel calm and non-extractive, which is hard if the alternative is reading digits aloud over a call. Masking lets the agent stay supportive throughout the payment step (“take your time, there's no rush”) without having to break the conversation for a transfer. Cyber Essentials Plus and the Data Protection Act add audit pressure here too, and SAQ A is materially less work to maintain than SAQ D in environments where IG audit cycles are already heavy.
Charity and fundraising. One-off donations, recurring gift set-up, sponsorship pledges, gift-aid declarations. The donor relationship matters more than the transaction efficiency, and silence during card entry is the moment donors second-guess themselves. Masking keeps the fundraiser in conversation (“your gift will go to the meal programme. Can I tell you what that funds?”) through the capture window. Trinity Hall College at Cambridge runs masking on alumni fundraising calls and reported the conversational continuity made a measurable difference to gift completion.
Education and professional bodies. Course fees, exam fees, membership renewals, certification renewals. Often these are scheduled outbound calls (“your membership lapses next month, shall we renew now?”) where the agent has reached the customer at an awkward moment and has thirty seconds to close. The masking workflow takes nothing away from those thirty seconds: the agent stays on the line, the customer keys the card, and both parties stay in the same conversation throughout.
Legal services and professional firms. Retainer payments, fee instalments, settlement payments. Smaller volumes than a contact centre, but the privacy concern is sharper because the client is often discussing the matter for which they're paying. Masking removes the awkwardness of having to read a card number aloud in a conversation that has just covered sensitive personal circumstances.
If you're moving from a card-data-in-scope environment to a masked architecture, your SAQ changes and so does the conversation with your assessor. Most QSAs have seen masking deployments by now and know how to scope them, but it's still worth running these questions past yours before signing anything. The answers will tell you whether your assessor is calibrated for this kind of architecture or whether you'll be educating them as you go.
Do you confirm that with DTMF masking deployed, our contact centre infrastructure is out of scope for SAQ D and falls under SAQ A? The PCI Council's information supplement Protecting Telephone-Based Payment Card Data already says yes, but you want your QSA to commit to it in writing for your specific environment before you make architectural decisions on the strength of it. If they hedge, ask what evidence they'd need to see — usually it's our Attestation of Compliance plus a network diagram showing the masking platform sitting upstream of the recorder and the agent workstation.
How do you want us to evidence that masking is actually working? This is the question that surprises people. The QSA isn't going to take it on faith. They'll typically want a sample test where they listen to a recording from a real masked call and confirm the digits aren't in the audio. They may also want to see logs from our platform showing the masking event fired on a specific call ID. We have a standard evidence pack for this. Ask your QSA which parts of it they want and we'll provide them.
What's our position on the recording itself — is it still in scope, even if there's no card data in it? Most QSAs say no: if the recording demonstrably contains no PAN and no cardholder data, it's not in the cardholder data environment. A small minority will argue the recording stays in scope as a “supporting system”. The answer matters because it determines whether you need to apply PCI DSS controls to your recording storage and access logs. Get this in writing.
How do you treat our agent workstations if they never receive card data? Same question, different system. The standard position is that an agent workstation that never receives card data isn't in scope, which is the whole point of deploying masking. Some QSAs will treat them as a “connected-to” system because they're on the same network as the masking platform. Ours is upstream of yours, not on your LAN, so this rarely lands, but check.
What's your view on agent-assisted phone payments under PCI DSS 4.0.1? Open question, but a good one. A QSA who's thought about it will give you a clear view on pause-and-resume (going away), tokenisation handling (still required where you keep an authorisation token), and call-recording controls (continuous recording is fine if the audio is masked). If they hedge on all three, you're probably the first masked customer they've scoped. That's not necessarily a problem, but you'll need to bring them up the curve.
Our pre-sales team has walked through these questions with most of the major UK and US assessors. If you want a briefing pack to send to your QSA ahead of your scoping call, we can supply one: drop us a line and we'll send it across.
Two ways to do the same job. Both keep card data out of your business and drop you to SAQ A. The difference is what the agent does during card capture. See the full side-by-side.
DTMF masking: single channel. Agent stays on the line. Tones are masked in the live audio so the agent doesn't hear the digits.
Pick this if your agents handle complex calls and need to stay engaged through the payment step. Conversational throughout.
Channel separation: two channels. Agent's audio goes off-line during capture. Voice prompts run the flow on the customer leg.
Pick this if your compliance team wants a hard physical separation for audit, or if you'd rather agents had no involvement in the capture step at all.
Read about Channel Separation →
“Paytia turned a security exposure and reputational risk into a value-enhancing opportunity. Fundraising has never been more important and Paytia has helped us achieve our goals.”
Cambridge University
Trusted by British American Tobacco · Howard Kennedy · CITB · Clinical Partners · Trinity Hall College

The conversation doesn't break. Your agent can talk the customer through the capture, answer questions, and pick up the call as soon as the payment authorises.
We mask the tones before they hit the recording layer, so there's no card data in the audio. No pause-and-resume, no redaction, no compliance exposure when a recording is pulled from archive.
Any modern telephony: Genesys, Five9, Amazon Connect, NICE, 8x8, Talkdesk, RingCentral, 3CX, or a plain SIP trunk. No per-seat hardware.
Agents press one key to start a capture and watch a progress indicator. There's no script and no procedure to learn. Roll-out is days.


Paytia carries the highest level of PCI certification, so your scope drops the moment you connect. For the full breakdown of what changed and what counts as compliant in 2026, read the PCI DSS v4.0.1 buyer's guide.
| Area | Without Paytia | With Paytia |
|---|---|---|
| Self-assessment | SAQ D (329 controls) | SAQ A (22 controls) |
| Network in scope | Most of your stack | None |
| Call recordings | Pause-and-resume or redact | No restrictions |
| Staff training | Mandatory and recurring | None required |

If you take card payments over the phone and want the agent engaged through the payment step, this fits.
Agents stay engaged through the payment step — useful for upsell, retention, or any conversation where the call doesn't naturally pause.
See contact centre PCI compliance →
Premiums, excesses, repayments, top-ups — taken on the phone with the agent still able to talk the customer through.
High-volume bill payments and recurring set-ups where the agent needs to confirm the account, the amount, and the schedule on the same call.
Donations and recurring gifts captured live during fundraising calls without the donor reading their card aloud.
When a customer types card details on their phone keypad, every keypress generates a DTMF tone in the audio. DTMF masking replaces those tones with a flat sound in real time, before they reach your agent or your call recording. The card data goes straight from the customer's handset to Paytia and on to your payment gateway. You'll also hear it called DTMF suppression — it's the same thing.
Nothing — they're two names for the same technology. Vendors differ on which one they use in their marketing. We used to call it DTMF suppression ourselves; most of our customers search for DTMF masking, so that's what we lead with now. Both describe the same thing: intercepting the keypad tones in real time so they never reach your agent's audio or your call recording.
Both keep card data out of your business and drop you to SAQ A. The difference is what the agent does. With DTMF masking the agent stays on the live audio throughout — they can talk the customer through the capture and pick up the conversation immediately afterwards. With Channel Separation the agent's audio path goes off-line during capture and voice prompts run the flow. Pick DTMF masking if you want the agent engaged through the payment step.
Yes — modern CCaaS platforms (Genesys, Five9, Amazon Connect, NICE CXone, 8x8, RingCentral, Talkdesk), traditional PBX, and plain SIP/VoIP trunks. Integration is via API or SIP. Most setups are live within a week.
Card data never reaches your network, your agents, or your call recordings. Most businesses move from SAQ D (329 controls) to SAQ A (22 controls). The recording system stops being in scope because there's no card data in it to begin with.
A little — there's a one-click action per call. The agent enters the amount, presses one key to start the capture, then watches a progress indicator on screen until the payment authorises. That's the whole behaviour change; most teams pick it up inside a single shift. If you want zero agent training, Channel Separation is the variant to look at — the platform drives the capture automatically, and the agent does nothing during the payment step.
Yes. It's built for card-not-present transactions over the phone — agent-assisted sales, mail-order, telephone-order, anywhere a customer would otherwise read card details over a call.
The customer can press * to clear and start the card number again. If they enter a card number that fails the Luhn check (the mathematical check digit that every valid card carries), our platform prompts them to re-enter without involving the agent — they just hear a short voice prompt asking them to try again. The agent sees the retry on their screen but no digits. If the customer needs help, they can speak to the agent at any point during the entry; the conversation channel stays open throughout.
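For readers who want to see what a Luhn check involves, here is a minimal Python sketch. It illustrates the public algorithm only and is not our platform's code; the function name `luhn_valid` is a hypothetical name chosen for this example.

```python
def luhn_valid(pan: str) -> bool:
    """Return True if the digit string passes the Luhn check."""
    # Accept plain ASCII digits only, which is all a keypad can produce.
    if len(pan) < 2 or not all(ch in "0123456789" for ch in pan):
        return False
    total = 0
    # Walk from the rightmost digit (the check digit) leftwards.
    for position, ch in enumerate(reversed(pan)):
        digit = int(ch)
        if position % 2 == 1:
            # Double every second digit; subtract 9 if the result exceeds 9.
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    # A valid number sums to a multiple of 10.
    return total % 10 == 0
```

A mistyped digit almost always changes the sum, which is why the check catches most keying errors before the number is ever sent for authorisation. It says nothing about whether the card exists or has funds; that is still the gateway's job.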
Both. The DTMF tones a mobile generates are the same audio frequencies as a desk handset — the underlying telephony standard hasn't changed since the 1960s. We see the same detection accuracy across iPhones, Android handsets, desk phones and softphones. The only edge case is some Bluetooth headsets that compress the audio aggressively, which can degrade tone clarity; we widen our detection tolerance to compensate and almost never see a failed capture as a result.
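To make "detection tolerance" concrete, here is a rough Python sketch of tone detection using the Goertzel algorithm, a common textbook approach. It is an illustration under stated assumptions, not a description of our detector: the 8 kHz sample rate, the 40 ms block length, and the `min_ratio` threshold are all assumed values, and the function names are hypothetical.

```python
import math

LOW = (697, 770, 852, 941)        # row frequencies in Hz
HIGH = (1209, 1336, 1477, 1633)   # column frequencies in Hz
KEYS = ("123A", "456B", "789C", "*0#D")

def goertzel_power(samples, freq, rate):
    """Signal power at a single frequency (Goertzel algorithm)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def detect_key(samples, rate=8000, min_ratio=5.0):
    """Return the pressed key, or None if no clear tone pair is present.

    min_ratio is an assumed tolerance knob: lowering it accepts weaker
    or more distorted tones at the cost of more false detections.
    """
    low_p = [goertzel_power(samples, f, rate) for f in LOW]
    high_p = [goertzel_power(samples, f, rate) for f in HIGH]
    row = max(range(4), key=low_p.__getitem__)
    col = max(range(4), key=high_p.__getitem__)
    for powers, best in ((low_p, row), (high_p, col)):
        others = sum(p for i, p in enumerate(powers) if i != best)
        # The winning tone must clearly dominate the rest of its group.
        if powers[best] <= min_ratio * others:
            return None
    return KEYS[row][col]

def dtmf_tone(key, rate=8000, ms=40):
    """Synthesise a test tone for one key (sum of its two sine waves)."""
    row = next(r for r in range(4) if key in KEYS[r])
    col = KEYS[row].index(key)
    n = rate * ms // 1000
    return [0.5 * math.sin(2 * math.pi * LOW[row] * t / rate)
            + 0.5 * math.sin(2 * math.pi * HIGH[col] * t / rate)
            for t in range(n)]
```

Calling `detect_key(dtmf_tone("5"))` on the synthetic tone recovers the key, while a block of silence returns `None`. Aggressive audio compression smears energy across neighbouring frequencies, which is the situation where a real detector would need a more forgiving threshold than a clean signal does.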
Under 20 milliseconds of added one-way latency on the audio path during the capture window. That's well below the 150-millisecond ITU-T threshold for what's considered transparent voice quality, and not audible against the normal jitter on a typical wide-area call. Customers and agents don't notice the difference in the conversation — they just notice the flat chirp replacing the keypad tones during the seconds the card is being entered.
It keeps running. There's no pause-and-resume, no gap in the recording, no silent section that has to be explained later. The recording captures the customer's voice, the agent's voice, and the flat replacement chirps from the keypad — no card digits because there aren't any in the audio. That continuous recording also satisfies obligations like the FCA's SYSC 10A taping rules, where regulated calls have to be recorded in full.
Most modern cloud CCaaS deployments are live in two to four working days. On-prem PBX integrations typically run five to ten working days. Plain SIP trunk setups can be live in two to three days. The bottleneck is almost always change-window approvals on your side rather than work on ours — once we have access to the routing config and a sandbox account on your gateway, the technical implementation is straightforward.
We'll set up a demo against the same phone system and gateway you already run. Most businesses are taking live payments within a week.
Trusted by law firms, insurers, healthcare providers and regulated businesses worldwide. Learn more about Paytia
Other ways to take payments in this channel.
Take Mail Order / Telephone Order payments without the card number reaching your agents, your recording, or your systems.
Learn more
Your agent stays on the live call while the customer keys their card. We mask the tones so no card data reaches the recording or the agent's audio.
Learn more
How to take card payments on a call legally, securely, and without landing in SAQ D. Covers agent-assisted, IVR, and outbound options.
Learn more