PCI DSS Level 1 Certified

DTMF masking — silence card tones, stay PCI compliant

The customer types their card on their phone keypad while the agent stays on the line. We mask the DTMF tones in real time, so the agent hears nothing identifiable, the recording stays clean, and the card data goes straight to Paytia. You'll also hear it called DTMF suppression — it's the same thing. We've been doing it since 2016. PCI scope drops from SAQ D to SAQ A.

1. Customer: keys the card on their own phone
2. Paytia platform: captures the real tones, suppresses them, sends the card to your gateway
3. Agent: stays on the line, hears flat tones, picks up afterwards

How DTMF masking actually works on a live call

Every keypress on a phone produces a pair of audio frequencies, one low and one high. That's where the “dual-tone multi-frequency” name comes from. Press “5” and the handset emits 770 Hz alongside 1336 Hz. The low group runs 697, 770, 852 and 941 Hz; the high group runs 1209, 1336, 1477 and 1633 Hz. Every key on the pad maps to a unique pair, and the two-tone design exists because a single tone could be imitated by a human voice or background noise, whereas a precise simultaneous pair is far harder to produce by accident. Telephone networks have decoded those tones into digits since the 1960s, which is why a card number can travel over a phone call at all. It's also why the agent can hear it, why the call recorder captures it, and why a sticky note left by the desk three hours later is still a card-data leak waiting to be audited.
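To make the frequency grid concrete, here is a minimal Python sketch of the keypad mapping, plus a toy decoder built on the Goertzel algorithm (a standard way to measure power at a single frequency). It is illustrative only, not Paytia's implementation: the function names, the 8 kHz sample rate and the 50 ms tone length are assumptions chosen for the example.

```python
import math

LOW_HZ = (697, 770, 852, 941)        # one frequency per keypad row
HIGH_HZ = (1209, 1336, 1477, 1633)   # one frequency per keypad column
KEYPAD = ("123A", "456B", "789C", "*0#D")

def tone_pair(key):
    """Return the (low, high) frequency pair in Hz for a keypad key."""
    for row, keys in enumerate(KEYPAD):
        col = keys.find(key)
        if col != -1:
            return LOW_HZ[row], HIGH_HZ[col]
    raise ValueError(f"not a DTMF key: {key!r}")

def synthesize(key, sample_rate=8000, duration_s=0.05):
    """Generate the two summed sine waves a handset would emit for a key."""
    low, high = tone_pair(key)
    n = int(sample_rate * duration_s)
    return [
        0.5 * math.sin(2 * math.pi * low * i / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * i / sample_rate)
        for i in range(n)
    ]

def goertzel_power(samples, freq, sample_rate=8000):
    """Signal power at a single frequency, via the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2

def decode(samples, sample_rate=8000):
    """Pick the strongest low and high frequency and map them back to a key."""
    low = max(LOW_HZ, key=lambda f: goertzel_power(samples, f, sample_rate))
    high = max(HIGH_HZ, key=lambda f: goertzel_power(samples, f, sample_rate))
    return KEYPAD[LOW_HZ.index(low)][HIGH_HZ.index(high)]

print(tone_pair("5"))           # (770, 1336)
print(decode(synthesize("5")))  # 5
```

A real decoder also has to cope with noise, speech and minimum tone durations. This toy version assumes a clean tone and nothing else on the line.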

We sit in the audio path between the customer and your agent. When the agent decides it's time to take payment, they enter the amount in our agent terminal and key a short start code, usually 729, to start the capture. The customer hears a short voice prompt asking them to enter their card on their handset. From that moment, every DTMF tone the customer generates gets caught by our platform, decoded into the actual digit, and sent over an encrypted channel to your payment gateway. The agent's leg of the call gets a flat replacement tone instead of the real one: a soft, identical-sounding chirp that tells them “something was pressed” but reveals nothing about which digit. The recording layer captures that same flat tone. So does any QA platform that's pulling audio from the trunk.
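The routing decision described above can be sketched in a few lines of Python. This is a simplified model, not the platform's code: frames are represented as hypothetical `(kind, payload)` tuples rather than real audio, and `Capture`, `route_frame` and `MASK_TONE` are names invented for the example.

```python
from dataclasses import dataclass, field

MASK_TONE = "<flat-tone>"  # stand-in for the identical chirp the agent hears

@dataclass
class Capture:
    """State for one payment capture (hypothetical structure)."""
    active: bool = False
    digits: list = field(default_factory=list)  # bound for the gateway only

def route_frame(frame, capture):
    """Return what the agent leg and the recorder receive for one customer frame.

    A frame is ("voice", payload) or ("dtmf", digit). While a capture is
    active, digits are diverted and a flat tone is sent on in their place.
    """
    kind, payload = frame
    if kind == "dtmf" and capture.active:
        capture.digits.append(payload)
        return ("tone", MASK_TONE)
    return frame

capture = Capture(active=True)  # the agent has started the capture
customer_leg = [("voice", "one moment"), ("dtmf", "4"), ("dtmf", "1"), ("voice", "done")]
agent_leg = [route_frame(f, capture) for f in customer_leg]
print(agent_leg)
```

Voice frames pass through untouched, which is why the conversation carries on during the capture. Only the digit frames are swapped.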

The voice channel stays open the whole time. If the customer asks “is this the right card?” or “what was the total again?” the agent answers normally — they just can't hear the digits being entered. The agent screen shows a masked indicator — **** **** **** 1234 once the full PAN, expiry and CVV are in — so they can see the capture is progressing without seeing the numbers themselves. When the gateway returns an authorisation, the agent sees the result on screen (paid, declined, or retry) and picks the conversation back up: “That's gone through, thanks — I'll email your receipt now.” The whole capture takes 20 to 30 seconds for most cards. There's no transfer to an IVR, no “please hold while I connect you to our payment system” — the call doesn't go anywhere.
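As a rough illustration of the agent-side indicator, here is a small Python sketch. The completed form matches the `**** **** **** 1234` display described above; the dots for digits not yet entered, the 16-digit default and the function name are assumptions made for the example.

```python
def capture_indicator(entered, total=16, last_four=""):
    """Text shown to the agent: stars for captured digits, dots for pending ones.

    The last four digits appear only once the capture is complete.
    """
    if entered >= total and last_four:
        cells = "*" * (total - 4) + last_four
    else:
        cells = "*" * entered + "." * (total - entered)
    return " ".join(cells[i:i + 4] for i in range(0, total, 4))

print(capture_indicator(6))                     # **** **.. .... ....
print(capture_indicator(16, last_four="1234"))  # **** **** **** 1234
```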

The technical detail that matters here is where the interception happens. A naive implementation that detects tones and then replaces them can let a fraction of the first tone (a “bleed”) escape into the agent path before the replacement kicks in. The PCI Security Standards Council's telephone-payments guidance flags DTMF bleed specifically. Our implementation clamps the audio stream at both ends so that no tone can leak under any timing condition, which is the difference between a control that passes audit-day and one that holds up when a QSA actually starts pulling spectrograms.
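To show why bleed happens, and one generic way to close it, here is a toy Python model. It is not a description of Paytia's clamping: it uses a common alternative, a short delay line on the agent leg, so the start of a tone is still in the buffer when the detector fires. Frames are single characters (`"T"` for tone, `"v"` for voice), and the detector is assumed to fire only after `latency` consecutive tone frames.

```python
from collections import deque

MASK = "M"  # replacement frame sent to the agent leg

def naive_mask(frames, latency):
    """Forward each frame immediately; mask only once the detector has fired."""
    out, run = [], 0
    for f in frames:
        run = run + 1 if f == "T" else 0
        out.append(MASK if run >= latency else f)
    return out

def lookahead_mask(frames, latency):
    """Hold frames in a short delay line so a tone's onset can still be masked."""
    out, run = [], 0
    delay = deque()
    for f in frames:
        run = run + 1 if f == "T" else 0
        delay.append(f)
        if run >= latency:
            # Detector fired: every frame of this run still buffered is tone.
            start = len(delay) - min(run, len(delay))
            for i in range(start, len(delay)):
                delay[i] = MASK
        if len(delay) > latency - 1:
            out.append(delay.popleft())
    out.extend(delay)  # flush whatever is left at end of stream
    return out

frames = list("vvTTTTv")
print("".join(naive_mask(frames, 3)))      # vvTTMMv  (two tone frames bleed)
print("".join(lookahead_mask(frames, 3)))  # vvMMMMv  (no bleed)
```

The cost of this approach is a fixed delay of `latency - 1` frames on the agent's audio, which is why the detection window needs to stay short.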

A couple of details people ask about. The card number never sits in our systems beyond the few hundred milliseconds it takes to forward it to your gateway: we're a transmission point, not a vault. If you want to take repeat payments from the same customer later, the gateway tokenises the card and gives you a token to keep; the original PAN still never lives in your environment. And because we're intercepting at the audio layer rather than relying on agent behaviour, there's no “the agent forgot to start it” failure mode. If the agent skipped the capture and asked the customer to read the digits aloud the old way, you'd hear that on your recording. But the moment they press 729, masking is on, and there's nothing for the agent to remember to do correctly.

What this does to your PCI DSS scope

PCI DSS scope is determined by where cardholder data flows. Anywhere card data is stored, processed, or transmitted is in scope, and every system in scope has to meet the standard's controls. Without DTMF masking, a typical contact-centre payment puts a lot of systems in the data path: the agent's headset and workstation, the agent's desktop applications, the call recording server, the QA platform that ingests recordings, the WFM tools that touch them, the network segments connecting all of those, the firewalls in front, and the directory services authenticating the agents. That's the SAQ D scenario — 329 controls covering everything from network segmentation to vulnerability scans to formal incident-response plans, audited annually.

With masking on, card data never reaches any of those systems. The customer's handset talks to our PCI DSS Level 1 platform, and we talk to your gateway. That's the cardholder data environment, and you're not in it. The recording stops being in scope because there are no card digits in the audio. The agent workstation stops being in scope because no card data reaches it. Your CRM stops being in scope because the agent never has digits to paste. Most of your network drops out for the same reason. You move to SAQ A, which is 22 controls, most of them about picking a certified service provider, validating their attestation, and keeping your own organisational policies in order. It's the same SAQ used by businesses that take payment only via a hosted iFrame on their website. The PCI Security Standards Council's information supplement Protecting Telephone-Based Payment Card Data recognises DTMF masking as a method that takes the telephony environment, the agent environment, and the CRM out of scope, and that's the entire commercial logic of the control.

There's a quieter version of the same risk that organisations underweight: the rogue agent. Most contact-centre fraud doesn't come from external attackers — it comes from someone on the inside who can hear or see card digits and chooses to write them down. Once the data is in the agent's environment, no audit log will tell you a number went into a notebook. The only reliable defence is making sure the data never reaches the agent in the first place, which is exactly what masking does. You're not asking your staff to behave well around card data — you're making it impossible for them to misbehave because the data isn't there.

A few things stay in scope and it's worth being honest about which. You're still responsible for the systems that handle the result of the payment: order numbers, refund records, the customer record where you log “paid £49.99 on 12 March.” That's not card data, but if you store the auth code, last 4 digits, or anything else returned from the gateway, those records still need basic protection. You're also still responsible for vetting Paytia: pulling our Attestation of Compliance every year, checking we're still on the PCI Council's registered service-provider list, and including the validation in your own SAQ A submission. That's real work but it's an afternoon, not a quarter.

The numbers people care about: most of our customers report PCI compliance costs falling 80–90% in the first year, the audit itself going from a multi-week QSA engagement to a self-assessment, and recording retention becoming a non-issue. There's also a category of cost that's harder to quantify — the agent training, the “don't write that down” posters, the periodic spot-checks of the contact-centre floor — that simply stops being needed. If there's no card data for an agent to mishandle, you don't need a programme to stop them mishandling it.

What changed under PCI DSS 4.0.1

PCI DSS 4.0.1 has been the only active version of the standard since 1 January 2025, and its future-dated requirements became mandatory on 31 March 2025. Two of the changes hit telephone payments directly. The first is the treatment of call recording: any recording that captures sensitive authentication data after authorisation completes is now an explicit control failure. Pause-and-resume was already a weak control under earlier versions; under 4.0.1, an environment that depends on it has very little rope when an auditor finds the inevitable handful of recordings where the pause didn't fire. The reliable answer is to ensure the data isn't in the audio at all, which is what DTMF masking does at an architectural level rather than a behavioural one.

The second is segmentation. 4.0.1 expects clearer documented separation between in-scope and out-of-scope systems, and a properly descoped contact centre is structurally easier to defend in this respect — card data never crosses the boundary, so the boundary is straightforward to evidence. UK contact centres also have to square this with the FCA's SYSC 10A taping rules, which require recording of regulated client calls. The two obligations collide every time a customer reads digits aloud or presses them on an unmasked keypad. Masking is the architecture that lets you satisfy both — your recording stays complete and continuous, with the digits simply absent from the audio.

Why customers actually feel the difference

The compliance argument is the one that wins budget. The customer-experience argument is the one that turns a payment line from a friction point into a revenue channel. Both matter, and one of them is harder to measure than the other.

Customers know when an agent can hear them entering card details. They feel it. We've had customers tell us they used to go outside, close their office door, or wait until they got home before making a payment call — because reading their card aloud or hearing the tones echo across the line in a quiet office felt like undressing in public. Some would avoid the phone payment entirely and ask for an invoice, adding days or weeks to the cycle while you chased a payment that should have closed on the call. That friction translates directly into lost or delayed revenue, and it's invisible if you only look at conversion at the call level — the customer didn't abandon, they just asked to pay later, and a third of them never come back to it.

Open-plan offices are where this bites hardest. A finance manager paying a supplier from a shared workspace doesn't want everyone within three desks to hear the keypad tones — even if the actual digits can't be reconstructed by ear, the discomfort is enough to make them stop the call and pay another way. Public spaces present the same problem, sharper. A customer on a train platform, in a hospital waiting room, or sitting in a coffee shop doesn't have the option of finding a private room. With masking on, they enter their card silently while the agent stays on the line to confirm the amount and answer questions — and the awkwardness simply isn't in the call any more.

Remote and hybrid working sharpened the same edge from the agent's side. Since 2020, a meaningful proportion of contact-centre agents take payments from spare bedrooms and kitchen tables, with partners, children and housemates in earshot. Without masking, you're trusting every one of those agents to maintain physical security in a home they don't control. With masking, card data never reaches the agent's location regardless of where they happen to be sitting — the privacy protection is consistent across head-office desks, home studies and co-working spaces. Agents tell us they find this easier too: they're no longer worrying about whether they accidentally wrote down a digit, whether the recording captured something it shouldn't have, or whether they'll be blamed if a customer's card is later compromised. That reduction in anxiety has a real effect on staff retention in a sector where turnover is already painful.

If you want a single metric to track, look at completion rate at the point of payment: the number of calls where payment is actually captured before the customer hangs up, divided by the number of calls where payment is offered. Most customers we work with see this lift after masking goes live. The other useful number is time-to-payment: the days between a service being agreed and the cash arriving. If that drops from weeks to zero because the payment closes on the call, the privacy advantage is doing what it's supposed to.
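Both metrics are simple to compute from call and payment logs. A minimal Python sketch follows, with hypothetical function names and made-up figures purely to show the arithmetic.

```python
from datetime import date

def completion_rate(offered, captured):
    """Share of calls where payment was offered that ended with a captured payment."""
    if offered <= 0:
        raise ValueError("offered must be positive")
    return captured / offered

def days_to_payment(agreed_on, paid_on):
    """Whole days between the service being agreed and the cash arriving."""
    return (paid_on - agreed_on).days

# Illustrative numbers only, not customer data.
print(completion_rate(offered=200, captured=150))            # 0.75
print(days_to_payment(date(2025, 3, 3), date(2025, 3, 17)))  # 14
print(days_to_payment(date(2025, 3, 3), date(2025, 3, 3)))   # 0 (paid on the call)
```

Tracking both before and after go-live gives you a like-for-like comparison, provided the definition of "offered" stays the same across the two periods.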

Where it fits in your stack

We don't rip and replace. You keep your CCaaS, your gateway, your CRM, your QA platform, your headsets. We slot into the audio path and change one thing — what reaches the agent during the seconds the customer is keying digits. Everything else carries on as before.

On modern CCaaS — Genesys, Five9, Amazon Connect, NICE CXone, 8x8, RingCentral, Talkdesk, 3CX — we connect via API or SIP and the integration is a few hours of work on our side and a sign-off on yours. Traditional PBXes (Mitel, Avaya, Cisco) and plain SIP trunks are the same shape with slightly more co-ordination on routing. Your gateway can be Stripe, Worldpay, Adyen, Opayo, NMI, Trust Payments, Braintree, Cybersource or any of the long tail — if it has an API, we talk to it. We support seven integration patterns in total: network-level divert, PBX-level divert, IVR menu, agent transfer or conference, PSTN outbound via Paytia, Paytia WebRTC, and direct SIP trunk integration. Between them, that covers almost every contact-centre telephony stack we've walked into.

For the agent, the behaviour change is genuinely small: enter the amount, press 729, watch a progress indicator until you see the green light, carry on the conversation. A 15-minute team huddle covers it for most teams. The bigger piece of work is usually social: deciding how the agent introduces the capture (“I'm going to take your card details now. When you're ready, please enter your long card number on your phone keypad and I'll see when you're done”), making sure the script feels natural, and getting team leaders comfortable with the new audit trail. Edge cases that occasionally surface: old softphones with off-frequency DTMF (a five-minute job to widen the detection tolerance on our side), poor mobile signal degrading detection (the retry prompt handles it cleanly), a SIP carrier converting tones to RFC 2833 events, the format since superseded by RFC 4733 (different handler, same outcome), or a recorder forking the stream at the SBC layer rather than the application layer (we just need to confirm masking sits upstream). None of these are projects.
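For the RFC 2833 / RFC 4733 case, the digit no longer travels as audio at all. It arrives as a four-byte "telephone-event" payload inside an RTP packet: one byte for the event code, one byte carrying the end flag and volume, and two bytes for duration. The Python sketch below parses that payload as the RFC defines it. The function name and returned dictionary are assumptions for the example, and it says nothing about how Paytia's handler is built.

```python
import struct

# RFC 4733 event codes 0-15 map to these keys, in this order.
EVENT_KEYS = "0123456789*#ABCD"

def parse_telephone_event(payload):
    """Decode a 4-byte RTP telephone-event payload (RFC 4733 / RFC 2833)."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload must be at least 4 bytes")
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    if event >= len(EVENT_KEYS):
        raise ValueError(f"not a DTMF event code: {event}")
    return {
        "key": EVENT_KEYS[event],
        "end": bool(flags & 0x80),  # E bit: this packet ends the event
        "volume": flags & 0x3F,     # power level, expressed as -dBm0
        "duration": duration,       # in RTP timestamp units
    }

print(parse_telephone_event(bytes([5, 0x8A, 0x03, 0x20])))
```

With out-of-band events, masking means keeping these packets away from the agent leg and the recorder rather than overwriting audio, which is why it needs a different handler.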

If you want to see how this lands in your specific environment, book a 15-minute demo and we'll map it against your stack. If you'd rather see it side-by-side against the alternative architecture, channel separation takes a different route to the same compliance outcome — the comparison page walks through where each one lands in practice.

DTMF Masking or Channel Separation?

Two ways to do the same job. Both keep card data out of your business and drop you to SAQ A. The difference is what the agent does during card capture. See the full side-by-side.

DTMF Masking

You're here

Single channel. Agent stays on the line. Tones are masked in the live audio so the agent doesn't hear the digits.

Pick this if your agents handle complex calls and need to stay engaged through the payment step. Conversational throughout.

Channel Separation

Two channels. Agent's audio goes off-line during capture. Voice prompts run the flow on the customer leg.

Pick this if your compliance team wants a hard physical separation for audit, or if you'd rather agents had no involvement in the capture step at all.

Read about Channel Separation →
“Paytia turned a security exposure and reputational risk into a value-enhancing opportunity. Fundraising has never been more important and Paytia has helped us achieve our goals.”

Trinity Hall College

Cambridge University

Trusted by British American Tobacco · Howard Kennedy · CITB · Clinical Partners · Trinity Hall College

What you get

Agent stays on the line

The conversation doesn't break. Your agent can talk the customer through the capture, answer questions, and pick up the call as soon as the payment authorises.

Recording stays clean

We mask the tones before they hit the recording layer, so there's no card data in the audio. No pause-and-resume, no redaction, no compliance exposure when a recording is pulled from archive.

Works with what you have

Any modern telephony: Genesys, Five9, Amazon Connect, NICE, 8x8, Talkdesk, RingCentral, 3CX, or a plain SIP trunk. No per-seat hardware.

Live in days

Agents press one key to start a capture and watch a progress indicator. There's no script and no procedure to learn. Roll-out is days.

PCI DSS scope, before and after

PCI DSS Level 1

Paytia carries the highest level of PCI certification, so your scope drops the moment you connect. For the full breakdown of what changed and what counts as compliant in 2026, read the PCI DSS v4.0.1 buyer's guide.

Area | Without Paytia | With Paytia
Self-assessment | SAQ D (329 controls) | SAQ A (22 controls)
Network in scope | Most of your stack | None
Call recordings | Pause-and-resume or redact | No restrictions
Staff training | Mandatory and recurring | None required

Who uses it

If you take card payments on a phone call and want the agent engaged through the payment step, this fits.

Contact centres

Agents stay engaged through the payment step — useful for upsell, retention, or any conversation where the call doesn't naturally pause.

  • Conversational throughout the capture
  • Works with any CCaaS
  • No secure-room build-out

See contact centre PCI compliance →

Financial services

Premiums, excesses, repayments, top-ups — taken on the phone with the agent still able to talk the customer through.

  • FCA-aligned data handling
  • Card data never on your network
  • Drops you to SAQ A

Utilities

High-volume bill payments and recurring set-ups where the agent needs to confirm the account, the amount, and the schedule on the same call.

  • Bill payments and arrears
  • Recurring payment set-up
  • Same flow at scale

Charities

Donations and recurring gifts captured live during fundraising calls without the donor reading their card aloud.

  • Live one-off donations
  • Recurring gift set-up
  • Donor card data never stored

Frequently asked questions

What is DTMF masking?

When a customer types card details on their phone keypad, every keypress generates a DTMF tone in the audio. DTMF masking replaces those tones with a flat sound in real time, before they reach your agent or your call recording. The card data goes straight from the customer's handset to Paytia and on to your payment gateway. You'll also hear it called DTMF suppression — it's the same thing.

What's the difference between DTMF masking and DTMF suppression?

Nothing — they're two names for the same technology. Vendors differ on which one they use in their marketing. We used to call it DTMF suppression ourselves; most of our customers search for DTMF masking, so that's what we lead with now. Both describe the same thing: intercepting the keypad tones in real time so they never reach your agent's audio or your call recording.

How is it different from Channel Separation?

Both keep card data out of your business and drop you to SAQ A. The difference is what the agent does. With DTMF masking the agent stays on the live audio throughout — they can talk the customer through the capture and pick up the conversation immediately afterwards. With Channel Separation the agent's audio path goes off-line during capture and voice prompts run the flow. Pick DTMF masking if you want the agent engaged through the payment step.

Does it work with my phone system?

Yes — modern CCaaS platforms (Genesys, Five9, Amazon Connect, NICE CXone, 8x8, RingCentral, Talkdesk), traditional PBX, and plain SIP/VoIP trunks. Integration is via API or SIP. Most setups are live within a week.

How does it reduce PCI DSS scope?

Card data never enters your network, your agents, or your call recording. Most businesses move from SAQ D (329 controls) to SAQ A (22 controls). The recording system stops being in scope because there's no card data in it to begin with.

Is agent training required?

A little — there's a one-click action per call. The agent enters the amount, presses one key to start the capture, then watches a progress indicator on screen until the payment authorises. That's the whole behaviour change; most teams pick it up inside a single shift. If you want zero agent training, Channel Separation is the variant to look at — the platform drives the capture automatically, and the agent does nothing during the payment step.

Can DTMF masking be used for MOTO payments?

Yes. It's built for card-not-present transactions over the phone — agent-assisted sales, mail-order, telephone-order, anywhere a customer would otherwise read card details over a call.

Comparing approaches

If you're still weighing the options, our side-by-side comparison of DTMF masking and channel separation walks through the trade-offs — agent involvement, audit posture, and what your team needs to learn — in plain English.

Want to see it on your telephony?

We'll set up a demo against the same phone system and gateway you already run. Most businesses are taking live payments within a week.

PCI DSS Level 1
Cyber Essentials Plus

Trusted by law firms, insurers, healthcare providers and regulated businesses worldwide. Learn more about Paytia

Related solutions

Other ways to take payments in this channel.