Challenge/Response Systems

In a hostile environment, it can be useful to respond to a message from an untrusted source with a "Challenge" that requires a "Response" from that source before considering the source to be more trustworthy.

This is especially helpful in situations where the source of the message cannot be otherwise validated, yet where proper handling of the message requires some degree of trust that the claimed sender is in fact the real sender.

Challenge/Response (C/R) systems have been designed into some protocols from the ground up, and work fairly well in those environments.

For example, a TCP connection initiated by A to B requires a handshake in which B effectively challenges A by sending a packet addressed to A and not otherwise considering the connection to have been made until A responds in a fashion that reasonably assures B that A, and not some other entity claiming to be A, received that particular packet. In TCP, however, this is not really considered a C/R system per se, because it also serves, among other things, to mitigate the problems that can result from transmission errors (A's initial packet trying to connect to B might be garbled such that another entity receives that request, or B might have received a garbled address for A).
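The handshake idea can be illustrated with a toy model. This is only a sketch of the principle, not of real TCP: the class name `Listener`, its methods, and the use of a bare address string are assumptions made for illustration. B sends a random number to the claimed address and accepts the connection only if the matching acknowledgement comes back.

```python
import secrets

class Listener:
    """Toy model of host B: a peer is trusted only after echoing B's number."""

    def __init__(self):
        # claimed address -> acknowledgement number B expects to see
        self.pending = {}

    def on_syn(self, claimed_addr):
        # B picks an unpredictable number and sends it to the claimed address,
        # so only whoever really receives mail at that address learns it.
        isn = secrets.randbelow(2**32)
        self.pending[claimed_addr] = (isn + 1) % 2**32
        return isn

    def on_ack(self, claimed_addr, ack):
        # The connection counts as made only if the expected value comes back.
        expected = self.pending.pop(claimed_addr, None)
        return expected is not None and ack == expected
```

An entity that merely claims to be A never sees the number B sent, so it cannot produce the right acknowledgement except by guessing.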

In SMTP-based email, however, C/R systems identify themselves as such. They are add-ons that I believe do more harm than good. Many others share that belief, and mailing lists devoted to (or caught up in) how to cope with the problems posed by unsolicited bulk email (UBE), often called "spam", have featured long-running debates about C/R systems.

Questions and Answers

Below I answer some questions about C/R as it pertains to email exchange.

Question: Are Challenges sent by Challenge/Response (C/R) systems spam?

My answer: Essentially, yes. But this ends up being a question of semantics: precisely what is meant by "spam", and what is it about spam that we don't like?

I use the word "spam" to mean unsolicited (usually bulk) mail that is primarily intended to promote (advertise) a product or a viewpoint.

Many people use it in roughly the same way. Some exclude mail promoting merely a "viewpoint", so mail promoting religion, a political candidate, or the latest urban legend isn't called "spam" by them. But people who dislike spam usually dislike viewpoint-pushing bulk mail (especially if they receive similar quantities of it for viewpoints in which they have no interest), so they call it "spam" once they receive too much of it.

Some "spam" doesn't seem to promote anything in particular. Examples include emails that some of my domains recently received, addressed to a nonexistent user "iamjustsendingthisleter". Such messages seem incapable of doing anything more than perhaps triggering a response to their alleged senders (which might be part of a scheme to learn more about targeted SMTP servers in order to support spammers).

For most people, the "problem" with spam isn't so much that it advertises a product. It's that spam arrives in huge quantities ("bulk"); is substantially insensitive to whether the recipient actually wants to receive it ("unsolicited"); there often is no reliable way to request it no longer be sent (or not sent in the first place); it might have a malicious payload ("vermin", which includes viruses, worms, and trojan horses); its "pushers" exploit the relatively low cost of sending versus the higher costs of receiving (including sifting through the deluge to find out what is worth reading), usually, but not always, because they are sufficiently funded to do so (hence most spam does advertise a product); it can crowd out desired mail by consuming resources needed to transmit mail; and spammers usually don't care as much as "real" users when the email they send doesn't get delivered in a timely fashion, or at all.

Because these problems also exist for mail that not everyone defines as "spam", the acronym "UBE" is often used to describe email that is widely seen as undesirable, in contrast to "solicited" mail, even solicited mail sent in bulk. (Sometimes the acronym "UBM" is used instead.) "UBE" is therefore something of a "value-neutral" description of email, making it more useful in some forums.

Challenges sent by C/R systems meet the criteria for why spam is unwanted nearly perfectly:

Therefore, it is not unreasonable to characterize Challenges as "spam", especially when using "spam" as roughly equivalent to the acronyms UBE or UBM or when the Challenges themselves contain advertising.

Question: Why can't people just accept C/R as a solution to the UBE problem? Can't they just filter out unwanted Challenges, the way they presumably filter out unwanted UBE?

My answer: They can filter out Challenges by hand, but Challenges add to their UBE without their necessarily having done anything to trigger their being sent.

With SMTP, there is no way to automatically discern Challenges from other forms of UBE, partly because SMTP does not provide a standard way to represent a Challenge, and especially because UBE can masquerade as legitimate Challenges.

(As a thought-experiment, consider what would happen if C/R were indeed the only "solution" to the problem of UBE deployed on the Internet. That is, suppose nothing were ever filtered, except via a C/R system. Assuming the system were still being "attacked" by UBE, how would we filter out the Challenges that would result, while also distinguishing legitimate Challenges from forged Challenges?)

Because Challenges usually contain some form of advertising, purveyors of C/R systems (including "free" 3rd-party filtering services as well as well-intentioned authors and distributors of free-software solutions) are less inclined to look for ways to make handling of Challenges (by their recipients) fully automated, since that would defeat the purpose of the advertising.

Moreover, problems people have with UBE generally fall into one or both of two categories: recipient (end-user) filtering, and reduced availability of resources used by intermediaries other than the recipient.

Recipient filtering means wading through UBE in order to find desired mail. That is annoying, can be time-consuming, and can result in "false positives" (a user might accidentally delete, or otherwise flag as unwanted, an email that would be desired if the recipient weren't having to cope with lots of spam). Challenges can be viewed as UBE, though in theory, if not currently in practice, they could be automatically filtered in ways other UBE can't.

However, resource utilization problems occur because systems that relay mail, back it up, archive it, and so on, have to treat unfiltered UBE the same as ultimately desired mail. So resource exhaustion due partly to UBE can delay mail. It can even prevent desired mail from reaching its recipients.

Therefore, any UBE that the recipient is able to easily and automatically filter does not cease to be a "problem" just because of that fact: there remains the other problem of the resources UBE consumes as it makes its way to its recipients (and their filters).

Since Challenges can themselves be UBE, are currently not automatically recognized as Challenges by intermediaries (or recipient MUAs), and cannot themselves be authenticated without themselves being Challenged, wide deployment of C/R would actually make the UBE problem worse.

Question: Can the problems with C/R be mitigated by improving or replacing SMTP?

My answer: For the most part, no.

There are ways to mitigate some of the problems with C/R, given a theoretical "clean slate" of an email system (with no installed base), perhaps even a clean slate for DNS. Some of these theoretical improvements could lead to corresponding extensions to SMTP.

In particular, the "malicious payload" problem can be solved by limiting untrusted content in Challenges to tightly controlled fields such as Message-ID and date/time. But that can lead to other problems for certain users, such as mobile users.

Generally, sending email would end up requiring not only that the sender deliver the email to the recipient, but also that the sender update a (possibly remote and temporarily unreachable) data base to indicate that the mail had been sent by the sender, in case the recipient queried that data base via a Challenge.

Another way to view this: in SMTP, a legitimate sender is required to run an SMTP server in order to receive bounces that might be sent when mail delivery fails beyond the initial acceptance of responsibility by an intermediary. In practice, most legitimate mail is likely to reach its target, so legitimate senders need not maintain a tight (low-latency, high-reliability) connection to their SMTP server in order to send mail.

With C/R, however, a legitimate sender must run a server at all times between sending any email and confirming that it was finally accepted, because a Challenge can be sent to that server in response to any email sent by that sender. Until the server has received the Challenge and either delivered it to the sender (so the sender can confirm with a Response) or confirmed that it has already received a notification from the sender that the Challenge deserves an appropriate Response, delivery of the message will be delayed.

That implies a much tighter connection, or coupling, between the sender and any server operating on behalf of the sender and thus designated to receive Challenges, when compared to traditional SMTP, in which the server is needed only to receive bounces (and legitimate responses to the messages sent by the sender).

Further, to be sufficiently reliable, a C/R system would presumably employ an underlying delivery system, such as SMTP, that repeatedly attempts to deliver Challenges when the intended recipient (a server acting on behalf of a potential sender) cannot be reached.

Accordingly, any potential legitimate sender must always run a server that acknowledges receipt of Challenges (without necessarily issuing the expected Responses).

Otherwise, undeliverable Challenges can exhaust resources needed by intermediate systems (as well as the original recipient that issued the Challenge) to exchange "real" mail, because they become "queued up" while waiting for their intended recipients to acknowledge receiving them. (I've seen evidence that spammers forge sender addresses that point to nonresponsive or nonexistent SMTP servers, resulting in huge backlogs of bounce messages awaiting successful or permanent failure of delivery to those servers until such time as they expire from the queue.)

For a sender's server to be able to positively determine whether to automatically Respond to a Challenge, it must have a constant connection to the sender (that is, to any device from which the sender might send email).

When that constant connection is unavailable — whenever the sender cannot be reached in order to confirm that a given message was sent by that sender — the server receiving a Challenge pertaining to that sender must either internally queue the query to retry it later, or temporarily reject delivery of the Challenge and so let the challenger and/or its intermediaries queue and resend it later. Either approach contributes to the resource-utilization problem.
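To make the coupling concrete, here is a minimal sketch of the decision such a server faces. Everything in it is hypothetical: the names `SenderServer`, `Verdict`, `record_sent`, and `handle_challenge`, and the idea of keying on a Message-ID, are assumptions for illustration rather than a description of any deployed system.

```python
from enum import Enum

class Verdict(Enum):
    RESPOND = "respond"   # the server knows the sender sent this message
    REFUSE = "refuse"     # records are complete and the message is unknown
    DEFER = "defer"       # the sender cannot be consulted right now

class SenderServer:
    """Hypothetical server that answers Challenges on a sender's behalf."""

    def __init__(self):
        self.sent_ids = set()         # Message-IDs the sender has reported
        self.sender_reachable = True  # is the sender's device connected?
        self.deferred = []            # Challenges waiting for the sender

    def record_sent(self, message_id):
        # The sender must report every message it sends, which is the
        # extra database update described above.
        self.sent_ids.add(message_id)

    def handle_challenge(self, message_id):
        if message_id in self.sent_ids:
            return Verdict.RESPOND
        if not self.sender_reachable:
            # The sender may have sent the message but not yet reported it,
            # so the Challenge sits in a queue and consumes resources.
            self.deferred.append(message_id)
            return Verdict.DEFER
        return Verdict.REFUSE
```

In this sketch the `deferred` list is where the resource-utilization problem shows up: every Challenge that arrives while the sender is out of reach has to be held somewhere, either here or in the challenger's retry queue.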

A sufficiently "clean" C/R implementation strikes me, offhand, as little different from IM2000, which has long been pushed as a solution to the spam problem. That proposal has yet to see wide deployment, despite the relative ease with which it can be implemented, and for good reasons: legitimate senders don't want IM2000 because it requires them to be more tightly coupled to servers compared to SMTP; and legitimate receivers don't want IM2000 because they don't want to incur the comparatively high-latency overheads required to simply read each email, due to having to fetch it from a remote message-store server, yet they won't likely know whether it's spam without first fetching it.

So I believe C/R is fundamentally broken as a concept. But it works pretty well in practice, especially in certain special circumstances; hence its attraction and the many enthusiasts who support it.

Question: Why are you picking on C/R systems in particular? Do other anti-spam solutions also bother you?

My answer: C/R is a tempting target mainly because it has been so widely promoted as a solution to spam, and seemingly would "obviously work" for most users who initially adopt it; yet, upon further examination, it is revealed to be a "cure worse than the disease" if adopted widely enough to stop (or even significantly reduce) spam for everybody.

I have come to believe the Golden Rule, "Treat others as you would have them treat you", is as much of an engineering, or scientific, imperative as it is a religious one. Certainly it helps clarify the utility of any proposed architecture, design, or implementation to take an objective viewpoint and consider the effects on all parties — that is, to go beyond "well, it works for me".

C/R systems fail this test mainly because people who are unwilling to sift through their own in-box to find "desired" email, and thus justify their use of a system (such as C/R) that Challenges others to help do that for them, are unlikely to, themselves, be willing to sift through large amounts of Challenges should their email address be widely forged by spammers (as mine have been).

At a higher level, C/R systems represent a point (or a collection of points) along a continuum of potential ways to handle validating the source of email:

<- None | Sender Domain Exists | SPF | Sender Server Responsive | Sender Callout | C/R | IM2000 | POP3/IMAP ->

Proceeding left to right, each approach to validation represents an escalation of a Challenge that is issued to the alleged sender of an email, as well as an escalation in the size and the specificity of the Response expected from that sender. In that sense, all of the systems (even "None") are Challenge/Response systems.

"None" is basically what vanilla SMTP does, except that vanilla SMTP does inherently validate (by use of TCP instead of datagrams) that the sender is in fact at the IP address claimed. So use of TCP implies a low-level Challenge/Response that would not be implied by a complete message (envelope, headers and body) transmitted via UDP.

In this case, the message (unless otherwise filtered) is exchanged with a minimum of delay and resource expenditure. But it might be spam.

The remaining approaches rely on DNS to be working in order to support a "backward lookup", by the recipient, of the domain name used by the sender. (The "forward lookup" is originally done by the sender in order to locate a server to contact in order to send the email, but the DNS components that have to be working for that lookup to work are usually not the same ones that are needed by the recipient to do a backward lookup on the sender's identity.) This adds a point of failure, and a DNS-lookup-sized delay.

"Sender Domain Exists" means the receiving server makes sure that, for a message claiming to come from a given address, the domain named in that address actually exists. If it doesn't, the message is almost certainly spam (or comes from someone at a brand-new domain).

Even if the domain does exist, that doesn't mean the message in question actually originated from anyone at that domain, nor that the domain isn't owned by spammers, so the message might still be spam.
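A rough sketch of such a check follows. The function names are mine, and the default resolver is an approximation: it asks the operating system whether the name resolves to an address at all, whereas a real mail server would more likely look for MX records, which Python's standard library does not query.

```python
import socket

def domain_of(address):
    """Return the lowercased domain part of an address, or None if malformed."""
    local, sep, domain = address.rpartition("@")
    return domain.lower() if sep and local and domain else None

def system_resolver(domain):
    # Approximation: does the name resolve to any address at all?
    try:
        socket.getaddrinfo(domain, None)
        return True
    except socket.gaierror:
        return False

def sender_domain_exists(envelope_sender, resolves=system_resolver):
    # `resolves` is injectable so the check can be exercised without DNS.
    domain = domain_of(envelope_sender)
    return domain is not None and resolves(domain)
```

Note that a `False` result can also mean the recipient's own DNS is failing at that moment, which is the added point of failure mentioned above.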

SPF means that the sender's domain publishes valid sources for email coming from its user base. If the domain publishes SPF records but doesn't list the actual source (ultimately an IP address) of a particular email, the server can assume the message is almost certainly spam (or was sent from a location not accommodated by the domain's SPF record, perhaps as a result of the message having been forwarded by an unlisted server).

Even if the source of the message is valid according to SPF, that doesn't mean the message in question actually originated from anyone at that domain, since spammers might have access to injection points allowed by SPF, nor does it mean the domain isn't owned by spammers, so the message might still be spam.
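As an illustration only, here is a deliberately simplified evaluation of an SPF record. It handles just the "ip4", "ip6", and "all" mechanisms and silently skips everything that would need further DNS lookups ("a", "mx", "include", and so on), so it is not a conforming SPF implementation. The function name `spf_result` is an assumption, and the record text is passed in rather than fetched from DNS.

```python
import ipaddress

RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

def spf_result(record, source_ip):
    """Evaluate ip4/ip6/all mechanisms of an SPF record against source_ip."""
    ip = ipaddress.ip_address(source_ip)
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record at all
    for term in terms[1:]:
        qualifier = "+"  # default qualifier when none is written
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6"):
            try:
                matched = ip in ipaddress.ip_network(value, strict=False)
            except ValueError:
                continue  # malformed network; ignored in this sketch
        elif name == "all":
            matched = True
        else:
            continue  # mechanisms needing DNS are skipped in this sketch
        if matched:
            return RESULTS[qualifier]
    return "neutral"  # nothing matched
```

A "pass" here says only that the source address is one the domain lists. As noted above, it says nothing about who injected the message at that source.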

The remaining approaches further rely on a server operating on behalf of the sender to be working, in order to support some kind of response from that server. This adds another point of failure, and a client/server-exchange-sized delay.

"Sender Server Responsive" means that a server running at the sender's domain is reachable and thus appears at least willing to accept bounces, abuse complaints, etc. So the likelihood of being able to quickly bounce a message that is accepted and then isn't finally deliverable is much higher, suggesting that the source of the message is a "good netizen".

However, some server operators, upon noticing repeated incoming connections that don't attempt to deliver legitimate email, might choose to block further connections coming from anyone employing this technique, as it is considered potentially abusive (it can target innocent-bystander servers).

(In some ways, this approach is orthogonal to using SPF; but it requires both DNS and an SMTP server to be working, whereas SPF requires only DNS to work.)

Even if the server is running, that doesn't mean the message in question actually originated from anyone at that domain, nor that the domain isn't owned by spammers, so the message might still be spam.

The remaining approaches further rely on the server operating on behalf of the sender to be fully aware of the existence or non-existence of that sender at most or all times, in order to support a useful response from that server regarding the validity of the sender. This adds yet another point of failure, and a client/server-exchange-sized delay.

Sender Callout, also called Sender Address Validation (SAV), means that the server running at the sender's domain is tested to see whether it recognizes the alleged sender's address as a legitimate address for receiving email. This is done either via the SMTP "VRFY" request or via the SMTP "MAIL" and "RCPT" requests. The former is supported in a workable fashion by only very few servers. The latter is considered abusive if it isn't followed by an actual email to be delivered (and, at that point, the server using Sender Callout has no such email prepared). It also assumes the server is willing and able to respond to the "RCPT" request with a definitive indication of the legitimacy of the email address given, which some servers cannot or do not: for example, qmail normally cannot, and because dictionary attacks can exploit a willingness and ability to provide such information, many servers do not provide it.

Some server operators, upon noticing repeated incoming connections that don't attempt to deliver legitimate email after sending "RCPT", might choose to block further connections coming from anyone employing this technique, as it is considered potentially abusive (it can target innocent-bystander servers).

And even if the server is willing and able to confirm the sender address is legitimate, that doesn't mean the message in question actually originated from anyone at that domain, nor that the domain isn't owned by spammers. So the message might still be spam.
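The "MAIL"/"RCPT" variant can be sketched as below. The function name `callout` and the three result strings are my own choices. The `session` argument is assumed to be any object offering `helo()`, `mail()`, `rcpt()`, and `quit()` in the style of Python's `smtplib.SMTP`, which returns a `(code, message)` pair from each of those calls.

```python
def callout(session, alleged_sender):
    """Ask the sender's server whether it would accept mail for alleged_sender.

    Returns "accepted", "rejected", or "unknown".
    """
    try:
        session.helo()
        # Use the null reverse-path, as a bounce would.
        code, _ = session.mail("")
        if code // 100 != 2:
            return "unknown"
        code, _ = session.rcpt(alleged_sender)
    finally:
        # Hang up without sending any message. This is the step that
        # server operators may regard as abusive.
        session.quit()
    if code // 100 == 2:
        return "accepted"
    if code // 100 == 5:
        return "rejected"
    return "unknown"  # temporary failure or unexpected reply
```

Against a live server, `session` might be something like `smtplib.SMTP(host, 25, timeout=10)` for a host chosen by the caller. Even then, "accepted" is weak evidence, since servers that accept every address at "RCPT" time will always answer positively.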

The remaining approaches further rely on the server operating on behalf of the sender to be in "close communication" with that sender at most or all times, in order to support a useful response from that server regarding whether a message claimed to have been sent from the sender actually originated from that sender. This adds yet another point of failure, and a client/server-exchange-sized delay.

"C/R" is where C/R systems fall on this continuum. The server running on behalf of an alleged sender of a message is sent a Challenge. If the server believes the message that has been Challenged indeed originated from the alleged sender as intended by that sender, it sends a Response. It does this either by forwarding the Challenge to the sender (this is the way most early C/R systems worked) or by keeping track of all messages sent by the sender and responding automatically to Challenges without bothering the sender (this is how C/R systems would have to work if widely deployed).

But even if the server is able to confirm the sender sent the original message and sends a Response, that doesn't mean the sender personally intended to send the original email (the sender's computer might be out of his control, perhaps infected by vermin), nor does it mean the domain isn't owned by spammers. So the message might still be spam.
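Seen from the recipient's side, a C/R system might look roughly like the following sketch. The class and method names, the dictionary used to represent a Challenge, and the HMAC-based token are all assumptions for illustration. Real C/R products differ in how they hold mail and what their Challenges contain.

```python
import hashlib
import hmac
import secrets

class ChallengingRecipient:
    """Hypothetical recipient-side C/R filter that holds mail until confirmed."""

    def __init__(self):
        self._key = secrets.token_bytes(32)  # secret used to mint tokens
        self.held = {}    # message_id -> (envelope_sender, body)
        self.inbox = []   # messages released to the user

    def _token(self, message_id):
        return hmac.new(self._key, message_id.encode(), hashlib.sha256).hexdigest()

    def receive(self, message_id, envelope_sender, body):
        # Hold the message and build a Challenge addressed to the *claimed*
        # sender, who may be a forgery victim rather than the real sender.
        self.held[message_id] = (envelope_sender, body)
        return {"to": envelope_sender,
                "message_id": message_id,
                "token": self._token(message_id)}

    def on_response(self, message_id, token):
        # Release the held message only if the Response carries the token
        # that was sent out in the Challenge.
        if message_id in self.held and hmac.compare_digest(
                token, self._token(message_id)):
            self.inbox.append(self.held.pop(message_id))
            return True
        return False
```

The weak spot is visible in `receive`: the Challenge goes wherever the envelope sender says, so a spammer who forges that address turns the Challenge itself into unsolicited mail for a bystander.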

The remaining approaches rely on the server running on behalf of the alleged sender to have a copy of the message. This adds yet another point of failure, and a message-exchange-sized delay.

"IM2000" means the server acting on behalf of the sender actually hosts the message being sent, so the sender need send only a notification of the message's existence to the recipient, leaving the recipient with only the task of retrieving the message from the server at his convenience. In this case, the server pays much of the cost for hosting spam (on behalf of its users, if they're spamming).

This is in many ways indistinguishable from the role a typical ISP's SMTP relay serves in a vanilla-SMTP setup, where dynamic-IP users hosted by that ISP are not allowed to send outgoing email directly to the Internet via port 25 (by the ISP or at least by certain other ISPs), but are directed to inject email into the sender's host ISP's SMTP relay. So we "recurse" on the original problem, with the server in the role of "sender"; how does a recipient know whether a message hosted by that server is spam?

But even if the server has a copy of the message, that doesn't mean the sender personally intended to send the original email (again, the sender's computer might be out of his control), nor does it mean the domain isn't owned by spammers. So the message might still be spam.
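The division of labor in an IM2000-style design can be sketched in a few lines. This is my own toy model, not code from any IM2000 proposal: `SenderStore`, `submit`, `fetch`, and `stored_bytes` are hypothetical names, and the notification is just a small dictionary.

```python
class SenderStore:
    """Toy model of a sender-side message store in an IM2000-style system."""

    def __init__(self):
        self._messages = {}  # message id -> (recipient, body)
        self._next = 0

    def submit(self, sender, recipient, body):
        # The store keeps the body. Only a small notification travels
        # to the recipient.
        self._next += 1
        message_id = f"msg-{self._next}"
        self._messages[message_id] = (recipient, body)
        return {"id": message_id, "from": sender, "to": recipient}

    def fetch(self, message_id, requester):
        # The recipient pulls the body at his convenience.
        recipient, body = self._messages[message_id]
        if requester != recipient:
            raise PermissionError("not the addressed recipient")
        return body

    def stored_bytes(self):
        # Storage cost carried by the sender's side until messages are fetched.
        return sum(len(body.encode()) for _, body in self._messages.values())
```

Nothing in the notification tells the recipient whether the body is worth fetching, which is the latency complaint raised earlier about IM2000.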

The remaining approach(es) rely on a server running on behalf of both the recipient and the sender, and handling all the recipient's incoming email from that sender (and thus doing the filtering). This adds yet another point of failure and a message-exchange-sized delay, as well as a poll-rate-sized delay.

"POP3/IMAP" means a server acting on behalf of the recipient hosts the notification for each incoming message, as well as the message itself; the recipient periodically polls the server to see whether any new email has arrived. In this case, the POP3/IMAP server must be careful to not accept responsibility for any message that is spam, else it bears the cost of hosting spam on behalf of its users, who are not necessarily spammers. This "works" only if the server is not hosting spam on behalf of senders of spam.

This approach is akin to insisting on receiving email only from one (or several) "big-name email hosts", either by having an email address explicitly hosted there, or by accepting email only from such hosts.

But, in this scenario, the burden of avoiding spam is shifted onto the POP3/IMAP server when it is receiving incoming email. So we "recurse" on the original problem, with that server in the role of "recipient".

Further, though we might assume the POP3/IMAP server isn't itself owned by a spammer, there are "free" hosting services that nevertheless add advertising content to incoming and/or outgoing email. So it is unclear how a vendor of this type would successfully resist allowing some "standalone" spam to be injected into their system in order to defray the substantial expense of hosting email on behalf of huge numbers of users while "blocking spam".

Also, to the extent any such service successfully hosts email for a large number of users and still blocks spam, it becomes a tempting target for spammers to attack — from without or within, using technology-based attacks, using litigation, using "insiders" (including corporate officers), and so on.

Accordingly, a message fetched from such a server might still be spam, or would likely contain spam, in order to support its "free" use.

Note carefully: After adding all of the "checks" shown on the continuum, along with the corresponding points of failure and delays, the end result is the same: "a message might still be spam". And the email system as a whole, while becoming much more brittle, would still be an environment in which spammers flourish, as they would be freely able to choose through which of the added "hoops", if any, to jump.
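The accumulation can be tallied with a small table. The wording of each dependency is my paraphrase of the paragraphs above, and `points_of_failure` is a hypothetical helper name.

```python
# (approach, dependency newly added at this point on the continuum, or None)
CONTINUUM = [
    ("None", None),
    ("Sender Domain Exists", "DNS lookup of the sender's domain"),
    ("SPF", None),  # still needs only DNS
    ("Sender Server Responsive", "sender's server is reachable"),
    ("Sender Callout", "sender's server knows which senders exist"),
    ("C/R", "sender's server is in close communication with the sender"),
    ("IM2000", "sender's server holds a copy of the message"),
    ("POP3/IMAP", "a server accepts, filters, and is polled for all mail"),
]

def points_of_failure(approach):
    """List every dependency accumulated up to and including `approach`."""
    deps = []
    for name, added in CONTINUUM:
        if added is not None:
            deps.append(added)
        if name == approach:
            return deps
    raise KeyError(approach)
```

For example, `len(points_of_failure("C/R"))` counts four things that must all be working before a legitimate message gets through, while the conclusion at every row remains "might still be spam".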

But those of us sending and receiving legitimate email would have to jump through all the hoops. To the extent we fail (or refuse) to do so, legitimate email would not get exchanged, or at least would be delayed. That strikes me as little different from everyone staying at the leftmost point of the continuum, with spam causing mail to be delayed or lost, except that message exchange between sufficiently mutually trusting entities remains reliable and fast by avoiding all the add-ons on that continuum.

That strongly suggests we rethink widespread adoption of any approach that goes beyond the leftmost point on the continuum shown above, and leave validation of sources as a task for end-user recipients to perform for messages they believe demand it of them.

Question: Are you saying automated validation of message sources isn't important?

My answer: Conceptually, for some messages, it is important to determine whether the message actually originated from the claimed originator. (For some of those messages, it can be important to make sure the originator cannot "repudiate" having sent the message, though that's outside the scope of this document.)

One problem with C/R and similarly "heavyweight" proposals is that many legitimate messages don't require validation of the field(s) those proposals attempt to validate.

In such cases, these proposals interfere with successful delivery of messages that might otherwise be self-validating. (Self-validating messages might include "ok, see you at 3", "check out what's showing on Vole News Network right now!", "compare your proposal to", and "Never give a gun to ducks"; the recipient typically either already knows who sent such messages, can determine who sent them by indicators within or external to the content, or simply does not care who actually sent them or even whether the alleged sender actually sent them.)

Another problem with C/R and related proposals is that they validate only the "envelope sender" of an email, to the extent they validate anything at all.

But so-called "phishing" and other attacks do not depend solely on forging the envelope sender. They can also forge the "From:" or other headers within the email, even forge the identities of entities described within the contents of the email.

Consider the following sample email (modified from an actual spam I recently received):

Return-Path: <>
Received: (qmail 25083 invoked from network); 17 Nov 2006 08:59:12 -0000
Received: from unknown (HELO cudo) (@ by with
        SMTP; 17 Nov 2006 08:59:12 -0000
Received: from (HELO by with esmtp
        (U8M0B9>/*0 >=.KM) id 2K7..7-A,X0IM-QF for;
        Fri, 17 Nov 2006 08:59:19 -0060
Date:   Fri, 17 Nov 2006 08:59:19 -0060
From:   "Bill Gates" <>
X-Mailer: The Bat! (v2.00.8) Educational
X-Priority: 3 (Normal)
Message-ID: <>
Subject: Stop being obese and unhappy
X-Spam: Not detected
X-Antivirus: avast! (VPS 0649-0, 2006-11-15), Outbound message
X-Antivirus-Status: Clean

Flabbia -- The newest and most exciting fat loss product available - As scen on Oprah
Do you remember all the times when you said to yourself you would do
anything to get rid of this ugly fat everywhere? Fortunately, now no major
sacrifice is necessary. With Flabbia, the ground-breaking pound-melting
blend, you can get a healthier lifestyle and become really thinner. Have
a look at what people say!"I hate to admit it but I was a junk food addict. I ate all this trash
and just could not stop. This misery stopped when I started taking
Flabbia! God, my appetite decreased, mood improved and I lost 20 pounds
in 2.5 months. I can tell you now I'm a happier person!"Amely S., San Diego
"I had weight problems since a boy. You can't imagine how I hated being
mocked at school. I hated the weight and I hated myself. After trying
this and that I found out about Flabbia. This stuff literally pulled me
out of this nightmare! Thanks and thanks and thanks to you, guys."Dave Klark, Boston
"You know what? Flabbia saved my marriage! I got into this circle,
depression - eating more - more depression. My wife was about to leave
the overweight psycho I was turning in. One of my friends pointed to
your site, and I ordered my pack of Flabbia right away. The results were
great, my appetite became normal, I was in a good mood oftener, and of
course I went some belt holes back. And you know, the sex became
fantastic, too!"Jack
There are loads of testimonials happy people leave after trying Flabbia.
Why don't you join the thousands of joyful beautiful people and try this all-natural,
appetite-suppressing energy boosting product now!
Don't miss your chance!

Besides validating the email address (which might exist), its alleged owner (the name might actually correspond to that address), and that the message originated from the owner's machine (which might be infected), here are just some of the "not-so-obvious" identities that should be validated:

Though the above is an "obvious" example of spam, it illustrates that even a substantially valid email might, within its contents, make a claim that a conscientious person might like to follow up on, requiring validation of the identity (or identities) of the source(s) of the claim.

There is nothing intermediate (3rd-party) email systems can do to avoid the need for conscientious filtering of messages that contain such claims.

They can only reduce the tendency of end users to trust things that appear "technical", such as envelope senders and URLs, to have been already validated, when they are in fact forged. In the above example, that means only the envelope sender, the "From:" header, and the "X-Spam" and "X-Antivirus" headers (which were not added by my server, they were in the original spam) can be automatically checked for any degree of validity, as could any embedded URLs.
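A short sketch shows how little of a message those checkable fields cover. It uses Python's standard `email` package; the function name `claimed_identities` is mine, and treating the "Return-Path:" header as the envelope sender assumes the receiving server recorded it there.

```python
from email import message_from_string
from email.utils import parseaddr

def claimed_identities(raw_message):
    """Extract the sender identities a message claims in its headers."""
    msg = message_from_string(raw_message)
    # Envelope sender, as recorded by the receiving server.
    _, envelope = parseaddr(msg.get("Return-Path", ""))
    # Header sender: the name and address the user is actually shown.
    display, header_addr = parseaddr(msg.get("From", ""))
    return {
        "envelope_sender": envelope,
        "from_address": header_addr,
        "from_display_name": display,
    }
```

C/R and similar schemes test only the first of these three values. The display name, and every person or organization named in the body, remain unvalidated however the envelope check turns out.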

And the best way to reduce the tendency of people receiving email to assume anything has been validated is to explain to them that such validation is simply not done by intermediaries — so they comprehend that nothing is necessarily trustworthy — rather than doing a partial job and leaving end users not knowing which "fields" have been validated and which have not.

Further Reading

Why Challenge-Response is a Bad Idea clearly and concisely describes most of the problems C/R poses, and includes links to similar writings by others. (Note that I don't believe Bayesian filtering is a complete solution to UBE; even if it does a "perfect" job for any particular recipient, it cannot directly reduce the resources consumed delivering UBE to the recipient's filter, and will tend to encourage spammers to increase their spamming as well as tailor it to evade most users' filters.)

Challenge-Response Anti-Spam Systems Considered Harmful provides another useful take on C/R, and I've read and re-read it numerous times in the past. It includes links to similar writings by others. (Note that I don't believe "teergrubing mail servers" are an appropriate solution to UBE; the goal is not to increase resource utilization for transferring email to everybody; rather, I believe the goal should be to minimize the resources needed, including latencies, to transfer desired email from sender to recipient.)



Copyright (C) 2006 James Craig Burley, Software Craftsperson
Last modified on 2007-07-10.