Did You Get My Mail?

(Yes, I’m back, and working through the backlog.)

It seems to me that email bounce messages have been rendered next to useless by viruses and other email-borne malware either forging From: addresses (thereby landing you with all the bounces for the bad email addresses it has collected) or even using fake bounce messages as a vector. The volume of bad bounces I get has reached such a level that (I suspect) the Bayesian filtering system in my copy of Mozilla Mail has started to recognise characteristics of a bounce message as spam-likely. So, any genuine bounces may well get sent to my spam folder. And, when I clean it out, I can’t recognise which bounces might be genuine except by reading every one, because the bounce message does not have the same subject line as the original.

Thought: would it be possible to implement an extension to Thunderbird which recognised genuine bounce messages and flagged them up? The only way I can see this working is by correlating the recipient to which the bounce message relates against the To/CC/BCC headers of recently-sent email.
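As a rough sketch of that correlation (not an actual Thunderbird extension), the idea could be prototyped in Python with the standard email module. The function names `recent_recipients` and `looks_genuine`, the seven-day window, and the decision to compare addresses case-insensitively are all assumptions made for illustration. It also assumes the failed address has already been extracted from the bounce somehow, which is the hard part.

```python
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

RECIPIENT_HEADERS = ("To", "Cc", "Bcc")


def recent_recipients(sent_messages, now, window=timedelta(days=7)):
    """Collect lower-cased To/Cc/Bcc addresses from mail sent within `window`.

    `sent_messages` is an iterable of raw message sources (strings),
    e.g. read from a sent-mail folder. The 7-day default is an assumption.
    """
    seen = set()
    for raw in sent_messages:
        msg = message_from_string(raw)
        date_header = msg.get("Date")
        if date_header is None:
            continue
        try:
            sent_at = parsedate_to_datetime(date_header)
        except (TypeError, ValueError):
            continue  # unparseable Date header: skip this message
        if sent_at.tzinfo is None:
            # Treat dates without a usable zone as UTC (an assumption).
            sent_at = sent_at.replace(tzinfo=timezone.utc)
        if now - sent_at > window:
            continue
        for header in RECIPIENT_HEADERS:
            values = msg.get_all(header, [])
            for _name, addr in getaddresses(values):
                if addr:
                    seen.add(addr.lower())
    return seen


def looks_genuine(failed_address, recipients):
    """True if the bounced address is one we recently wrote to."""
    return failed_address.lower() in recipients
```

A bounce whose failed recipient is not in the returned set could then be treated as a likely fake.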

The problem is that, as far as I can see, there’s no standard header in bounce messages which tells you the failed email address. Even the From: address of the bounce isn’t guaranteed to be at the same domain.

For example, if you send mail to thiswillbounce@gerv.net, then you’ll get a bounce message back from Mail Delivery System <Mailer-Daemon@tuschin.blackcatnetworks.co.uk> – nothing to do with gerv.net. The only new header seems to be X-Failed-Recipients: markham-gerv-thiswillbounce@tuschin.blackcatnetworks.co.uk – again, you can’t get thiswillbounce@gerv.net from that.

Sometimes, you do get a full copy of the original message, so in that case you can grep for the To or CC lines. But I think that, even for those MTAs which send you back a copy, some send it back inline and others as an attachment. More places to look. Of course, if we signed all our email, we could look for our own signature on the bounced copy…
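To give a feel for the "more places to look" problem, here is a minimal Python sketch that hunts for a returned copy in three places: a message/rfc822 attachment, a text/rfc822-headers part, and inline in a plain-text body. The inline detection is a guess (it assumes the copy begins at the first line that looks like a common header), and the names `returned_copies` and `original_recipients` are made up for this example.

```python
import re
from email import message_from_string
from email.utils import getaddresses

# Heuristic (an assumption): an inline copy starts at the first line that
# looks like one of these common message headers.
HEADER_START = re.compile(
    r"^(?:Return-Path|Received|Message-ID|From|To|Date|Subject):",
    re.IGNORECASE | re.MULTILINE,
)


def _text_of(part):
    """Decode a non-multipart part's body to text, tolerating bad charsets."""
    raw = part.get_payload(decode=True)
    if raw is None:
        return ""
    charset = part.get_content_charset() or "utf-8"
    try:
        return raw.decode(charset, errors="replace")
    except LookupError:
        return raw.decode("utf-8", errors="replace")


def returned_copies(msg):
    """Yield Message objects carrying the headers of any returned original."""
    ctype = msg.get_content_type()
    if ctype == "message/rfc822":
        payload = msg.get_payload()
        if isinstance(payload, list) and payload:
            yield payload[0]  # the attached original message
    elif msg.is_multipart():
        for part in msg.get_payload():
            yield from returned_copies(part)
    elif ctype == "text/rfc822-headers":
        yield message_from_string(_text_of(msg))
    elif ctype == "text/plain":
        text = _text_of(msg)
        match = HEADER_START.search(text)
        if match:
            # Parse from the first header-looking line onwards.
            yield message_from_string(text[match.start():])


def original_recipients(bounce_source):
    """Return lower-cased To/Cc addresses found in any returned copy."""
    bounce = message_from_string(bounce_source)
    found = set()
    for copy in returned_copies(bounce):
        values = copy.get_all("To", []) + copy.get_all("Cc", [])
        found.update(addr.lower() for _n, addr in getaddresses(values) if addr)
    return found
```

Bcc recipients will not normally appear in a returned copy, so this can only ever confirm To and Cc addresses.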

I suspect one would be reduced to parsing the body and looking for common text for each major brand of mail server software (“Your message to x@y.com has…”). But, given that such messages are customisable and localisable, that also seems to me like a hiding to nothing.

Do we need an X-Failed-Recipient: header which gives the exact original email address from the To:/CC:/envelope? That way, fake bounces could be binned.

Even if we had that, we also still have the problem of defining “recently sent email”. I have two machines from which I access my IMAP accounts. They have different address books, and Collected Addresses. The only way to know who I’ve emailed in the past week is to search my sent-mail folder. This is all beginning to sound like a Fairly Hard Problem.

5 thoughts on “Did You Get My Mail?”

  1. There is actually a standard for doing this kind of thing – RFC 1891, together with its companion RFC 1894. RFC 1891 covers specifying the original recipient at the SMTP level, as well as identifying a particular message. RFC 1894 defines a format for bounce messages which is easily parsable. Unfortunately not all that many servers support it.

    If you’re reduced to parsing bounce messages, I would have thought the best way to do that would be to look for the message-id, which should be unique for each message, and should make it easier to match up with an outgoing message. But that still has the problems you mention…

  2. That reminds me, did you get the last relicensing patch I sent you? I’m not sure of the date, but it was before you went away. I’m worried it may have got caught by your spam filters.

  3. Michael is right: it wouldn’t be too hard to write code to find the Message ID in the body of the bounce. So the only remaining problem is a canonical list of message IDs for messages sent by a particular user. Each client would need to keep its own cache based on regularly checking the sent-mail folder, and also intercepting all mail sends (to make sure you caught the immediate-bounce case)…
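Both suggestions from the comments can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function names are invented, the Message-ID pattern is deliberately loose (it will also match any angle-bracketed address, which is harmless because the result is intersected with the IDs of mail actually sent), and the recipient lookup only works when the bounce contains a standard message/delivery-status part with an Original-Recipient or Final-Recipient field.

```python
import re
from email import message_from_string

# Loose pattern for "<something@something>"; also matches bracketed
# addresses, which is acceptable because we only keep known sent IDs.
MESSAGE_ID = re.compile(r"<[^<>\s@]+@[^<>\s@]+>")


def referenced_message_ids(bounce_source):
    """Return every Message-ID-shaped token anywhere in the raw bounce."""
    return set(MESSAGE_ID.findall(bounce_source))


def matches_sent_mail(bounce_source, sent_ids):
    """True if the bounce mentions the Message-ID of something we sent.

    `sent_ids` is the client's cache of IDs, e.g. built from the
    sent-mail folder (how that cache is kept is outside this sketch).
    """
    return bool(referenced_message_ids(bounce_source) & set(sent_ids))


def dsn_recipients(bounce_source):
    """Return failed addresses reported in a message/delivery-status part.

    Python's parser exposes each header block of such a part as a nested
    message, so walking the tree reaches the per-recipient fields.
    Original-Recipient is preferred when the server supplies it.
    """
    bounce = message_from_string(bounce_source)
    found = set()
    for part in bounce.walk():
        value = part.get("Original-Recipient") or part.get("Final-Recipient")
        if value is None:
            continue
        # Field format is "address-type; address", e.g. "rfc822; a@b.example".
        _addr_type, _sep, addr = value.partition(";")
        addr = addr.strip()
        if addr:
            found.add(addr.lower())
    return found
```

A client could accept a bounce as genuine if either check succeeds, and fall back to body parsing only when neither does.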