SPAM

Email Spam Coordination Across Different Services?

Surprise! It’s been 1,743 days since my last post. This isn’t a music post, so even if you are momentarily pleased to see a new one pop up in your feed, you’ll likely be disappointed by this “techie” one…

Disclaimer: this is 100% speculation on my part, I have done zero research to see if my theory is correct. I’m sharing it mostly to remember these thoughts as they occur, and perhaps in the distant hope that someone will either confirm or disprove the hypothesis with real proof.

Hypothesis

There is high-level coordination among disparate email providers to identify bulk senders, in a completely unintuitive but ostensibly very clever manner.

I put the above hypothesis first, so that you can stop reading right now if you have no interest in this topic, because as is my usual style, this is likely to get long, quickly…

Background

I have operated my own email server for over 20 years. My CTO friends think I’m insane (perhaps for other reasons as well, but definitely for running my own email server). For clarity, it’s Postfix, not an email server that I wrote, just one that I operate on a dedicated server (where this WordPress blog is hosted as well).

I’ve been through the wars with email, sometimes caused by my misconfiguration or misunderstanding, but sometimes entirely out of my control (e.g., when my dedicated server was moved to a new data center and its static IP changed, and the new IP had previously landed on many RBL blacklists!).

Over time, I’ve tamed the configuration into a very stable setup. That has included complying with SPF, DKIM, DomainKeys, DMARC, etc. Basically, anything that Google or Microsoft claims will help them validate email from me as a sender, so that they don’t mark it as spam or, worse, just bounce it back to me.

The Problem

While my setup works flawlessly most of the time, on occasion, Lois or I will get a bounce back from someone (typically Google/Gmail, but sometimes Microsoft/Outlook/Hotmail). Once that bounce occurs, we’re often shut out from sending email to that service for a full day (rarely, longer!).

As you can imagine, it’s wildly frustrating to not be able to send an individual mail (this is not spam or bulk mail, but rather one to one emails to friends).

This is the error message we get in the bounce:

               The mail system

XXX@gmail.com: host gmail-smtp-in.l.google.com[74.125.20.26] said:
550 Action not taken (in reply to end of DATA command)

Wow, very helpful, “Action not taken”. Nothing indicating what we did wrong and why Google rejected the email. It feels like it should be a transient error, but it not only persists, it typically stops us from sending any further emails to anyone on that service for the rest of the day.

This has been going on for at least a couple of years. Just not often enough for me to pull out my (one remaining) hair trying to track it down.

What is going on?

Until this week, I literally had no idea (perhaps I should have diagnosed it earlier…). While I can’t say with certainty that my new understanding covers all aspects of this error, I can point with certainty to one use case that causes it, and it might explain every single occurrence that we’ve had.

We (Lois more than I, but I do it too) share identical emails individually with a variety of friends. Specifically, if a group that we love puts out a new music video, Lois will send a link to that video to a group of people, but each will get their own separate email with the same email body and subject, so that they can reply just to us and not be BCC’ed in a large group.

I had never made the connection before that this somehow triggered the bounces. These were short emails, sent to people we’ve emailed hundreds of times, who almost always respond, and whose contact lists we’re likely in. I couldn’t imagine that we were tripping any spam filter.

What do I think is going on?

I now believe (very firmly, with zero proof) that the body of the email (not including the subject, or the receivers) is being hashed. When another email with the identical hash (of the body) comes through (not sure if there is a time-limit or not), the service bounces it immediately with the above error message of “Action not taken”.
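To make the idea concrete, here is a minimal Python sketch of the kind of filter I’m imagining. Everything in it is an assumption for illustration: the choice of SHA-256, the line-ending normalization, and the `DuplicateBodyFilter` class are all hypothetical, and nothing here reflects how Gmail or anyone else actually works.

```python
import hashlib


def body_hash(body: str) -> str:
    """Return a SHA-256 digest of the body with line endings normalized."""
    normalized = body.replace("\r\n", "\n").strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class DuplicateBodyFilter:
    """Hypothetical filter: accept the first copy of a body, reject repeats."""

    def __init__(self) -> None:
        self.seen: set[str] = set()

    def check(self, body: str) -> str:
        digest = body_hash(body)
        if digest in self.seen:
            # Same body seen before, regardless of subject or recipient.
            return "550 Action not taken"
        self.seen.add(digest)
        return "250 OK"


provider = DuplicateBodyFilter()
notice = "Sorry, the house concert in March is cancelled."
print(provider.check(notice))  # first copy is accepted
print(provider.check(notice))  # identical body is rejected
```

Note that the subject and the recipient never enter the hash, which is the whole point of the theory: only the body matters.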

How did I come to this conclusion?

A week ago, I sent out invitations to a number of people to a house concert in March. Most of them went out fine, a very few bounced, so I didn’t have any suspicions over those bounces just yet.

This week, we discovered that one of the band members couldn’t make it, and we agreed with the head of the band that we would simply cancel the show. So, I sent an email to everyone that I had previously written to (except those that already said they couldn’t make it) telling them that the show was cancelled.

The cancellation emails were all bouncing, except for the first one!

I complained to Lois that the cancellations were bouncing, and that we would likely have to wait at least a day for the bounces to clear before I could send them again (still having no idea why they were bouncing).

Lois asked me why I thought the vast majority of the invitations didn’t bounce (to the same people, and those had links to the band in them, so if anything, they would have appeared to be more spammy).

It was a good question, to which I had no answer.

But, my brain often needs to sleep on problems before enlightening me, and indeed, the next morning I woke up with a theory to test.

While the bodies of the invitations were nearly identical, they each started with “Dear XXX,”. So, they couldn’t have hashed into the same exact body. On the other hand, each of the cancellations was identical in every way (copy/paste), without the leading “Dear XXX”. So, they indeed would hash into the same body.

To test the theory, I added back the “Dear XXX” to the cancellations, and sure enough, every single one went out without any bounces!
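A quick Python sketch shows why the greeting matters, assuming (and this is purely my assumption) that the hash is something like SHA-256 over the raw body. The recipient names and the notice text are made up for the example.

```python
import hashlib


def digest(body: str) -> str:
    """Hash a message body; SHA-256 is an assumed stand-in for the real hash."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


notice = "Unfortunately, the show is cancelled.\n"
recipients = ["Alice", "Bob", "Carol"]  # hypothetical names

# Copy/paste bodies: every recipient gets the same digest.
identical = {digest(notice) for _ in recipients}

# A per-recipient greeting makes every body, and so every digest, unique.
personalized = {digest(f"Dear {name},\n\n{notice}") for name in recipients}

print(len(identical))     # 1
print(len(personalized))  # 3
```

Any exact-match hash behaves this way: changing a single character in the body produces a completely different digest, so a one-line greeting is enough to make each email look unique.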

How is that Cross Provider Coordination?

Aha! It turns out that once Gmail (for example) bounced an email, so did Microsoft, Verizon/AOL/Yahoo, Apple (via @mac.com) and likely others (like Comcast).

To be clear, once Gmail bounced an email, if the next email went to hotmail.com, but was the very first such email to any Microsoft address, it was bounced immediately.

That implies (to me, proves) that it’s not just that they too hash incoming emails and bounce duplicates, but that they share the hashes (somehow) among competing service providers, in order to identify bulk senders more quickly.
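Here is a toy Python model of the behavior I observed, not of any real system. The `SharedRegistry` and `Provider` classes are entirely hypothetical, as is the rule that a provider publishes a hash the moment it sees a duplicate.

```python
import hashlib


class SharedRegistry:
    """Hypothetical pool of flagged body hashes that every provider consults."""

    def __init__(self) -> None:
        self.flagged: set[str] = set()


class Provider:
    """Hypothetical provider with its own history plus the shared registry."""

    def __init__(self, name: str, registry: SharedRegistry) -> None:
        self.name = name
        self.registry = registry
        self.seen: set[str] = set()

    def receive(self, body: str) -> str:
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        if digest in self.registry.flagged:
            # Some provider (possibly another one) already flagged this body.
            return "550 Action not taken"
        if digest in self.seen:
            # Second copy at this provider: flag it for everyone.
            self.registry.flagged.add(digest)
            return "550 Action not taken"
        self.seen.add(digest)
        return "250 OK"


registry = SharedRegistry()
gmail = Provider("gmail.com", registry)
hotmail = Provider("hotmail.com", registry)
body = "The show is cancelled."
print(gmail.receive(body))    # 250 OK
print(gmail.receive(body))    # 550 Action not taken
print(hotmail.receive(body))  # 550 Action not taken
```

The last line is the interesting one: the second provider has never seen the body before, yet it bounces the very first copy, which is exactly what happened to me.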

Please don’t ask me to conjecture on how they do that, I don’t have a clue.

Summary

We are both unbelievably relieved to understand what has been going on with these bounces (or at least to delude ourselves into thinking we understand it now).

Postscript

I doubt anyone who normally reads my blog will have an interest in this, but I needed to get it off my chest anyway.

I can only hope that an expert will weigh in and either confirm or convincingly disprove my hypothesis.

Also, I have at least one more (completely picayune) email issue (rant?) that I will share if I get any feedback that this kind of stuff is interesting to anyone…

Converting from Procmail to Maildrop

I’ve been using procmail to filter mail on the server forever. I like it, so it’s important to note that even though I switched, I have nothing bad to say about procmail.

So, why did I switch? Procmail can be a little terse to read (obviously, I’m used to it by now). Over the years, I have built a large set of rules. There is a ton of cruft in there. If I wanted to clean it up, I had to rewrite it. Rewriting it in procmail was definitely a possibility.

But, over the years, I was also aware of maildrop as a filtering solution. It has a cleaner (more accurately, a more straightforward) syntax. The documentation is a little sparse (missing a few key examples IMHO). There are also thousands (if not millions!) of example lines of procmail available on the net, and it can be hard to find complex real-world examples of maildrop filters.

But, I knew that if I rebuilt my filters in maildrop, I’d be forced to rethink everything, since I couldn’t get lazy and just grab hunks of procmail from my current system and plop them into the new one. So, maildrop it was going to be!

One last time, just to make sure I don’t offend lovers of procmail (of which I am one!), everything that I did in maildrop could easily have been done in procmail. I just happened to choose maildrop for this rewrite, and for now, will stick with it. Perhaps if I ever revisit this project, I’ll iterate the next time back in procmail.

The goal of my filters is to toss obvious spam (send it to /dev/null). Likely spam gets sorted into one of three IMAP folders. The only reason I split it into multiple folders is so that I can test rules before turning them into full-blown deletes. Finally, mail that falls through those is delivered to my inbox.

Over the years, I added numerous rules to filter classes of spam (stocks, watches, viagra, insurance, etc.). Without a doubt, I introduced tons of redundancy. I didn’t scan all of the previous rules to see where I might be able to add one more line, because that was more tedious than just adding a new rule.

I was reasonably satisfied with the result, but over time, became less aggressive about deleting mail automatically, preferring to stuff it in a spam folder for visual scanning during the day.

Since I create my own rules (I don’t run a system like SpamAssassin, which I did for a while), I can start to see patterns and simplifications over time, which was the impetus for the rewrite. In other words, there are more commonalities across classes of spam, and I don’t have to spend as much time categorizing things as I was bothering to do.

I’ve now made my first cut of the maildrop-based system. It’s been in production for seven days, and I’m very happy with it so far. The one major change I made is to default to deleting things (in other words, much more aggressive than the previous system), but I keep a copy of all mail in an archive IMAP folder that I will prune through a cron job, and never scan visually.

I review my delete logs once a day, so if I spot an email that looks like I shouldn’t have deleted it, or someone contacts me asking why I didn’t respond, I will be able to check the archive and have the full mail there (for some reasonable period of time).
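For the pruning cron job, something along these lines would do. This is only a sketch under several assumptions that are not stated above: that the archive folder is stored in Maildir format (one file per message under `cur/` and `new/`), that file modification time is a good enough proxy for message age, and that 30 days counts as a “reasonable period”. The path and the `prune_archive` helper are hypothetical.

```python
import time
from pathlib import Path
from typing import List, Optional


def prune_archive(folder: Path, max_age_days: int,
                  now: Optional[float] = None,
                  dry_run: bool = True) -> List[Path]:
    """Return messages older than max_age_days; delete them unless dry_run."""
    current = time.time() if now is None else now
    cutoff = current - max_age_days * 86400
    expired: List[Path] = []
    for sub in ("cur", "new"):  # assumed Maildir layout
        directory = folder / sub
        if not directory.is_dir():
            continue
        for message in sorted(directory.iterdir()):
            if message.is_file() and message.stat().st_mtime < cutoff:
                expired.append(message)
                if not dry_run:
                    message.unlink()
    return expired


if __name__ == "__main__":
    # Hypothetical location of the archive folder; adjust to the real one.
    archive = Path.home() / "Maildir" / ".Archive"
    stale = prune_archive(archive, max_age_days=30)  # dry run: nothing deleted
    print(f"{len(stale)} message(s) older than 30 days")
```

Defaulting to a dry run is deliberate: since the archive is the only safety net for the aggressive deletes, the cron entry has to pass `dry_run=False` explicitly before anything is removed.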

Here’s the result of the rewrite:

The original procmail system had roughly 3800 lines in it (including comments and blank lines). The new maildrop system has under 550 lines, including comments and blanks. I delete more mail automatically, and in a week, haven’t deleted a single mail that I didn’t mean to. I am getting a few more emails sneaking into my inbox, but each day, I add a few more lines and the list gets shorter the next day.

Now that I am getting a bit more spam each day into my inbox, Thunderbird junk filters are getting more to train on, and they are getting better too, so even the junk that is getting in, is mostly getting filed in the Junk folder locally, automatically.

Here are two things that took me longer than they should have to figure out with maildrop (they are related, meaning the solution is identical in both cases, but it wasn’t obvious to me):

  • How to negate a test using !
  • How to use weighted scoring correctly (very simple in procmail)

Here’s a line in maildrop format:

if ($TESTVAR =~ /123/)
{
    # do something useful if true…
}

The above will “do something useful” if the variable TESTVAR contains the pattern 123. What if I want to “do something” if TESTVAR does not contain 123? Well, until I figured it out, I was making an empty block for “do something”, and adding an else for the thing I really wanted. Ugly.

My first attempt was to change the “=~” to “!~” (seemed obvious). Nope, syntax error. I then tried “if !($TESTVAR =~ /123/)”. Nope, syntax error. I then tried “if (!$TESTVAR =~ /123/)”. No syntax error, but it doesn’t do what I wanted.

I stumbled on the solution via trial and error:

if (!($TESTVAR =~ /123/))

Ugh. The ! can only be applied to an expression, which is normally (but not always?!?) enclosed in parens. But, the if itself requires an expression, so you need to put parens around the negated expression as well. At least I know now…

The second problem was weighted matches. I was having the same problem. Once I put parens around my expressions, it started working. That’s one of the few places where the procmail syntax feels a drop cleaner:

COUNT=(/123/:b,1)

COUNT=$COUNT+(/456/:b,1)

COUNT=$COUNT+(/789/:b,1)

echo $COUNT

So, the above sets the variable COUNT to the number of times that the string 123 exists in the body of the message. That is then added to the number of times that the string 456 exists in the body, finally adding the number of times that the string 789 exists in the body. The total is then echoed to the console. Without the parens, no workie.

I don’t like the fact that I have to maintain the running count myself. In procmail, you basically set a limit and the tests stop once the limit is reached (which feels way more efficient). There might be a way to accomplish that with maildrop too, but I haven’t found it as yet…
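To spell out the difference, here is a small Python model of the two approaches. It is not maildrop or procmail code, and it only mirrors my description above: `running_count` adds up every occurrence of every pattern, while `count_with_limit` stops testing as soon as a limit is reached. Both function names, the sample body, and the limit of 2 are made up for the example.

```python
import re
from typing import Iterable, Tuple


def running_count(body: str, patterns: Iterable[str]) -> int:
    """Add up every occurrence of every pattern in the body."""
    return sum(len(re.findall(pattern, body)) for pattern in patterns)


def count_with_limit(body: str, patterns: Iterable[str],
                     limit: int) -> Tuple[int, int]:
    """Stop once the total reaches limit; return (total, patterns tested)."""
    total = 0
    tested = 0
    for pattern in patterns:
        tested += 1
        total += len(re.findall(pattern, body))
        if total >= limit:
            break  # remaining patterns are never evaluated
    return total, tested


body = "order 123 and 123 again, ref 456"
patterns = ["123", "456", "789"]
print(running_count(body, patterns))        # 3
print(count_with_limit(body, patterns, 2))  # (2, 1)
```

With the limit, only the first of the three patterns is ever tested, which is the efficiency I miss from procmail.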

While I fully expect to add more rules, or lines to existing rules, I can’t imagine a scenario where my file will even double from here, so it will end up at less than 1000 lines. That will be easier to maintain for a number of reasons, most notably syntax readability.