Greylisting
From: https://www.zytrax.com/security/greylisting.html
Greylisting and its derivative Tar-Pit Techniques
We currently believe that Greylisting (and its derivatives) together with SPF
are the most effective techniques to fight the ever rising tide of random
incoming SPAM. DKIM represents another approach that a mail sender may use,
combined with some reputation system, to authenticate that outgoing mail from a
particular source is legitimate.
It is estimated that over 15 billion SPAM messages are sent every day. Some days
it used to feel like they all arrived in our mailboxes.
As the volume of spam rises, the anti-spam tools and content filters are
becoming increasingly aggressive such that the number of false positives is
becoming perilously high. We probably all know of at least one incident where
our genuine email either got stuck in a spam folder or was probably discarded as
junk.
Classic Solutions
The problem in fighting spam is finding a cure that is not worse than the
disease.
We have reviewed and rejected some potential solutions:
- Black lists: We refuse to implement a Black List because we feel it can too
easily penalise legitimate mail while doing very little to stop SPAM - your
SPAM-clogged mailboxes are witness to the total lack of effectiveness of Black
Lists. We were once the unwitting victim of a blacklisting which took less than
2 hours to fix when brought to our notice, but took two years for all the
effects to mostly disappear (though we still have one residual effect 5 years
later), so we feel the implementations on average are not production quality.
And have you ever tried to contact a mail administrator at a busy email vendor
about fixing their dumb blacklists that are years out of date (and those are
the good implementations)? On its own, Black Listing is a fatally flawed
technique. In combination with other measures (like using a reputable blacklist
source and updating it frequently) it can add value.
- Incoming Mail SPAM Filters: It is not up to us, nor should it be, to decide
what constitutes SPAM and what does not. One person's legitimate mail may be
another person's SPAM and vice versa. Without wishing to demean the quality of
spam-filtering software, especially the newer generation of Bayesian-based
filters, the technology relies on inspection of the mail content. This is a
very subjective matter and will inevitably lead to false positives, which is why
most such systems place suspected spam in a special folder. You still have to
check this material - much of it profoundly offensive. How effective is that?
Finally, the key point. Neither of these techniques has affected the rate at
which SPAM can be generated by the bad guys. They do not hurt spammers, they
simply inoculate the receiving site against the effects of the bad guys. Indeed,
in the case of SPAM filters, the good guys pay all the penalty. SPAM filters are
serious users of CPU resources and, paradoxically, use the most resources when
they pass good mail, since it must, by definition, pass all the CPU-intensive
filtering tests. Bottom line, what have all these passive on-site tests done to
help keep the wider world SPAM free? Nothing. Absolutely nothing.
So now let's get positive and look at what can be done.
Hurt the Bad Guys and Help all the Good Guys
The economics of SPAM are at best marginal. Any attempt to make those economics
worse is bound to work against the spammers. This, seemingly trivial, insight
has profound consequences in fighting spam and has led to a whole new battery of
techniques including Greylisting (credited to Evan Harris in a 2003 article).
In practical terms, this insight means that doing anything which causes the use
of additional resources will disproportionately affect spammers, with the happy
side-effect of reducing their capacity to send out spam. Greylisting
was the first of a family of techniques that have the following broad
characteristics:
- Cause more effort to be expended by the spammer, leaving them with fewer
resources to hit the good guys.
- Require tighter compliance with the specifications. A Good Thing™. Most
zombie mailers are, at best, trivial or marginal implementations of the mail
specifications. Simply rejecting mail that is not in full compliance is
surprisingly effective.
- Limit the rate at which spammers can send email, and thereby the volume of
spam that can be sent in any given period.
- Take more time to respond when the mail is from a known - or even suspected -
spam-source (here Black Lists can play a serious and useful role with no
downside risk). Implementations using this approach are generically referred to
as tar-pit techniques, which conjures up a wonderful image of all movement
slowing down. Rather than doing the obvious thing and immediately rejecting a
suspected SPAM source, as most software that uses Black Listing and other
similar techniques does, tar-pit software does exactly the opposite. It takes a
long, long, long time to send back the rejection (or any) message, with the
maximum delay between each character. The bad guys are stuck until the last
character arrives (even if they decide that the receiver is a tar-pit, they take
serious time to figure it out). And while the SPAM source is stuck communicating
with a tar-pit enabled system it can't be sending SPAM to someone else. Limiting
the bad guys to help all the good guys.
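The tar-pit idea in the last bullet can be illustrated with a few lines of
Python. This is only a minimal sketch under our own assumptions, not how any
particular tar-pit product works: the function name tarpit_reply, the
5 second default delay and the injectable sleep parameter are all hypothetical
choices made for illustration.

```python
import time
from typing import Callable, Iterator


def tarpit_reply(
    reply: bytes,
    delay_seconds: float = 5.0,  # assumed value, purely illustrative
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[bytes]:
    """Yield an SMTP reply one byte at a time, pausing before each byte.

    The caller would write each yielded byte to the client socket, so the
    sending side is held for len(reply) * delay_seconds in total.
    """
    for i in range(len(reply)):
        sleep(delay_seconds)      # stall the suspected spam source
        yield reply[i:i + 1]      # then release exactly one more byte
```

A server loop would iterate over tarpit_reply(...) and send each byte as it
appears. The sleep function is passed in so the behaviour can be checked
without actually waiting.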
Grey Listing
Grey listing is currently the most highly developed, and resource light, of
these techniques and is implemented on our server (using postgrey) where it has
had a dramatic effect. Currently over 90% of the SPAM load has gone. Period. Not
a SPAM filter in sight. No false positives.
Greylisting looks absolutely terrifying at first glance and works like this:
- Every time the mail server sees an email it constructs a unique triplet
consisting of the sender's email address, the recipient's email address and the
sending mail server's IP address. If the mail server has never seen this triplet
before it stores the information in a database - and then rejects the email
with a temporary failure message. Yes. It throws the email away, without looking
at its content, and will not accept a retransmission for a short period of time
(a blackout period), perhaps two minutes, that is normally configurable by the
mail server operator.
- The email RFCs specify that compliant mail servers MUST retry under
temporary rejection conditions using some form of delay back-off algorithm.
Legitimate mail servers will retry, normally in 5 to 15 minutes, automatically.
The sender of the email is not involved with the process and sees no effect of
this temporary rejection/retry policy. Spammers may also retry but typically do
so immediately and get caught in the blackout period. In any case, spammers have
no real incentive to retry because it consumes more resources. A marginal
business just got more marginal. And a single retry operation doubles the work
per delivered message, which has just reduced the spammer's total capacity by
50%. You read that number right - 50%. Not too shabby for a day's work.
- Once the retried mail has been received, normally after the 5 to 15 minute
retry delay interval, the mail server marks the mail source as valid and will
not throw away any more email from it for a period of time (4 to 6 weeks,
usually configurable by the mail server operator). The whole process is
self-regulating.
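The three steps above can be sketched in Python. This is a simplified
illustration under our own assumptions, not the code of postgrey or any other
implementation: the class name Greylist, the in-memory dictionaries (a real
server would use a persistent database) and the 35 day pass lifetime (picked
from inside the 4 to 6 week range) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Triplet = Tuple[str, str, str]  # (sender address, recipient address, client IP)


@dataclass
class Greylist:
    blackout_seconds: float = 120.0               # "perhaps two minutes"
    pass_lifetime_seconds: float = 35 * 86400.0   # assumed: about 5 weeks
    first_seen: Dict[Triplet, float] = field(default_factory=dict)
    passed_until: Dict[Triplet, float] = field(default_factory=dict)

    def check(self, sender: str, recipient: str, client_ip: str, now: float) -> str:
        """Return "accept" or "defer" for one delivery attempt at time `now`."""
        key = (sender, recipient, client_ip)

        expiry = self.passed_until.get(key)
        if expiry is not None:
            if now < expiry:
                return "accept"            # triplet already proved it retries
            # Pass has expired: forget the triplet and start over.
            del self.passed_until[key]
            self.first_seen.pop(key, None)

        first = self.first_seen.get(key)
        if first is None:
            self.first_seen[key] = now     # never seen: record and reject for now
            return "defer"
        if now - first < self.blackout_seconds:
            return "defer"                 # retried too soon, still in blackout

        # A retry after the blackout period: mark the source as valid.
        self.passed_until[key] = now + self.pass_lifetime_seconds
        return "accept"
```

In a mail server, "defer" would be turned into a temporary (4xx) SMTP reply so
that a compliant sender queues the message and retries later. Note that the
decision is made from the envelope alone, without looking at the content.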
It all sounds too good to be true. And unfortunately it is. Problems can arise
for legitimate email in three areas:
- The first time mail is received from a new source it will be delayed by 5
to 15 minutes. Thereafter it will arrive normally.
- Some mail servers, typically through poor implementation, can take a long
time (multiple hours) to retransmit - even though, with normal mail servers,
this will typically be 5 to 15 minutes.
- Some very big mail server farms may use a different sending IP address with
each retry - thus defeating the triplet mechanism.
There are a variety of implementation techniques that can both ameliorate the
initial delays and solve the problems identified above:
- Whitelists can be built to bypass checks from known good sources or domain
names.
- Many greylist implementations allow operators to set policies that will
permanently whitelist senders after receipt of a number of emails.
- The number of servers that have very long retries or use different sending
IP addresses is gradually being discovered and global whitelists are emerging.
However, since spammers could just use faked addresses from a whitelisted source
it is important that this technique is used in conjunction with SPF, which can
then catch this abuse. A classic 1-2 punch. No third strike required here.
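Two of the mitigations above, whitelisting known good sources and coping with
server farms that retry from a neighbouring IP address, can be sketched as
follows. This is a minimal illustration under our own assumptions: the function
names is_whitelisted and subnet_key are hypothetical, the /24 prefix is an
assumed choice that only makes sense for IPv4, and the addresses are
documentation examples. It does not perform the SPF check discussed above.

```python
import ipaddress
from typing import Iterable, Tuple


def is_whitelisted(client_ip: str, networks: Iterable[str]) -> bool:
    """True if client_ip falls inside any whitelisted network (CIDR strings)."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(net) for net in networks)


def subnet_key(
    sender: str, recipient: str, client_ip: str, prefix_len: int = 24
) -> Tuple[str, str, str]:
    """Build the triplet from the client's network rather than its exact IP.

    Retries arriving from another machine in the same /24 then match the
    original attempt instead of looking like a brand new source.
    """
    network = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    return (sender, recipient, str(network.network_address))
```

A server would consult is_whitelisted first and skip greylisting on a match,
otherwise use subnet_key(...) as the key under which the triplet is stored.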
Anti-spam is increasingly not a single technique, but rather a battery of
techniques. Serious work is being done in the area of email authentication (DKIM
from Yahoo and others) and other techniques are at the idea stage. Sure the
spammers will fight back, but if the problem can be made manageable then we have
made progress. Maybe.
We feel very self-righteous because our Anti-Spam strategy is helping you. Now
let's talk about what your Anti-SPAM technique is doing to help us......