[Techtalk] Spam solutions

jennyw jennyw at dangerousideas.com
Wed Aug 18 18:46:53 EST 2004


Spam is getting out of hand. Okay, it got out of hand a while ago, and now
I'm finally getting around to going to the next level. Just wondering what
interesting anti-spam solutions people have had good experiences with.

The main thing I do right now is leave my main e-mail account unfiltered,
look through it for good messages, then run a Perl script that connects to
my IMAP server and moves all the messages not on my approved address list to
a new folder, where, theoretically, I scan messages once more.  This way, I
should get at least one look at all my messages before filtering.  Before
this, I tried using spamassassin with my Perl script, too, but I found it
took way too long to run on the several hundred spam messages I get a day,
and it both produced false positives and missed a lot of spam. On my other
accounts (like my mailing list account, aka this
one) I use Procmail for filtering, and send all unrecognized e-mail to
spamassassin (same problem as with the Perl script, but without the
slowness). Of course, I also established a new e-mail account that doesn't
get any spam, but I kind of feel like that's giving up.
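
For anyone who wants to picture the approved-list pass, here is a rough
Python sketch of the idea. It is not the actual Perl script: the folder name
"Unapproved", the function names, and the choice to match on the From:
address only are all assumptions made for illustration.

```python
import imaplib  # only needed by callers that open a real connection
from email.parser import HeaderParser
from email.utils import parseaddr


def sender_address(raw_headers):
    """Return the lower-cased address from the From: header ('' if absent)."""
    msg = HeaderParser().parsestr(raw_headers)
    return parseaddr(msg.get("From", ""))[1].lower()


def is_approved(raw_headers, approved):
    """True if the message's sender is on the approved address list."""
    return sender_address(raw_headers) in approved


def move_unapproved(conn, approved, dest="Unapproved"):
    """Move INBOX messages from unapproved senders into `dest`.

    `conn` is assumed to be an already logged-in imaplib.IMAP4 (or
    IMAP4_SSL) object, and `dest` (a hypothetical folder name) is assumed
    to exist on the server. Returns the number of messages moved.
    """
    conn.select("INBOX")
    typ, data = conn.search(None, "ALL")
    moved = 0
    for num in data[0].split():
        # PEEK fetches the header without marking the message as seen.
        typ, parts = conn.fetch(num, "(BODY.PEEK[HEADER.FIELDS (FROM)])")
        raw = parts[0][1].decode("utf-8", "replace")
        if is_approved(raw, approved):
            continue
        # Copy first, and only flag the original once the copy succeeded.
        typ, _ = conn.copy(num, dest)
        if typ == "OK":
            conn.store(num, "+FLAGS", "\\Deleted")
            moved += 1
    conn.expunge()
    return moved
```

With a real server, this would be called as something like
move_unapproved(imaplib.IMAP4_SSL(host), approved) after logging in, where
host and the approved set are whatever applies to your own account.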

Right now I'm ready to try the Bayesian filters, but I haven't had much
experience with them and I've read several people say that spamassassin's
Bayesian filtering isn't great.
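
For readers who haven't used one, the basic idea behind a Bayesian filter
can be sketched in a few lines of Python. This is a deliberately simplified
naive Bayes illustration with add-one smoothing and equal priors; the class
and method names are made up here, and real filters use more refined
tokenizing and score-combining schemes than this.

```python
import math
import re
from collections import Counter


class TinyBayes:
    """Toy spam scorer: counts which tokens appear in spam vs. ham."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.totals = {"spam": 0, "ham": 0}  # messages trained per class

    @staticmethod
    def tokens(text):
        # Each distinct token counts once per message.
        return set(re.findall(r"[a-z0-9$']+", text.lower()))

    def train(self, text, label):
        """Record one message; `label` must be 'spam' or 'ham'."""
        self.totals[label] += 1
        self.counts[label].update(self.tokens(text))

    def spam_probability(self, text):
        """Estimate P(spam | tokens), assuming equal priors."""
        log_odds = 0.0
        for tok in self.tokens(text):
            # Add-one smoothing keeps unseen tokens from zeroing the score.
            p_spam = (self.counts["spam"][tok] + 1) / (self.totals["spam"] + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (self.totals["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so math.exp cannot overflow on very long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


f = TinyBayes()
f.train("cheap pills buy now", "spam")
f.train("buy cheap watches", "spam")
f.train("meeting notes for tuesday", "ham")
f.train("lunch on tuesday?", "ham")
print(f.spam_probability("buy cheap pills"))  # well above 0.5
print(f.spam_probability("tuesday lunch"))    # well below 0.5
```

The training step is why these filters need a decent pile of both spam and
good mail before their scores mean much.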

I'm now trying out the IMAP filter that comes with spambayes, which is
written in Python. The training takes forever (training 1,000 messages can
take more than a couple hours on my Athlon Thunderbird 800 MHz system).
Filtering takes a while, too, if you try to throw a couple thousand messages
at it, but for small numbers of messages, it's fairly speedy (it's odd -- it
seems like the time to process messages increases geometrically based on the
number of messages you throw at it at a given time).  I like that this is in
Python, because I'm starting to use Python for most of the stuff I used to
use Perl for, but I'm not sure about the efficacy of this solution yet,
and I can't find much info (aside from their site) that talks about
how well it works. Also, this is the only Bayesian filter I know of that has
an IMAP filter.

The other product I'm looking at is DSPAM. This looks really promising. The
problem is that I don't know how easy it will be to send messages to it for
training. If anyone has experience with it, I'd love to hear about it,
including how you decided to set it up.

FYI, I run Debian testing, postfix, courier-imap, and procmail. I use
fetchmail to pick up e-mail (so there probably wouldn't be a point to using
an RBL -- also, I'm not sure how much real e-mail those might block), but
there's one account I leave on the ISP server (in case my computer goes down), which
is the one I use the filtering Perl script for. The e-mail clients I use are
mutt (Linux), Mozilla Thunderbird (Mac OS X), and Outlook Express (Windows).

The greylisting that Andrea mentioned recently looks very interesting, too,
although I'd be curious how it works with things like TMDA (I've thought
about implementing TMDA, too, but I'm hesitant due to the potential
annoyance factor). If anyone has experience with greylisting, in particular
using postfix, that'd be great. I'd be open to switching to exim, too, but
I'd be a bit concerned about switching to sendmail (it scares me).
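
As a reference point for the greylisting discussion, the core decision can
be sketched in Python as below. This is only the in-memory logic under
stated assumptions (a 300-second delay, a triplet of client IP, sender, and
recipient, and made-up class and method names); it is not a Postfix
integration, which would need a separate policy service and persistent
storage.

```python
import time


class Greylist:
    """Defer the first attempt from an unknown triplet, accept later retries."""

    def __init__(self, delay=300, clock=time.time):
        self.delay = delay    # seconds a new triplet must wait (assumed value)
        self.clock = clock    # injectable so the logic can be tested
        self.first_seen = {}  # triplet -> time of first delivery attempt

    def check(self, client_ip, sender, recipient):
        """Return 'defer' (temporary SMTP failure) or 'accept'."""
        key = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Remember when this triplet first showed up.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.delay:
            return "accept"
        return "defer"
```

The bet is that a legitimate mail server queues the message and retries
after the delay, while a lot of spam software never comes back.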

Anyone want to share their adventures in spam?

Jen


