[Techtalk] Spam solutions

jennyw jennyw at dangerousideas.com
Mon Aug 23 17:23:29 EST 2004


On Sat, Aug 21, 2004 at 09:15:09AM -0600, Andrea Landaker wrote:
> With greylisting, the amount of spam that gets through is so small that
> it's been trivial to check the spam folder for false positives.  We're
> using postfix with a daemon called GLD (http://www.gasmi.net/gld.html).
> There are several other implementations, however, that you can see at
> http://greylisting.org/implementations/

Greylisting sure sounds great! In my case, though, my mail server gets its
mail using fetchmail, so it never accepts SMTP connections itself and has
nothing to greylist. I also have one account that I leave on the host (I
figure if I'm traveling or something this is better; I don't have to worry
if power goes out at home or if my Internet connection goes down). I've
been thinking about getting a colo server, though -- if I do that, then
I'll almost certainly use greylisting.
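
For anyone who hasn't looked at how greylisting decides what to let
through, here is a rough sketch in Python. This is not GLD's code or any
real implementation; the Greylist class, the five-minute delay and the
in-memory dictionary are all made up for illustration. The idea is just
that the first attempt from an unknown (client IP, sender, recipient)
triplet gets a temporary failure, and a retry after the delay is accepted.

```python
import time


class Greylist:
    """Toy greylist keyed on (client IP, envelope sender, envelope recipient).

    Hypothetical illustration only; real daemons keep this in a database
    and also expire old entries.
    """

    def __init__(self, delay_seconds=300, clock=time.time):
        self.delay_seconds = delay_seconds  # assumed value, tune to taste
        self.clock = clock                  # injectable so it can be tested
        self.first_seen = {}                # triplet -> time of first attempt

    def check(self, client_ip, sender, recipient):
        """Return 'defer' (SMTP temporary failure) or 'accept'."""
        triplet = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Remember the first time we saw this triplet.
        first = self.first_seen.setdefault(triplet, now)
        if now - first < self.delay_seconds:
            return "defer"
        return "accept"
```

A real mail server retries after a temporary failure, so legitimate mail is
only delayed; a lot of spam software never retries.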

After looking at a bunch of different spam packages, I decided I really
wanted to try a Bayesian filter.  I looked at a few, and the most promising
were DSPAM and SpamBayes.  I decided to try SpamBayes first mainly because
it has IMAP support, but I also like that it has an unsure folder and also
that it's written in Python (slower than C, but probably quicker to make
programming changes in).

I expected pretty good results after reading materials on Bayesian filters,
but I didn't really believe that until I put SpamBayes into action. What a
difference! After initial training (that included feeding it a lot of old
good mail and spam) I have no false positives in the spam folder so far,
very few spam messages in my inbox, and very few e-mails in my unsure
folder.
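
To give a feel for why training matters and where the unsure folder comes
from, here is a very small Bayesian classifier in Python. It is only a
sketch: SpamBayes uses a more sophisticated way of combining word
probabilities, and the TinyBayes class, the tokenizer and the 0.2 / 0.9
cutoffs below are my own assumptions for illustration.

```python
import math
import re
from collections import Counter


class TinyBayes:
    """Naive Bayes over word presence, with a three-way ham/unsure/spam split."""

    def __init__(self, ham_cutoff=0.2, spam_cutoff=0.9):
        self.ham_cutoff = ham_cutoff      # assumed cutoffs, not tuned
        self.spam_cutoff = spam_cutoff
        self.counts = {"ham": Counter(), "spam": Counter()}
        self.messages = {"ham": 0, "spam": 0}

    @staticmethod
    def tokens(text):
        # Each distinct word counts once per message.
        return set(re.findall(r"[a-z0-9$'-]+", text.lower()))

    def train(self, text, label):
        """label is 'ham' or 'spam'."""
        self.messages[label] += 1
        self.counts[label].update(self.tokens(text))

    def score(self, text):
        """Return an estimated spam probability between 0 and 1."""
        log_odds = 0.0  # equal prior for ham and spam
        for tok in self.tokens(text):
            # Add-one smoothing so unseen words stay neutral.
            p_spam = (self.counts["spam"][tok] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        log_odds = max(-50.0, min(50.0, log_odds))  # avoid overflow in exp()
        return 1.0 / (1.0 + math.exp(-log_odds))

    def classify(self, text):
        s = self.score(text)
        if s >= self.spam_cutoff:
            return "spam"
        if s <= self.ham_cutoff:
            return "ham"
        return "unsure"
```

A message made up of words the filter has never seen scores 0.5 and lands in
unsure, which is why feeding it plenty of old good mail and spam up front
makes such a difference.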

The IMAP support is great for my hosted account, where I use the IMAP program
to classify and train (run from cron on my server every 15 minutes). For my
local accounts (that use fetchmail), I call SpamBayes from Procmail to
classify, but train using IMAP. This means submitting new spam is as easy as
moving messages to a folder. IMAP is pretty slow, but I hear improvements
are on the way (I'm also thinking about adding changes of my own).
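
The delivery side is simple once the filter has tagged a message. Below is
a sketch in Python of the decision my Procmail rules make; it is not the
actual recipe. It assumes the filter has already added an
X-Spambayes-Classification header whose value starts with ham, spam or
unsure, and the folder names and the folder_for function are hypothetical.

```python
from email import message_from_string

# Hypothetical folder names; use whatever your mail client shows.
FOLDERS = {"spam": "spam", "unsure": "unsure", "ham": "inbox"}


def folder_for(raw_message, header="X-Spambayes-Classification"):
    """Pick a delivery folder from the classification header.

    Assumes the header value looks like 'spam' or 'spam; 0.99'.
    Messages with a missing or unknown verdict go to the inbox.
    """
    msg = message_from_string(raw_message)
    value = msg.get(header, "")
    verdict = value.split(";")[0].strip().lower()
    return FOLDERS.get(verdict, "inbox")
```

Defaulting to the inbox means a filter failure shows up as extra spam rather
than as lost mail.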

On Wed, Aug 18, 2004 at 07:55:10PM -0700, Robert Wichert wrote:
> OK, first of all, this involves both Linux and Windows.

Vaguely -- I want to filter on the server. I only mentioned Windows because
I'd like an easy way to be able to submit good and spam messages to a
Bayesian filter. The setup that I have now with SpamBayes works great
because I can just copy an email into an IMAP folder.  However, if I were to
use a different filter, I might need to do something else to submit e-mails
to it, and I just wanted to make sure that it would work whether I'm using
mutt or Outlook Express.

One day, there will be great speech recognition for Linux or Mac OS X, and I
won't need to use Windows quite as much (RSI pains).


On Sun, Aug 22, 2004 at 10:56:13AM +1000, Kathryn Andersen wrote:
> I use maildrop instead of procmail because it is *so* much simpler to
> configure.  With procmail, I'd set up a recipe and then look at it six
> months later and wonder what it means.  With maildrop, it's much more
> straightforward, with actual things like "if"!

I keep thinking about switching to maildrop -- it certainly sounds a lot
better than Procmail.  However, I probably won't get around to doing that
until I need to revise my rules quite a bit.  It's just not broken enough yet.

> I call spamassassin from maildrop, because even though it isn't that
> fast, it's still very good at nabbing spam.

I used to do that, too.  Now I call SpamBayes instead -- the performance
seems to be about the same as SpamAssassin, but the results are much, much
better.

> Last in line is TMDA, but I don't use all the features.  Contrary to
> popular belief, one *can* use TMDA without being obnoxious, and even
> without whitelisting.  I do have a whitelist -- made up on my initial
> install from addresses in my saved folders -- but that isn't my only use
> for it.

The way you describe using TMDA sounds great!  When I first read your
message, I thought that I should also try that, but after initial training
SpamBayes has turned out to work so well that I'm thinking about just
leaving things as they are.  After a few weeks, we'll see how well it's
keeping up.

Thanks everyone for the advice! If anyone has any other thoughts on spam,
I'd love to hear those, too.

Jen


