[Techtalk] Re: Blog spam

Meredith L. Patterson mlp at thesmartpolitenerd.com
Wed Oct 13 08:42:23 EST 2004


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> Recently, my blog has begun getting inundated with
> comment spam.

This topic came up a while ago on the hashcash developers' list,
actually. Hashcash is a "proof-of-work token" technique which attempts
to make it costlier (in a number-of-cpu-cycles sense) for people to send
out large volumes of email. The idea is that for every email sent, the
sender must compute and add to the headers a string consisting of some
message-specific info (the recipient's address, a timestamp, &c.) plus
tacked-on random characters, such that the SHA1 hash of the string has a
partial collision with the all-zero string (i.e., has a certain number of
leading zero bits; the number of leading zero bits is the "value" of the
stamp). The recipient can verify the validity of a stamp quickly, but
every stamp must be generated by brute force, and thus every email takes
a few extra seconds to send, which of course really adds up for
spammers. A user could then configure his MUA, or a server
admin her MTA, to reject email that doesn't carry a stamp -- so spammers
can either have their volume drastically reduced because they have to
stamp, or watch all their non-stamped spam vanish into the ether.

(Yes, it's one of those "this would only really work if everyone did it"
solutions, but it's full of fun math. More details at www.hashcash.org.)
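
For the curious, here is a rough Python sketch of the mint/verify idea
described above. It is only loosely modelled on the colon-separated stamp
layout (version, bits, date, resource, extension, random salt, counter);
the function names, the 16-bit default and the salt length are my own
assumptions for illustration, not the reference implementation, and it
assumes the resource string contains no colons.

```python
import hashlib
import itertools
import os
import time


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # For a nonzero byte, 8 - bit_length() is its number of leading zeros.
        bits += 8 - byte.bit_length()
        break
    return bits


def mint(resource: str, bits: int = 16) -> str:
    """Brute-force a stamp whose SHA1 hash starts with `bits` zero bits."""
    date = time.strftime("%y%m%d", time.gmtime())
    salt = os.urandom(8).hex()  # tacked-on random characters
    prefix = f"1:{bits}:{date}:{resource}::{salt}:"
    for counter in itertools.count():
        stamp = prefix + format(counter, "x")
        digest = hashlib.sha1(stamp.encode()).digest()
        if leading_zero_bits(digest) >= bits:
            return stamp


def verify(stamp: str, resource: str, bits: int = 16) -> bool:
    """Cheap check: one hash, however long the stamp took to mint."""
    fields = stamp.split(":")
    if len(fields) != 7 or fields[3] != resource:
        return False
    digest = hashlib.sha1(stamp.encode()).digest()
    return leading_zero_bits(digest) >= bits
```

Minting needs about 2**bits hash attempts on average, while verify()
needs exactly one, which is the whole asymmetry. A real deployment would
also want to check the date field and remember stamps it has already
seen.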

Anyway: A fellow named Mitch Denny brought up the idea of using hashcash
for blog comment submission forms. His original idea was to implement it
in JavaScript, and he's got a post on it at
http://notgartner.com/posts/572.aspx. I'm not
certain whether he's finished that JS implementation or not, but another
idea which came to mind was doing it in XSLT. I know XSLT is mud-slow,
but as I understand it, every modern browser can parse it, and it's
sandboxed. I've been thinking about doing such an implementation myself,
though I know very little about XSLT, and while I know it's
Turing-complete, it seems that most books on it are written for an
audience which is focused on presentation techniques rather than
algorithm implementation.
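
Whatever language ends up minting the stamp in the browser, the blog
still has to check it when the comment form is submitted. A minimal
sketch of that server-side check, in Python, might look like the
following. Everything here is hypothetical: accept_comment, the
SEEN_STAMPS set, the use of a post identifier as the stamp's resource
field, and the two-day freshness window are assumptions of mine, not
anything from Mitch's post.

```python
import datetime
import hashlib

SEEN_STAMPS = set()  # hypothetical in-memory store; a real blog would persist it


def zero_bits(stamp: str) -> int:
    """Number of leading zero bits in the SHA1 hash of the stamp."""
    value = int.from_bytes(hashlib.sha1(stamp.encode()).digest(), "big")
    return 160 - value.bit_length()


def accept_comment(stamp: str, post_id: str, bits: int = 16,
                   max_age_days: int = 2) -> bool:
    """Accept a comment only if its stamp is valid, fresh and unused."""
    fields = stamp.split(":")
    if len(fields) != 7 or fields[3] != post_id:
        return False  # malformed, or minted for a different post
    try:
        stamp_date = datetime.datetime.strptime(fields[2], "%y%m%d").date()
    except ValueError:
        return False
    today = datetime.datetime.now(datetime.timezone.utc).date()
    age = (today - stamp_date).days
    if age < -1 or age > max_age_days:
        return False  # too old, or dated too far in the future
    if zero_bits(stamp) < bits:
        return False  # not enough work was done
    if stamp in SEEN_STAMPS:
        return False  # each stamp may be spent only once
    SEEN_STAMPS.add(stamp)
    return True
```

Tying the stamp to the post and refusing repeats means a spammer cannot
compute one stamp and reuse it across every comment form on the site.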

(This stands to reason, of course. I mean, sed is also Turing-complete,
but all the books on sed talk about using it to edit text, because
that's what it was intended for. Still, if anyone knows of any good
resources on doing serious algorithm implementation in XSLT, I'd love to
hear about them.)

I know this is a bit far afield from the original request, but I hope it
gives some ideas. :)

Cheers,
Meredith L. Patterson
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBbTC/srMg4RkUokIRAvW8AJ9iuRcBkEWLkzF8XN++YstgsaK6TwCcCUL7
D8Bd16Mzm9zfw4+v+w6PGmg=
=O1I1
-----END PGP SIGNATURE-----

