Dan’s Blog

Food, Linux and Maybe Politics

Installing DSPAM on DreamHost (Nasty Kludge)

My favorite spam filter is, by far, DSPAM. It’s pure bayesian, so rather than relying on a bunch of blocked IPs and dirty words, it makes decisions based only on what you’ve previously tagged as spam. Crucially, this means that you still get the newsletters that you’ve opted in to.

I’ve deployed it as a SpamAssassin replacement with tremendous results. The problem with SpamAssassin’s bayesian filtering — maybe they’ve fixed this, I don’t know — is that it would, after a while, tag one or two spam messages as ham and then autolearn them as ham. That sort of behavior snowballs if uncorrected, and there wasn’t an easy way to correct that behavior in a sitewide deployment with POP users. So snowball it did, and pretty soon the fact that ~70% of our correctly addressed incoming mail (and well over 99% of all mail) was spam meant that people were getting a spam:ham ratio of about 1:1. At that point, email is unusable.

The thing that made DSPAM a strong replacement was that users can decide what is and what isn’t spam, so your bayes db doesn’t get poisoned. Using DSPAM, you can tag things as spam in one of two ways: either by forwarding emails on to a convenience address, or by dropping them into a folder set up for that purpose.

After getting the kinks out of the DSPAM deployment — you’d do well to have a separate mySQL server — it hums and runs with well over 99% accuracy with basically no intervention on the part of admins, and little intervention on the part of users.

Sadly, the lovely DreamHost, where this blog is hosted, only offers SpamAssassin. Then I found a howto for deploying DSPAM on Dreamhost, and got on my merry way. Everything went great through the corpus training, but as soon as I’d edited my .procmailrc, I ran into a terrible, terrible problem. Mail wasn’t getting filtered, and I was getting the following in my procmail log:

/home/dcheck/usr/bin/dspam: /lib/libc.so.6: version `GLIBC_2.3' not found (required by /home/dcheck/usr/bin/dspam)
procmail: Error while writing to "/home/dcheck/usr/bin/dspam"
procmail: Rescue of unfiltered data succeeded

Oops! Turns out, I think, that DreamHost NFS-mounts your Maildir on its mail server, and over there, it runs an incompatible version of glibc. So when procmail is invoked on the mail server, and then it tries to run the dspam binary built by and for the web server, dspam is unable to find the version of glibc that it needs, and the whole enterprise fails badly.
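If you want to confirm you’re hitting the same wall, the log line itself names the glibc version the binary wants. Here’s a rough sketch that pulls it out; required_glibc is a name I made up for illustration, and it assumes your grep supports -o (GNU and BSD grep both do).

```shell
#!/bin/sh
# required_glibc: read loader errors on stdin, print each distinct
# GLIBC_x.y version that some binary asked for and could not find.
# (Hypothetical helper name; not part of procmail or DSPAM.)
required_glibc() {
    grep -o 'GLIBC_[0-9.]*' | sort -u
}
```

Feed it your procmail log, for example with required_glibc < procmail.log (use whatever path your LOGFILE= setting points at). If binutils happens to be installed, objdump -T on the dspam binary piped through grep GLIBC_ should show the same version requirement from the other side.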

Short version: you need to build the dspam binary on the mail server. I don’t have proper shell access to the mail server, but I do have procmail. And procmail is willing to run any script — as a filter — on the mail server, in the mail server environment. So the solution is pretty easy: write a “filter” to be run on the mail server that will configure, build and install DSPAM. And also mysql, because, for obvious reasons, mysql isn’t installed on the mail server.
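The whole trick hinges on one property of a procmail filter: whatever the script writes to stdout becomes the message, so the script has to hand the mail back untouched and keep its own chatter out of the way. A minimal sketch of that pattern, assuming nothing beyond a POSIX shell (run_as_filter is a made-up name, not something procmail provides):

```shell
#!/bin/sh
# run_as_filter LOGFILE COMMAND [ARGS...]
# Echo the message on stdin back to stdout unchanged, then run COMMAND
# with all of its output sent to LOGFILE so none of it leaks into the mail.
# (Hypothetical helper for illustration only.)
run_as_filter() {
    log=$1
    shift
    cat -                    # the message goes back to procmail as-is
    "$@" >> "$log" 2>&1      # the real work happens as a side effect
}
```

Anything you can put on a command line can ride along this way, which is why a compile job works just as well as a spam filter.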

So now, finally, a supplement to the standard installation guide. These are the steps to add after “Training” and before “Starting Filtering”:

Step 1: Download mySQL 4.1.X and DSPAM 3.X.X

Visit Nuclear Elephant and mySQL for the latest source releases. Be sure to get the mySQL 4.1 release; this is what DreamHost runs, and we’re going to need compatible libraries.

Step 2: Untar them in $HOME/src

That’s:

$ cd
$ mkdir src
$ cd src
$ tar xvzf <package names>

… where <package names> are the names of the tarballs you’ve downloaded.

Step 3: Create the install script

Mine is below. You’ll need to change the directory names to match the versions of the files you downloaded.

#!/bin/sh

# Pass the incoming message straight through so procmail gets it back.
cat -

# Build mySQL first; DSPAM needs its headers and client libraries.
# (Comment out the three build lines on later runs to skip the rebuild.)
cd $HOME/src/mysql-4.1.15/

./configure --prefix=$HOME/usr-mail >> $HOME/mysql.log
make >> $HOME/mysql.log
make install >> $HOME/mysql.log

# Build DSPAM against the mySQL just installed under $HOME/usr-mail.
cd $HOME/src/dspam-3.6.1/

./configure --with-dspam-home=$HOME/.dspam \
--with-userdir-owner=none --with-userdir-group=none \
--with-dspam-mode=none --with-dspam-owner=none \
--prefix=$HOME/usr-mail --enable-delivery-to-stdout \
--with-mysql-includes=$HOME/usr-mail/include/mysql \
--with-mysql-libraries=$HOME/usr-mail/lib --with-storage-driver=mysql_drv \
--enable-signature-headers >> $HOME/dspam-build.log

make >> $HOME/dspam-build.log
make install >> $HOME/dspam-build.log

Make this file ~/bin/install-dspam and chmod +x ~/bin/install-dspam.
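One weakness of a script like this is that it keeps going after a failure, so a broken configure is followed by a doomed make. If you’d rather it stop at the first problem, something along these lines should do it. This is only a sketch, and build_step is a helper name I invented:

```shell
#!/bin/sh
# build_step LOGFILE COMMAND [ARGS...]
# Run one build command, append its stdout and stderr to LOGFILE, and
# return non-zero (after noting the failure in the log) if it fails.
# (Hypothetical helper for illustration only.)
build_step() {
    log=$1
    shift
    echo "== $*" >> "$log"
    "$@" >> "$log" 2>&1 || {
        echo "FAILED: $*" >> "$log"
        return 1
    }
}
```

Chain the steps with &&, as in build_step $HOME/dspam-build.log make && build_step $HOME/dspam-build.log make install, and the log will end with a FAILED line naming the step that broke.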

Step 4: Add a procmail recipe

Open up .procmailrc in your favorite editor and add the following after all the VARIABLE= lines:

:0Hfw
*^Subject: Install DSPAM
| $HOME/bin/install-dspam 2> $HOME/error.log

(The :0 means it’s a recipe; the H means that it should check the headers for the regex on the next line; the f means pipe the message through the command as a filter; the w means wait for it to exit and report on what happened in the procmail logfile.)
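To get a feel for what that condition matches without sending real mail, you can imitate it in plain shell. The sketch below is my approximation of the header check, not procmail’s actual code: it looks only at the headers (everything up to the first blank line) and ignores case, which is procmail’s default unless you add the D flag. The function name is invented.

```shell
#!/bin/sh
# subject_triggers_install: read a mail message on stdin and succeed only
# if its headers contain a line starting with "Subject: Install DSPAM".
# (Hypothetical helper; a rough stand-in for the recipe's condition.)
subject_triggers_install() {
    # sed stops printing at the first blank line, i.e. the end of the headers
    sed '/^$/q' | grep -qi '^Subject: Install DSPAM'
}
```

Note that the match is on the subject alone, so anybody who sends you mail with that subject line will kick off a build for as long as the recipe is in place.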

Step 5: Email yourself

… with the subject line “Install DSPAM”. This, and only this, will trigger the install script.

Step 6: Wait

At some point the mail message you sent will be delivered to you; that’s a sign that the compilation is done. Check the various logfiles (~/error.log, ~/mysql.log, ~/dspam-build.log), and if everything looks basically kosher, you should have a new ~/usr-mail directory with binaries for both mySQL and DSPAM that will run on the mail server.
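A quick way to check the result from the web server shell is to test for the binaries directly. This is a sketch under a couple of assumptions: check_install is a name I made up, and bin/dspam and bin/mysql are where I’d expect the two builds to put their main binaries under the prefix.

```shell
#!/bin/sh
# check_install PREFIX RELATIVE_PATH...
# Report whether each expected file exists under PREFIX and is executable.
# Returns non-zero if any of them is missing.
# (Hypothetical helper for illustration only.)
check_install() {
    prefix=$1
    shift
    status=0
    for rel in "$@"; do
        if [ -x "$prefix/$rel" ]; then
            echo "ok: $prefix/$rel"
        else
            echo "missing: $prefix/$rel"
            status=1
        fi
    done
    return $status
}
```

Run it as check_install $HOME/usr-mail bin/dspam bin/mysql. Keep in mind that a binary built for the mail server may refuse to run on the web server for the same glibc reason in reverse, so checking that the file exists is about as far as you can go from there.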

Step 7: Add procmail filtering

Remove the “Install DSPAM” lines and replace them with:

# Begin spam treatment.
:0fw
| $HOME/usr-mail/bin/dspam --stdout --deliver=innocent,spam --mode=teft \
--feature=chained,noise,whitelist --user $USER

:0
*^X-DSPAM-Result: Spam
.Spam/
# End spam treatment.

Now return to the other guide and finish up by adding correction! Note that any time dspam is invoked on the mail server, you will need to invoke it as $HOME/usr-mail/bin/dspam, and any time you invoke dspam on the web server (i.e., via cron) it should still be invoked as $HOME/usr/bin/dspam.
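If keeping track of two paths gets tedious, a small helper can pick whichever build actually runs on the current machine. This is a sketch, not something I’ve deployed: first_runnable is an invented name, and it assumes the binary exits successfully when given --version, which a binary linked against the wrong glibc will not.

```shell
#!/bin/sh
# first_runnable CANDIDATE...
# Print the first candidate that runs successfully with --version and
# return 0; return 1 if none of them do.
# (Hypothetical helper for illustration only.)
first_runnable() {
    for candidate in "$@"; do
        if "$candidate" --version >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}
```

In a cron script that might look like DSPAM=$(first_runnable $HOME/usr-mail/bin/dspam $HOME/usr/bin/dspam) followed by a check that the variable is not empty before using it.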

Congratulations! You’ve got hyper-accurate spam filtering via a nasty kludge!

6 Comments so far

  1. w3 January 20th, 2006 3:57 am

    Hiya, just wanted to check how things were going with DSPAM on your dreamhost mail account before I knuckled into installing it for my mail account? Any problems/issues worth noting? I think I’ll try the standard how-to first in case Dreamhost have ‘fixed’ anything in the meantime. Down the line I will probably try to modify the installation to work with multiple accounts. I don’t understand enough about how Dreamhost mail routing works but I’d appreciate any heads-up on this if you’ve had any need.

  2. era May 5th, 2006 10:59 am

    Trying very, very, very hard to resist the temptation to send you an email with Subject: Install DSPAM just to see what might happen if I do it now. I do hope you have removed that from your .procmailrc after you ran it. (^:

    Harr, “Please enter a valid email address”. I already did! Welp, you’ll get billg@microsoft.com instead then.

  3. Stephen Galliver June 16th, 2006 11:34 am

    FWIW, as of June 2006, with DSPAM 3.6.8, procmail seems to be happily running the dspam executable as compiled on the web server. I did not have to follow the instructions on this page to compile dspam on the mail server.

    – sdg

  4. Dave November 16th, 2007 6:12 pm

    This isn’t right:

    3370: [11/16/2007 15:11:34] Unable to open file for reading: /mnt/r…r/vol/boot/s…y/m…w/d…e/usr-mail/etc/dspam.conf: No such file or directory
    3370: [11/16/2007 15:11:34] Unable to read dspam.conf

  5. kchr January 18th, 2008 3:25 am

    This is a really, really clever way to get things done on shell-less hosting accounts.

    Kudos!

  6. copyright Ukraine May 6th, 2010 11:32 am

    Thanks for taking this opportunity to discuss this, I feel fervently about this and I take pleasure in learning about this topic. If possible, as you gain information, please update this blog with more information. I have found it very useful.
