SpamAssassin

From Nuclear Physics Group Documentation Pages
Revision as of 21:16, 23 September 2014 by Maurik (talk | contribs) (→‎AutoWhiteList)

Setup

We are using a fairly standard SpamAssassin setup, close to the default. Any variations from default MUST be noted here. Spam is getting out of hand, so the most basic setup is no longer sufficient.

Basic

A reference in /etc/postfix/master.cf tells the mail system to pass mail through SpamAssassin, i.e. the "spamd" daemon.
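The exact master.cf entry is not reproduced on this page, so the following is only a sketch for finding it on the mail server. The function name find_spam_filter_lines is made up for illustration, and the search pattern (spamc, spamd, spamassassin) is an assumption about how the entry is spelled.

```shell
#!/bin/sh
# Hypothetical helper: list the non-comment lines in master.cf that
# mention SpamAssassin. The pattern is an assumption, adjust as needed.
find_spam_filter_lines() {
    # $1: path to master.cf (defaults to the path named on this page)
    grep -n -i -E 'spam(c|d|assassin)' "${1:-/etc/postfix/master.cf}" |
        grep -v -E '^[0-9]+:[[:space:]]*#'
}

# Only run against the real file when it is readable on this machine.
if [ -r /etc/postfix/master.cf ]; then
    find_spam_filter_lines || echo "no SpamAssassin reference found" >&2
fi
```

Each match is printed with its line number, which makes it easy to jump to the entry in an editor.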

You can check that the SpamAssassin configuration has no errors with:

spamassassin --lint

To make sure it is tagging spam properly, run a test message through it:

spamassassin -D < /usr/share/doc/spamassassin-3.3.1/sample-spam.txt

There is also a non-spam sample file (sample-nonspam.txt) in the same directory. Note that -D produces a large amount of debugging output and is not needed to test basic functionality.
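For a quick pass/fail answer without reading the debug output, the check can be scripted. This is a minimal sketch: is_flagged_spam is a hypothetical helper, and it assumes that SpamAssassin marks spam with an "X-Spam-Flag: YES" header, which is the header the sieve filter on this page relies on.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if the message on stdin carries an
# "X-Spam-Flag: YES" header, exit 1 otherwise.
is_flagged_spam() {
    # Only the header block is examined; it ends at the first empty line.
    awk '
        /^$/ { exit }
        tolower($0) ~ /^x-spam-flag:[ \t]*yes/ { found = 1; exit }
        END { exit (found ? 0 : 1) }
    '
}

# Run the sample through SpamAssassin (without -D) when both are available.
sample=/usr/share/doc/spamassassin-3.3.1/sample-spam.txt
if command -v spamassassin >/dev/null 2>&1 && [ -r "$sample" ]; then
    if spamassassin < "$sample" | is_flagged_spam; then
        echo "sample spam was flagged"
    else
        echo "sample spam was NOT flagged" >&2
    fi
fi
```

Running the non-spam sample through the same pipeline should take the "NOT flagged" branch.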

Detailed info on spamassassin setup is found at: http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Conf.html

Sieve

For spam filtering to work, each user needs a sieve script that directs the spam somewhere else. The most basic .sieve script is:

#  A simple spam filter
#
require "fileinto";

#  Move messages with an "X-Spam-Flag: YES" header
#  into the "INBOX.SPAM" folder.
if header :contains "X-Spam-Flag" "YES" {
	fileinto "INBOX.SPAM";
}

Plugins

SpamAssassin plugins are found in: /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin

AutoWhiteList

See: http://wiki.apache.org/spamassassin/ManualWhitelist

The old whitelist plugin was called AWL; the new one, which we use as of September 2014, is TxRep. It automatically records senders of spam in a blacklist and senders of non-spam (i.e. ham) in a whitelist, and adjusts message scores based on the sender's reputation.

It is turned on in init.pre:

# Learning Module: see http://truxoft.com/resources/txrep.htm
loadplugin Mail::SpamAssassin::Plugin::TxRep

It also needs some options in local.cf:

#
# For the TxRep module
#
header         TXREP   eval:check_senders_reputation()
describe       TXREP   Score normalizing based on sender's reputation
tflags         TXREP   userconf noautolearn
priority       TXREP   1000

Currently it is set up to store the spam/not-spam lists in each user's home directory, which is also true for the Bayes analysis. The user MUST have a directory ".spamassassin". To initialize the files in that directory, run:

sa-learn --sync   # For Bayes.
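Since the ".spamassassin" directory is required, it can be worth making sure it exists before initializing. The sketch below is not part of our setup: ensure_sa_dir is a hypothetical helper, and the 700 permissions are an assumption rather than a documented requirement.

```shell
#!/bin/sh
# Hypothetical helper: create the per-user .spamassassin directory if it
# is missing, then print its path.
ensure_sa_dir() {
    # $1: home directory to use (defaults to $HOME)
    dir="${1:-$HOME}/.spamassassin"
    if [ ! -d "$dir" ]; then
        # Mode 700 is an assumption: the files are private to the user.
        mkdir -p "$dir" && chmod 700 "$dir"
    fi
    printf '%s\n' "$dir"
}

# Example use (commented out so that sourcing this file has no side effects):
#   ensure_sa_dir >/dev/null && sa-learn --sync
```

The function is safe to call repeatedly; an existing directory is left untouched.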

To check that the Bayes database exists, dump its contents:

sa-learn --dump
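The full dump can be long. If all you want to know is how many messages have been learned, the summary lines are enough. This sketch assumes the usual "sa-learn --dump magic" layout, where the count is the third column of the lines ending in "non-token data: nspam" and "non-token data: nham"; bayes_count is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: read `sa-learn --dump magic` output on stdin and
# print the learned-message count for the given key ("nspam" or "nham").
# Assumes the count is the third whitespace-separated column.
bayes_count() {
    awk -v key="$1" '
        $0 ~ ("non-token data: " key "$") { print $3; found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Report both counts when sa-learn is installed on this machine.
if command -v sa-learn >/dev/null 2>&1; then
    dump=$(sa-learn --dump magic 2>/dev/null) || dump=""
    for key in nspam nham; do
        count=$(printf '%s\n' "$dump" | bayes_count "$key") || count="unknown"
        echo "$key: $count"
    done
fi
```

By default SpamAssassin does not use Bayes scores until it has learned at least 200 spam and 200 ham messages, so low counts here mean more training is needed.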

To train it from an IMAP folder called SpamLearn that contains spam messages:

fetchmail -a -n --folder SpamLearn  einstein.unh.edu -m "sa-learn --spam --single"

Add --keep to the fetchmail line if you don't want these messages to be deleted automatically.

Similarly, you can teach it what good email looks like with:

fetchmail -a -n --folder GoodMail --keep  einstein.unh.edu -m "sa-learn --ham --single"
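The two training commands differ only in the folder, the --spam/--ham switch and --keep, so a small wrapper can reduce typos. This is a sketch, not something installed on our systems: train_cmd is a hypothetical helper that only prints the command, using exactly the options shown above, so you can review it before any mail is fetched or deleted.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the fetchmail training command.
#   train_cmd spam|ham FOLDER [keep]
# Assumes folder names without spaces or shell metacharacters.
train_cmd() {
    kind=$1
    folder=$2
    keep=${3:-}
    case "$kind" in
        spam|ham) ;;
        *) echo "usage: train_cmd spam|ham FOLDER [keep]" >&2; return 2 ;;
    esac
    # A non-empty third argument adds --keep so messages stay on the server.
    printf 'fetchmail -a -n --folder %s %seinstein.unh.edu -m "sa-learn --%s --single"\n' \
        "$folder" "${keep:+--keep }" "$kind"
}

# Show both commands from this page.
train_cmd spam SpamLearn
train_cmd ham GoodMail keep
```

Once the printed line looks right, it can be pasted into the shell or run with eval "$(train_cmd ham GoodMail keep)".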

Important Note

SpamAssassin needs a user account named spamd, and it must exist both as a local account and in the LDAP database.
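A rough way to check the account is sketched below. has_local_account is a hypothetical helper that looks only at the local passwd file. The getent call shows whether the name resolves through NSS at all, but it does not say which source answered, so the LDAP entry still has to be confirmed with your usual LDAP tools.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if the account has an entry in the local
# passwd file, exit 1 otherwise.
has_local_account() {
    # $1: account name, $2: passwd file (defaults to /etc/passwd)
    grep -q "^$1:" "${2:-/etc/passwd}"
}

if has_local_account spamd; then
    echo "spamd: local account present"
else
    echo "spamd: no local account" >&2
fi

# getent consults every source configured in nsswitch.conf, local or not.
if command -v getent >/dev/null 2>&1; then
    if getent passwd spamd >/dev/null; then
        echo "spamd: resolves through NSS"
    else
        echo "spamd: does not resolve through NSS" >&2
    fi
fi
```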