SpamAssassin
Latest revision as of 16:58, 1 March 2015
Setup
We are using a fairly standard SpamAssassin setup, close to the default. Any variations from default MUST be noted here. Spam is getting out of hand, so the most basic setup is no longer sufficient.
Basic
A reference in /etc/postfix/master.cf tells the mail system to use SpamAssassin, i.e. the "spamd" daemon.
You can check that spamassassin does not have errors in the configuration with:
spamassassin --lint
To make sure it is tagging spam properly you can send it a test:
spamassassin -D < /usr/share/doc/spamassassin-3.3.1/sample-spam.txt
There is also a non-spam sample file (sample-nonspam.txt) in the same directory. Note that -D produces a large amount of debugging output and is not needed for testing basic functionality.
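For a quick check without the debug output, it can be enough to look for the "X-Spam-Flag: YES" header that SpamAssassin adds to messages it classifies as spam. The following is a minimal shell sketch, assuming spamassassin writes the tagged message to standard output; the helper name is_flagged_spam is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the message on stdin carries
# "X-Spam-Flag: YES" in its header block (the body is ignored).
is_flagged_spam() {
    awk '
        /^$/ { exit }                                      # blank line ends the headers
        tolower($0) ~ /^x-spam-flag:[ \t]*yes/ { found = 1; exit }
        END { exit !found }                                # status 0 only if the header was seen
    '
}

# Example use (path as in the command above):
#   spamassassin < /usr/share/doc/spamassassin-3.3.1/sample-spam.txt | is_flagged_spam && echo "tagged as spam"
```

Running the same pipeline on the non-spam sample should print nothing, since that message should not be flagged.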
Detailed info on spamassassin setup is found at: http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Conf.html
Sieve
For spam filtering to work, each user needs a sieve script that directs the spam somewhere else. The most basic .sieve script is:
<pre>
# a simple SPAM filter
#
require "fileinto";

if header :contains "X-Spam-Flag" "YES" {
    #
    # move messages with "X-Spam-Flag: YES" header
    # into "spam" folder
    #
    fileinto "INBOX.SPAM";
}
</pre>
Plugins
SpamAssassin plugins are found in: /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin
AutoWhiteList
See: http://wiki.apache.org/spamassassin/ManualWhitelist
The old whitelist plugin was called AWL; the new one, which we have used since September 2014, is TxRep. It automatically adds spam messages to a blacklist and non-spam (i.e. ham) to a whitelist.
It is turned on in init.pre:
<pre>
# Learning Module: see http://truxoft.com/resources/txrep.htm
loadplugin Mail::SpamAssassin::Plugin::TxRep
</pre>
With some options in local.cf:
<pre>
#
# For the TxRep module
#
header   TXREP eval:check_senders_reputation()
describe TXREP Score normalizing based on sender's reputation
tflags   TXREP userconf noautolearn
priority TXREP 1000
</pre>
Currently it is set up to use the USER'S directory to store the spam/not-spam lists, which is also true for the Bayes analysis. The user MUST have a directory ".spamassassin". To initialize the files in that directory, run:
sa-learn --sync # For Bayes.
To check if there is a list:
sa-learn --dump
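The Bayes database also keeps counters of how many spam and ham messages it has learned, and with default settings SpamAssassin does not apply Bayes scores until at least 200 of each have been learned. The sketch below pulls those counters out of "sa-learn --dump magic". It assumes the usual output layout, in which the count is the third column of the "non-token data: nspam" and "non-token data: nham" lines; the function name bayes_counts is illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: read "sa-learn --dump magic" output on stdin and
# print the learned spam and ham counts on one line.
bayes_counts() {
    awk '
        /non-token data: nspam$/ { nspam = $3 }   # number of spam messages learned
        /non-token data: nham$/  { nham  = $3 }   # number of ham messages learned
        END { printf "nspam=%d nham=%d\n", nspam, nham }
    '
}

# Example use:
#   sa-learn --dump magic | bayes_counts
```

If either counter is missing from the input, it is reported as 0, which is also what a freshly initialized database would show.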
To train it from an IMAP mailbox folder named SpamLearn that contains spam messages:
fetchmail -a -n --folder SpamLearn einstein.unh.edu -m "sa-learn --spam --single"
Add --keep to the fetchmail line if you don't want these messages to be deleted automatically.
Similarly, you can teach it what good email looks like, here from a folder named GoodMail:
fetchmail -a -n --folder GoodMail --keep einstein.unh.edu -m "sa-learn --ham --single"
SPAM Blacklist & Personal Configurations
There are many blacklists for known spammers. We should use them! SpamAssassin does check the blacklists; the trouble is that it is too timid in labeling the resulting hits as spam, so you get lots of spam in your inbox.
You can check whether an IP address is listed in the block list by following the "Check using dig" recipe at http://daemonforums.org/showthread.php?t=302:
<pre>
take IP address a.b.c.d and reverse and add zen.spamhaus.org, then do a dig on that, i.e.:
dig d.c.b.a.zen.spamhaus.org
If it returns a 127.0.0.x it is a confirmed spammer
</pre>
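The reversal step is easy to get wrong by hand, so it can be scripted. The following is a minimal sketch for IPv4 addresses only; the function name dnsbl_name is made up for illustration, and the zone defaults to zen.spamhaus.org as in the recipe above.

```shell
#!/bin/sh
# Hypothetical helper: turn IPv4 address a.b.c.d into the DNSBL query
# name d.c.b.a.<zone>. Prints nothing if the input does not have four
# dot-separated fields.
dnsbl_name() {
    ip=$1
    zone=${2:-zen.spamhaus.org}
    echo "$ip" | awk -F. -v zone="$zone" \
        'NF == 4 { print $4 "." $3 "." $2 "." $1 "." zone }'
}

# Example use: an empty answer from dig means the address is not listed.
#   dig +short "$(dnsbl_name 192.0.2.1)"
```

The optional second argument lets the same helper build query names for other DNS block lists.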
You can check the details (really detailed!) of how spamassassin scores a message by saving the full message and then piping it into spamassassin:
spamassassin -D < message.elm
Part of the trouble with the DNS blacklists is that the default SpamAssassin settings do not score them high enough. The following settings help catch more spam. They can be fine-tuned further by adding score rules for other tags, such as URIBL_BLACK. Note that this can produce false positives (i.e. good mail marked as spam); see "Testing for specific tags" below for how to test first whether you would have gotten any. To increase the spam score for any of the SpamAssassin tags, edit the file ~/.spamassassin/user_prefs and add lines such as the following:
<pre>
# spamhaus DBL
#
score URIBL_DBL_SPAM 7.0
score URIBL_SBL 3.0
# Abusebutler
score URIBL_AB_SURBL 5.0
#
</pre>
You can whitelist particular domains as well. This speeds up processing:
<pre>
whitelist_from *.jlab.org
whitelist_from *.unh.edu
whitelist_from *.google.com
whitelist_from *.gmail.com
whitelist_from *.yahoo.com
</pre>
Testing for specific tags
If you want to fine-tune your spam scores, you may want to test whether earlier messages in a particular "good" mail folder were tagged with a given spam tag. You can use fetchmail to check. The following example checks for URIBL_BLACK in a folder MyMail:
fetchmail -a -n --folder MyMail -s --keep einstein -m "grep URIBL_BLACK"
If it finds any, you may not want to set your URIBL_BLACK score too high, since mail from that source would then be labelled spam.
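A plain grep also matches the tag name if it happens to appear in the message body. A slightly stricter check looks only at the tests= list in the X-Spam-Status header. The sketch below assumes the usual header layout ("X-Spam-Status: ... tests=RULE_A,RULE_B ...", possibly folded across several lines); the function names spam_tests and has_tag are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: print the comma-separated rule names from the
# X-Spam-Status header of the message on stdin.
spam_tests() {
    awk '
        /^$/ { exit }                                   # stop at the end of the headers
        /^[^ \t]/ { in_status = 0 }                     # a new header field starts here
        tolower($0) ~ /^x-spam-status:/ { in_status = 1 }
        in_status { printf "%s", $0 }                   # join folded continuation lines
        END { print "" }
    ' | sed -n 's/,[[:space:]]*/,/g; s/.*tests=\([A-Z0-9_,]*\).*/\1/p'
}

# Hypothetical helper: succeed if rule TAG was hit, e.g.
#   has_tag URIBL_BLACK < message.elm && echo "hit"
has_tag() {
    spam_tests | tr ',' '\n' | grep -qxF -- "$1"
}
```

Saved as a small script, a check like this could replace the grep in the fetchmail -m argument; that wiring is left out here because it depends on where the script is installed.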
Important Note
SpamAssassin needs to have a user account named spamd, and this has to be a local account as well as being in the LDAP database.
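A quick way to check the local half of this requirement is sketched below. It assumes a Linux system where /etc/passwd holds only the local accounts, while getent consults every configured account source (including LDAP, if it is set up in nsswitch.conf). The helper name has_local_account is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if USER has an entry in the given passwd
# file (default /etc/passwd), i.e. is a local account.
has_local_account() {
    user=$1
    file=${2:-/etc/passwd}
    awk -F: -v user="$user" '$1 == user { found = 1 } END { exit !found }' "$file"
}

# Example use:
#   has_local_account spamd && echo "local account: ok"
#   getent passwd spamd >/dev/null && echo "account visible to the system: ok"
```

Because getent also finds local accounts, a successful getent lookup does not by itself prove that the LDAP entry exists; that still has to be confirmed with your LDAP tools.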