I keep training SpamAssassin with the spam that slips through, but I don't see any improvement in spam detection. The details are these:
Using ssh I trained SpamAssassin with examples of spam and ham I had saved for that purpose, perhaps 500 of each. I set up email recipes such that
spam@example.com would go through sa-learn --spam --mbox, and
ham@example.com would go through sa-learn --ham --mbox. Since then I have forwarded (as an attachment) any spam that slips through to spam@... and an equal amount of fresh ham to ham@...
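Because the samples are forwarded as attachments, each original message sits inside a wrapper message. Here is a minimal sketch, in Python, of how the originals might be pulled out of such a wrapper before anything is handed to sa-learn. It is an illustration only, not the recipe my host actually runs. The function name extract_forwarded is hypothetical, and it assumes the mail client attaches the original as a message/rfc822 part.

```python
from email import message_from_bytes, policy


def extract_forwarded(raw: bytes) -> list[bytes]:
    """Return every message/rfc822 attachment found in a forwarded mail.

    Hypothetical helper: assumes the forwarding client attaches the
    original message as a message/rfc822 MIME part.
    """
    wrapper = message_from_bytes(raw, policy=policy.default)
    originals = []
    for part in wrapper.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-item list
            # holding the embedded message object.
            inner = part.get_payload(0)
            originals.append(inner.as_bytes())
    return originals
```

Each returned item is the raw bytes of one original message, with its own headers, which is what a learner would need to see rather than the wrapper's headers.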
That worked well for about a year. Spam in the inbox went from 150 per day to fewer than one per week on average, with no false positives. Naturally, spam blooms appeared from time to time, but training dealt with them.
I wrote up my spam setup in greater detail in these notes to myself:
http://my.opera.com/wpost/blog/spamassassin
For the past month or so, however, a steady stream of nearly identical spam has been getting through, and despite training on all of it I see no change in its spam scores. Here are the relevant headers from a typical message:
(snip)
X-Spam-Status: No, hits=-6.6 required=3.5
tests=BAYES_00,RCVD_IN_DNSWL_MED,STOX_REPLY_TYPE autolearn=ham
version=3.002005
(snip)
X-AVES-Antispam: Maybe spam, 17.32 >= 4.00 [as:15.30 cc:0.00 hc:2.02
sa:17.32]
(snip)
Notice that a spam filter on an upstream server correctly flagged it, but my SpamAssassin did not. Nor does the score on these nearly identical messages seem to change with training.
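To compare these scores across several messages before and after training, the X-Spam-Status value can be split into its fields. This is a minimal sketch in Python. The function name parse_spam_status is hypothetical, and it assumes the score appears as either hits= (as on my server) or score=.

```python
import re


def parse_spam_status(value: str) -> dict:
    """Split an X-Spam-Status header value into its fields.

    Hypothetical helper: accepts folded (multi-line) header values and
    looks for the score under either 'score=' or 'hits='.
    """
    # The verdict is the "Yes" or "No" before the first comma.
    verdict = value.split(",", 1)[0].strip().lower()
    # Unfold the header, then close any gaps left after commas so the
    # tests list stays a single whitespace-free token.
    compact = re.sub(r",\s+", ",", " ".join(value.split()))
    fields = dict(re.findall(r"(\w+)=(\S+)", compact))
    score = fields.get("score", fields.get("hits"))
    tests = fields.get("tests", "")
    return {
        "flagged": verdict == "yes",
        "score": float(score) if score is not None else None,
        "required": float(fields["required"]) if "required" in fields else None,
        "tests": tests.split(",") if tests else [],
        "autolearn": fields.get("autolearn"),
    }


sample = """No, hits=-6.6 required=3.5
 tests=BAYES_00,RCVD_IN_DNSWL_MED,STOX_REPLY_TYPE autolearn=ham
 version=3.002005"""

status = parse_spam_status(sample)
```

Running this over a batch of the near-identical messages would show whether the score, the tests list, or the autolearn value moves at all after a training run.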
Another clue: previously my spam folder was receiving fresh spam every hour. Since this problem began, the spam folder has received almost nothing.
Sure, it would be easy to cook up a procmail recipe to filter on the upstream server's "maybe spam" header, but I'd rather fix the underlying problem with SpamAssassin, not cover it up.
Before coming here I searched the web, consulted the SpamAssassin articles in my web host's knowledge base, and read everything relevant at spamassassin.apache.org, but I remain stumped.
Any thoughts on what I might be doing wrong here, and what I might look at to improve spam training?