spamassassin CHARSET_FARAWAY not working

mfoley · 05-03-2019, 06:50 AM

I posted this question 5 years ago, but didn't really get much response. I'll try again since I've had the problem recently.

In my spamassassin local.cf file I have:

score CHARSET_FARAWAY 3.0
score CHARSET_FARAWAY_HEADER 3.0
ok_locales en

Nevertheless, I received dozens of messages with the subject:

Subject: Re: Fwd: Undeliverable: Личное для тебя

Yet, no charset score assigned at all.

Why?

scasey · 05-03-2019, 01:26 PM

As I read it, CHARSET_FARAWAY_HEADER only fires if the character set of the header field (the Subject, in this case) is different from the character set of the body.
ok_locales defaults to all; you've set it to en. CHARSET_FARAWAY tests the character set of the body. Was the body of the email in English? The header you posted appears to be in English.

Code:

SpamAssassin Rule: CHARSET_FARAWAY

Standard description: Character set indicates a foreign language

Explanation

The content of the mail is in a character set not permitted by the value of the ok_locales configuration setting.

And, again, please use [code] tags when posting code or output, for easier readability. Thank you.

mfoley · 05-03-2019, 07:58 PM

Here's the entire head of one of the messages (and part of the body). The Subject is in the header, right? Seems to me that Russian characters should trap on at least the CHARSET_FARAWAY_HEADER rule. This message is a bounce of spam that some spammer sent while spoofing one of our users.

If I'm not using the correct rule to trap this charset, what would be the correct rule?

Code:

From  Thu May  2 13:28:39 2019
Return-Path: <>
Received: from EUR04-DB3-obe.outbound.protection.outlook.com (mail-db3eur04hn2101.outbound.protection.outlook.com [52.101.138.101])
        by mail.hprs.local (8.15.2/8.15.2) with ESMTPS id x42HSSYh026689
        (version=TLSv1.2 cipher=AES256-SHA256 bits=256 verify=OK)
        for <dblosser@ohprs.org>; Thu, 2 May 2019 13:28:30 -0400
X-Virus-Status: Clean
X-Virus-Scanned: clamav-milter 0.99.2 at mail
Authentication-Results: mail.hprs.local;
        dkim=fail reason="signature verification failed" (1024-bit key) header.d=narfu.onmicrosoft.com header.i=@narfu.onmicrosoft.com header.b=gLRv7mkz
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=narfu.onmicrosoft.com;
 s=selector1-narfu-onmicrosoft-com;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=2C4SVYYlxhuzTXZCX127//qEFQjidTVi9jZEbh6Jfhw=;
 b=gLRv7mkzWLxcqoXHsGsaV25LAJ9bt2dFlyaT2LbxyfGEwudyPKmmxEcLILQZv7rsUEZ4Jy0o0kc608167nYunVUXuLoZXixLZvkju7Dn6fs2ir+bKyPTkGjJcsG2TtOjFRDRO+AFp9YIfnSJ76DepncXbXU0NcKSo3BLoX4RhAA=
MIME-Version: 1.0
From: <postmaster@narfu.onmicrosoft.com>
To: <dblosser@ohprs.org>
Date: Thu, 2 May 2019 17:27:24 +0000
Content-Type: multipart/report; report-type=delivery-status;
        boundary="9ef5cc7c-2879-405a-87ae-1f81816b29a5"
X-MS-Exchange-Message-Is-Ndr:
Content-Language: en-US
Message-ID:
 <5199de0e-9f31-4ee0-9e3f-047f2ffbe3c5@HE1PR0202MB2585.eurprd02.prod.outlook.com>
In-Reply-To: <144b5676e-da854e2f@ohprs.org>
References: <144b5676e-da854e2f@ohprs.org>
Subject:
 =?utf-8?B?VW5kZWxpdmVyYWJsZTog0JvQuNGH0L3QvtC1INC00LvRjyDRgtC10LHRjw==?=
Auto-Submitted: auto-replied
X-MS-PublicTrafficType: Email
X-MS-TrafficTypeDiagnostic: HE1PR0202MB2585:
X-MS-Exchange-PUrlCount: 5
X-Microsoft-Antispam-PRVS:
        <HE1PR0202MB258582420B006AD4F57AB654B8340@HE1PR0202MB2585.eurprd02.prod.outlook.com>
X-MS-Oob-TLC-OOBClassifiers: OLM:9508;
X-Forefront-PRVS: 0025434D2D
X-Forefront-Antispam-Report:
        SFV:NSPM;SFS:(10019020)(50650200002)(39860400002)(346002)(366004)(396003)(136003)(376002)(189003)(199004)(1930700014)(53946003)(476003)(224303003)(1476002)(2351001)(2876002)(10126004)(486006)(2906002)(83832001)(68736007)(16586007)(78352004)(6346003)(590304002)(6916009)(31696002)(316002)(498600001)(81166006)(53936002)(86442003)(1706002)(86902001)(42186006)(11286001)(66576008)(66946007)(52396003)(6306002)(14444005)(76176011)(71190400001)(86152003)(446003)(64872007)(66574012)(11346002)(236005)(786003)(78496005)(5320300001)(5660300002)(9686003)(606006)(733005)(81156014)(52230400001)(30864003)(42882007)(89962001)(84326002)(33964004)(74316002)(73956011)(82146005)(579004)(569006);DIR:OUT;SFP:1501;SCL:1;SRVR:HE1PR0202MB2585;H:;FPR:;SPF:None;LANG:en;PTR:InfoNoRecords;MX:0;A:0;
Received-SPF: None (protection.outlook.com:  does not designate permitted
 sender hosts)
Authentication-Results: spf=none (sender IP is ) smtp.mailfrom=<>; 
X-MS-Exchange-SenderADCheck: 1
X-Microsoft-Antispam-Message-Info:
        tm8V65RrqWdhOWAkvdMAySjMzEpiiHxiP0P1xzJ5DRZHnQpz7czKszfWj0Suw7SA9jGXVxbjI/3L09tyh3hrLxNauEL26sf787O5a514Gq/c6ze9K7oqDBAumMm7mfgs/hFdFnxUVfIL7nBeYV4vAvjApT/o1kgV1NqihmGsARbBk6ikoSvAe/okA3iDKgDQSt9wbNtIm2MZicWa3JVTqCvOeJoA6e+36SVP+hg+GdLSlV6jADzV6VWes70gwm8w9xYx5giGch99ohN87upgyDDr8sfuTM6T7TSAez9sHfOaVUkwC8qNjocbzwB3Afah/TgNHrBsbZY//I/guLbplXwwsZ7Qa26O4squHgCuKpM2doWAg8DspX7n59qNSkiYOAO26fkuLBi7Nq9qOBXsWFfEey8F6/2QHRJ17l2C19s=
X-OriginatorOrg: narfu.onmicrosoft.com
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 02 May 2019 17:27:24.7363
 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-CrossTenant-Network-Message-Id:
        438ee25f-a95c-4748-5ee5-08d6cf237113
X-MS-Exchange-Transport-CrossTenantHeadersStamped: HE1PR0202MB2585
X-Spam-Status: No, score=0.2 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED,
        HTML_MESSAGE autolearn=no autolearn_force=no version=3.4.1-_revision__1.26__
X-Spam-Report: 
        *  0.0 HTML_MESSAGE BODY: HTML included in message
        *  0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily
        *      valid
        *  0.1 DKIM_INVALID DKIM or DK signature exists, but is not valid
X-Spam-Checker-Version: SpamAssassin 3.4.1-_revision__1.26__ (2015-04-28) on
        mail.hprs.local
Status: R

--9ef5cc7c-2879-405a-87ae-1f81816b29a5
Content-Type: multipart/alternative; differences=Content-Type;
        boundary="b69709ca-89c1-42a7-b522-69c04d1339aa"

--b69709ca-89c1-42a7-b522-69c04d1339aa
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

[http://products.office.com/en-us/CMSImages/Office365Logo_Orange.png?versio=
n=3Db8d100a9-0a8b-8e6a-88e1-ef488fee0470]
Your message to apaid@agtu.ru couldn't be delivered.

apaid wasn't found at agtu.ru.

dblosser        Office 365      apaid
Action Required                 Recipient


Unknown To address

buttugly · 05-03-2019, 08:51 PM

Here are a couple of links I found while looking around. Dunno if they help or not.

https://discussions.apple.com/thread/807776
http://spamassassin.1065346.n5.nabbl...s-td54131.html

By that, you disabled Russian and Korean detection, err, guessing.

The TextCat plugin thus will never detect any mail written in either
Russian or Korean -- and subsequently will not be able to tell it is an
UNWANTED_LANGUAGE_BODY, because it didn't guess any language.

In other words: TextCat will not identify text to be written in the
languages specified by inactive_languages, because you told it to. If it
can't guess a language with sufficient confidence, it can't be sure it
is outside the list of ok_languages either.

scasey · 05-03-2019, 09:02 PM

The body of the email is in English, and the Subject header is declared as UTF-8, which is not a foreign charset, so there's nothing there to trigger the CHARSET_FARAWAY rule. Again, the _HEADER rule only checks to see if the LANG is different from that of the body.
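One way to see what that Subject header actually declares is to decode it outside SpamAssassin. The following is a minimal sketch using Python's standard email.header module, not anything SpamAssassin itself runs; the raw value is copied from the posted headers, and the variable names are only illustrative.

```python
from email.header import decode_header

# Raw Subject value copied from the posted bounce headers.
raw_subject = "=?utf-8?B?VW5kZWxpdmVyYWJsZTog0JvQuNGH0L3QvtC1INC00LvRjyDRgtC10LHRjw==?="

# decode_header returns (chunk, charset) pairs, one per encoded-word.
parts = decode_header(raw_subject)

# The charset label each encoded-word declares (None for unencoded text).
declared_charsets = [charset for _, charset in parts]

# Join the chunks back into readable text using their declared charsets.
decoded_subject = "".join(
    chunk.decode(charset or "ascii") if isinstance(chunk, bytes) else chunk
    for chunk, charset in parts
)

print(declared_charsets)
print(decoded_subject)
```

The declared label is utf-8 even though the decoded text contains Russian words, so a check that looks only at the charset name has nothing Russian-specific to react to.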

Here's a rule I wrote to mark encoded Subjects:

Code:

# :raw matches the header before MIME decoding; /i also catches lowercase "utf-8"
header UTF_8_RULE Subject:raw =~ /=\?UTF-8\?/i
score UTF_8_RULE 10
describe UTF_8_RULE look for UTF-8 in subject and mark as spam

With a score of 10, I never see them. It's in my ~/.procmailrc, so not a server-wide rule.

This bounce was from outlook.com. Is that how you send email? If not, you could develop a rule that says a BOUNCE of mail that didn't originate on your server gets a high spam score.

I never see bogus BOUNCES, but I'm not recalling why. Something I programmed a hundred years ago. I'll post it when I find/remember it. I definitely toss DOUBLE BOUNCES (spam to a bogus address on my server, bounced to the sender, who bounced it back); I put those in the bit bucket.

mfoley · 05-06-2019, 11:41 AM

buttugly: Thanks for those links, but I believe scasey is pointing out that the message body/header DOES NOT specify a foreign or non-English charset.

scasey: So, perhaps I'm stuck on this one? I used to trap on UTF-8 subject encoding, but many legitimate senders' messages are now UTF-8 encoded in the subject, so I had to start letting those through.
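A narrower test than "the Subject is UTF-8 encoded" is "the decoded Subject contains Cyrillic letters". The sketch below shows that idea in Python only, as an illustration; it is not a SpamAssassin rule, the function name is hypothetical, and the Cyrillic range U+0400..U+04FF is an assumption about which characters matter.

```python
from email.header import decode_header, make_header


def subject_has_cyrillic(raw_subject: str) -> bool:
    """Return True if the decoded Subject has a character in U+0400..U+04FF."""
    decoded = str(make_header(decode_header(raw_subject)))
    return any("\u0400" <= ch <= "\u04ff" for ch in decoded)


# Subject from the posted bounce (UTF-8, contains Russian text).
russian = "=?utf-8?B?VW5kZWxpdmVyYWJsZTog0JvQuNGH0L3QvtC1INC00LvRjyDRgtC10LHRjw==?="
# A UTF-8 encoded Subject that decodes to plain "Hello".
english = "=?utf-8?B?SGVsbG8=?="

print(subject_has_cyrillic(russian))
print(subject_has_cyrillic(english))
```

Both subjects are UTF-8 encoded, but only the first contains Cyrillic, which is the distinction a blanket UTF-8 rule cannot make.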

No, we do not send mail via outlook.com. We have a self-hosted mail server. The probable scenario for this message was that someone in Kreblekistan (making that up) sent a bunch of SPAM all over, masquerading as one of our users. In this case, the spammer's message to apaid@agtu.ru was to a non-existent user; hence the bounce. Not sure how I could trap that since the real user on our domain could send a message to a non-existent user somewhere. I don't see anything in this message that would distinguish the real user from the spammer/spoofer. I've added the following rule to local.cf. It should help at least with bounces from Russian email addresses.

Code:

body LOCAL_LOOP_RUSSIA          /mail for \S+\.ru loops back|Your message to \S+\.ru couldn't be delivered/i
score LOCAL_LOOP_RUSSIA         5.0
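A body pattern like this can be sanity-checked against sample lines before it goes into local.cf. The sketch below uses Python's re module as a stand-in for Perl regex syntax (the constructs used here behave the same in both). It contrasts a shell-glob style `*.ru` with the regex form `\S+\.ru`; the first sample line is taken from the posted bounce, and the second is made up.

```python
import re

# Body line from the posted bounce, plus a made-up non-.ru case.
ru_bounce = "Your message to apaid@agtu.ru couldn't be delivered."
other_bounce = "Your message to someone@example.com couldn't be delivered."

# Glob style: in a regex, " *" means "zero or more spaces" and "." means
# "any single character", so this does not mean "anything ending in .ru".
glob_style = re.compile(r"Your message to *.ru couldn't be delivered", re.I)

# Regex style: \S+ consumes the address text and \. is a literal dot.
regex_style = re.compile(r"Your message to \S+\.ru couldn't be delivered", re.I)

for name, pattern in (("glob", glob_style), ("regex", regex_style)):
    print(name, bool(pattern.search(ru_bounce)), bool(pattern.search(other_bounce)))
```

The glob-style pattern misses the real bounce line, while the regex form matches it and still ignores the non-.ru address.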

scasey · 05-06-2019, 11:55 AM

The fact that the message being bounced didn't originally come from your server would be a clue to mark it as spam (or delete it).
If your user mis-typed an address, the bounce would reflect that the original message DID come from your server, so you'd allow that to go through.

To accomplish the former, you need to test for a bounce:

Code:

Return-Path: <>

and that the original message did not come from your server. You'd have to look at "legitimate" bounces to see what those (ones that DID originate from your server) look like.

Your rule might help, if the spam was originally sent to Russian email, but I'd include a test for the bounce as well.

Still trying to find/remember why I seldom see invalid BOUNCEs...I see legit ones all the time.

Edit: Just checked a "legit" BOUNCE, the report came from my server!
That is, the actual BOUNCE message was from my server, not from the remote server where the address was invalid. So it would be as simple as testing the contents of the From: header.

Your posted example said

Code:

From: <postmaster@narfu.onmicrosoft.com>

That's not your server, so the BOUNCE is bogus, and can be marked as spam or rejected.
If you create a rule setting the points high enough, the bounce will be returned to the sender (in this case, Microsoft), who will probably just delete it as a double-bounce (which it then would be). I route all double-bounces to the bit bucket.

You'll need to confirm that your "legit" bounces come from your server, but I believe that's how all MTAs work:

Try to deliver
Get rejected
Report the rejection to the sender

This is completely untested!

Code:

header __TEST1 Return-Path =~ /<>/
header __TEST2 From !~ /myserver\.com/i
meta BOGUS_BOUNCE ( __TEST1 && __TEST2 )
score BOGUS_BOUNCE 5
describe BOGUS_BOUNCE Mark bounces not from my server as spam
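The same two checks can be prototyped outside SpamAssassin to see how they behave on real headers before trusting the rule. This is only a sketch under stated assumptions: the function name is hypothetical, myserver.com is the placeholder domain from the rule above, and the "local" sample headers are invented for illustration.

```python
from email import message_from_string
from email.utils import parseaddr


def is_bogus_bounce(raw_headers: str, own_domain: str) -> bool:
    """True if the message is a bounce (empty Return-Path) not sent by own_domain."""
    msg = message_from_string(raw_headers)
    # A null envelope sender ("<>") marks a delivery status notification.
    is_bounce = (msg.get("Return-Path") or "").strip() == "<>"
    # Extract the domain part of the From: address.
    _, from_addr = parseaddr(msg.get("From") or "")
    from_domain = from_addr.rpartition("@")[2].lower()
    from_own = from_domain == own_domain or from_domain.endswith("." + own_domain)
    return is_bounce and not from_own


# Headers as in the posted example: a bounce generated by a remote server.
remote = "Return-Path: <>\nFrom: <postmaster@narfu.onmicrosoft.com>\n\n"
# Invented example of a bounce generated by the local server.
local = "Return-Path: <>\nFrom: Mail Delivery Subsystem <MAILER-DAEMON@mail.myserver.com>\n\n"

print(is_bogus_bounce(remote, "myserver.com"))
print(is_bogus_bounce(local, "myserver.com"))
```

Whether legitimate bounces always carry a From: address in the local domain still has to be confirmed against real bounces on the server in question.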

mfoley · 05-10-2019, 09:47 AM

scasey: Getting more of these today. I'll try your rule.