LinuxQuestions.org
Email Filter With Procmail: Filter Received-Headers By IP-Range

Posted 07-12-2021 at 03:02 PM by Michael Uplawski
Updated 07-30-2021 at 04:38 PM by Michael Uplawski (missing words and letters ... this no good, wrong words, too. Even worze.)

A styled version of this text in French: http://www.uplawski.eu/articles/Linu..._Procmail.html


Introduction
I have not found much on the Web about the feasibility of an idea that I had many years ago and then forgot for a while:
Filter email by Received-header, if the IP-address in this header belongs to a specific range of IP-addresses.

This is nothing revolutionary, but I lacked knowledge and could not find a way to configure such a filter with the software at my disposal. The discovery of online-services which transform IP-Ranges into Regular Expressions revived my interest in the topic.

Here is the thread that I started on LQ. It finally helped me get going, first of all by making me abandon the idea of working with regular expressions: [Regex matching IP-Ranges] looking for alternative software solution

I write this blog-entry because
  • my currently working solution may be the only one, explained in detail (here, in this blog)
  • all I do may be wrong anyway
  • you might have a better idea and could react
Why?
I want to filter IP-ranges, because
  • I receive almost the same kind of SPAM, always from the same Internet domain
  • My Bayesian Filters are not efficient against this type of fraudulent mail, because I have white-listed too many of its attributes in the past. Such mail will pass the Bayesian Filters unharmed until I have flagged a lot of it as SPAM again.
  • For months, my reports to the abuse-department of the company that owns the domain have not changed anything
  • I do not expect anything valuable to come from there
  • I like the idea of preventing part of the Internet from annoying me
I hope this is sufficient as an answer; if not, just consider me dumb that way.

I do not advocate filters against whole IP-ranges as an efficient way to fight SPAM. I do not even think that my procedure could be widely adopted to improve our “Email-experience”... And I cannot know why you are interested in this blog-entry...
The IP-ranges that I am talking about
216.58.192.0 - 216.58.223.255
This is one such IP-range, and this is the one attributed to Google.com. You will normally not encounter any of the IPs in this block of addresses in your mail-headers. I use it as an example.

To find this information, I first executed a simple Ping on google.com:
Code:
slarti@magrathea:~$ ping google.com
Then I did a “whois” on the pinged address:
Code:
slarti@magrathea:~$ whois 216.58.214.78
The result is an entry from the ARIN database, listing a lot of information about the owner of this one IP-address, but the very first line below the introductory comment is already
NetRange: 216.58.192.0 - 216.58.223.255

The same procedure, in whole or in part, can be used to discover the IP-range that any registered IP-address belongs to, or the IP-range which is associated with an Internet-Domain.
Received-Headers in Email
Every mail that you receive will contain two or more Received-headers. In their entirety they – normally – describe the path which the mail has taken from the sender to your mail-server. This is – in fact – the only way to deduce the true origin of an email, when you only have the message to look at. Here is a page which talks about how to read these headers in the context of SPAM fighting:
https://rig.cs.luc.edu/~rig/home/bin/mail/forging.html
When you can trace back an email to the original sender's mail-service, you know the IP-address of the mail-server employed.
As an example, this is the Received-header which identifies GitHub as the source of an automated message that I have received (slightly altered for this example, so not completely authentic):
Code:
Received: from o4.sgmail.github.com (o4.sgmail.github.com [192.254.112.99])
        (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
        (No client certificate requested)
        by [my smarthost] (Postfix) with ESMTPS id 5F909100C64
        for <me@my_domain>; Wed, 12 May 2004 06:53:40 +0200 (CEST)
Note that this is neither the first (topmost) nor the last Received-header in the message. To find out if 192.254.112.99 is really the injecting server, you must analyze the remaining headers, too.
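Pulling the candidate IP-addresses out of the Received-headers can be sketched in a few lines of Ruby. This is only an illustration, not the code of any existing tool: the method name received_ips, the regular expression and the host names in the sample are assumptions, and the sketch only looks at IPv4 addresses in square brackets.

```ruby
# Sketch: collect the bracketed IPv4 addresses from the Received-headers
# of a raw message. Method name and regex are assumptions for illustration.
def received_ips(raw_mail)
  # The header block ends at the first empty line.
  header = raw_mail.split(/\r?\n\r?\n/, 2).first.to_s
  # Unfold continuation lines (they start with a space or a tab).
  unfolded = header.gsub(/\r?\n[ \t]+/, ' ')
  unfolded.lines
          .grep(/\AReceived:/i)
          .flat_map { |line| line.scan(/\[(\d{1,3}(?:\.\d{1,3}){3})\]/).flatten }
end

# Sample message with placeholder host names (hypothetical).
sample = <<~MAIL
  Received: from o4.sgmail.github.com (o4.sgmail.github.com [192.254.112.99])
          by smarthost.example (Postfix) with ESMTPS id 5F909100C64
          for <me@example.org>; Wed, 12 May 2004 06:53:40 +0200 (CEST)
  Subject: test

  An address in the body, like [10.0.0.1], is not looked at.
MAIL

p received_ips(sample) # prints ["192.254.112.99"]
```

The sketch does not decide which of the addresses is the injecting server. It only lists what the headers claim.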
Procmail
Procmail is a program which filters incoming mail in any conceivable way. I use it to distribute some emails to dedicated folders or to delete unwanted mail, based on the properties of those messages that Procmail is able to discern.
The rules which define the filters are assembled in Procmail's configuration file, usually
Code:
~/.procmailrc
If you have procmail installed on your system, there are man pages for the procmail program (man procmail), the configuration file (man procmailrc), filter examples (man procmailex) and an advanced scoring technique, called “weighted scoring technique” (man procmailsc).
Is a given IP-address part of a specific IP-range?
This is about the only “serious” problem to resolve, before a mail-filter can act on IP-ranges. Once a solution is found, all that is needed is a list of IP-ranges to filter and a Procmail-recipe to apply the new filter to any Received-header found in your incoming mail.

My original idea had been a Regular Expression that matches any IP-address which belongs to a given IP-range. This is a very complicated approach and, as it turned out, complete overkill: An IP-address is just an awkward way to write a number. An IP-range thus simply names the lowest and the highest value of a numerical series, and any other number is either in the series or not.
The answer to the question “How do I find out if an address is part of a specific IP-range” was given by pan64 in the very first reaction to my thread here on LQ:
  • implement a function to validate and convert ipv4 [s1.s2.s3.s4] to a number: ((s1*256+s2)*256+s3)*256+s4
  • use numerical comparison to know if an ip is "in range"
In a programmed routine, you can use library-functions to identify and verify IP-addresses and probably to compare them. The simple arithmetic used in the above proposition, however, fits in a one-liner and appears to be sufficient for the purpose of filtering IP-ranges.
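The two steps proposed by pan64 could look like this in Ruby. This is a minimal sketch under my own assumptions: the method names ipv4_to_i and ip_in_range? are made up for the illustration and are not taken from any published tool.

```ruby
# Convert a dotted IPv4 string to a number, or return nil if it is invalid.
# Implements ((s1*256+s2)*256+s3)*256+s4. (Hypothetical helper name.)
def ipv4_to_i(address)
  octets = address.to_s.split('.', -1)
  valid = octets.size == 4 &&
          octets.all? { |o| o.match?(/\A\d{1,3}\z/) && o.to_i <= 255 }
  return nil unless valid

  octets.map(&:to_i).reduce(0) { |sum, octet| sum * 256 + octet }
end

# True if address lies between first and last, both ends included.
# (Hypothetical helper name.)
def ip_in_range?(address, first, last)
  n, lo, hi = [address, first, last].map { |a| ipv4_to_i(a) }
  return false if [n, lo, hi].any?(&:nil?)

  n.between?(lo, hi)
end

puts ipv4_to_i('216.58.214.78')                                     # prints 3627734606
puts ip_in_range?('216.58.214.78', '216.58.192.0', '216.58.223.255') # prints true
puts ip_in_range?('192.254.112.99', '216.58.192.0', '216.58.223.255') # prints false
```

Ruby's standard library also has the class IPAddr, whose to_i method yields the same number for an IPv4 address. The hand-written version only shows how little arithmetic is involved.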

I have written a Ruby script “ip_in_range” (a Ruby-Gem on rubygems.org) to do the actual comparison. It can be invoked in different ways with a varying number of program arguments, but for the integration in a mail-filter, the syntax is
Code:
ip_in_range < [email] [range_list.txt]
I pipe in the email-message to filter and name a text-file as the only program-argument to ip_in_range. This text-file contains a simple list of IP-ranges, one per line, like:
Code:
192.168.0.1 192.168.0.255
Some Evil Exemplary Range: 192.168.2.100 192.168.2.168
(...)
Text outside the IP-addresses is ignored by ip_in_range and can be used to comment an entry.
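Reading such a list is not difficult. The following Ruby sketch shows one way to do it. It is an assumption about how a parser for this format might work, not the actual code of ip_in_range, and the names IPV4 and parse_ranges are invented for the example.

```ruby
# Anything that looks like a dotted IPv4 address. (Assumed pattern; it does
# not check that each octet is 255 or less.)
IPV4 = /\b\d{1,3}(?:\.\d{1,3}){3}\b/

# Return one [first, last] pair per line that contains at least two
# addresses. All other text on the line is treated as a comment.
def parse_ranges(text)
  pairs = text.each_line.map do |line|
    first, last = line.scan(IPV4)
    [first, last] if first && last
  end
  pairs.compact
end

list = <<~LIST
  192.168.0.1 192.168.0.255
  Some Evil Exemplary Range: 192.168.2.100 192.168.2.168
  (...)
LIST

p parse_ranges(list)
# prints [["192.168.0.1", "192.168.0.255"], ["192.168.2.100", "192.168.2.168"]]
```

Lines without two addresses, like the “(...)” above, simply produce no range.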
The Procmail Recipe
Procmail can delegate tasks to external programs, either to react to a matching filter or to test a condition which depends on the exit code of a program. Such a call of a program in the condition of a Procmail-filter is introduced by a leading ?.
The following recipe tests whether the Received-headers of a mail contain any IP-address from a range that I want to filter:
Code:
:0
* !^FROM_DAEMON
* !^FROM_MAILER
* !^X-Loop: my_mail@address
* 1^0 ? ip_in_range ~/.procmail/range_list.txt
/home/[path to my mail-folder]/refused/ip_refused
This way, all mail which matches the filter will be written to a mail-folder “ip_refused” in the sub-directory “refused” of my mail-folder.
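The ? condition relies only on the exit code of the program: 0 means that the condition holds, anything else means that it does not. A program that fulfills this contract could be sketched in Ruby as follows. This is an illustration under my own assumptions and not the implementation of ip_in_range. The names to_number and exit_status_for are hypothetical, and here the comparison is left to IPAddr from Ruby's standard library.

```ruby
require 'ipaddr'

# Numeric value of an IPv4 string, or nil if IPAddr rejects it.
def to_number(ip)
  IPAddr.new(ip).to_i
rescue ArgumentError
  nil
end

# Returns 0 (success, the Procmail condition holds) if any bracketed
# address in the header text lies inside one of the ranges, otherwise 1.
def exit_status_for(header_text, ranges)
  numeric = ranges.map { |first, last| [to_number(first), to_number(last)] }
                  .reject { |lo, hi| lo.nil? || hi.nil? }
  ips = header_text.scan(/\[(\d{1,3}(?:\.\d{1,3}){3})\]/).flatten
  hit = ips.any? do |ip|
    n = to_number(ip)
    !n.nil? && numeric.any? { |lo, hi| n.between?(lo, hi) }
  end
  hit ? 0 : 1
end

ranges = [['192.254.112.0', '192.254.112.255']]
puts exit_status_for('Received: from a (a [192.254.112.99])', ranges) # prints 0
puts exit_status_for('Received: from b (b [10.0.0.1])', ranges)       # prints 1

# In a real filter script, the last line would hand the result to Procmail:
#   exit exit_status_for($stdin.read, ranges)
```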

When you define your own Procmail-filters, you should test them by feeding a test-mail to Procmail. Ideally, you write the first draft of a new filter to a separate configuration file and name that file as program-argument to Procmail. This way, your working configuration will not contain potentially misbehaving recipes for as long as it takes you to perfect them.

Ω