Hello All,
Hope someone might be able to point me in the right direction with this problem. I'm trying to generate a report using a bash script that lists a pile of email addresses and the number of times each one appears in a log file (spammers), but I only want to search for these email addresses by the 'domain name'.
e.g. I have the following email addresses in my log file:
which doesn't really work as it assumes that leaving out the first set of characters before the first period is enough... which it ain't! I need to work from the end and work backwards. Any ideas??
I'm new to all this so can you explain the bash string handler to me? or just a name of the command or something.
The recomp looks a bit complex for me to use... going by its man page.
The number of records I'm sorting through is roughly 1.3 million, so I can't hold it all in memory; I must dump everything into files during the whole process.
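One way to keep memory use low at that scale is to stream the addresses through `sort` and `uniq -c`, since `sort` generally spills to temporary files on large input rather than holding everything in memory. This is only a minimal sketch, assuming one email address per line on standard input; the function name `count_domains` and the file names in the comment are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read one email address per line on stdin and
# print "count domain" pairs, most frequent domain first.
count_domains() {
    sed 's/^.*@//' |                  # drop everything up to the last "@"
        sort |                        # group identical domains together
        uniq -c |                     # prefix each domain with its count
        sort -rn |                    # highest count first
        awk '{ print $1, $2 }'        # strip the padding uniq -c adds
}

# Example use (addresses.txt and report.txt are assumed file names):
#   count_domains < addresses.txt > report.txt
```

Note that this counts the full part after the "@", so mail.example.com and example.com would be counted separately.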
Fair enough.
I read through the awk page and tried using awk -F. '{print $fieldno}' to separate the email addresses into different fields. However, since the number of actual 'fields' varies from address to address, I'm kinda back to square one. Is there something handy that will let me go directly to the last field of each email address and work backwards from there?
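For what it's worth, awk keeps the number of fields on the current line in `NF`, so `$NF` is the last field and `$(NF-1)` is the one before it, however many fields there are. A minimal sketch, assuming one address per line and that the last two dot-separated parts are what you want to group on; the function name `last_two_fields` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: for each address on stdin, print the last two
# dot-separated parts of the domain (e.g. "example.com").
last_two_fields() {
    awk -F. '{
        sub(/^.*@/, "")               # drop the local part; awk re-splits $0
        if (NF >= 2)
            print $(NF-1) "." $NF     # count back from the last field
    }'
}

# Example use:
#   printf "%s\n" "user@mail.example.com" | last_two_fields
```

This would be wrong for domains such as example.co.uk, where the last two parts are just "co.uk", so those would need a special case.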
Anytime there are fewer than 5 "parts" the trailing variables will be null.
dmname has the name of the domain, plus the hostname sometimes.
var1...var5 parse out each component so you can use them.
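The code that went with this description did not survive in the thread. The following is only a guess at one way to get that behaviour in bash, not the original poster's script: it reuses the names `dmname` and `var1`...`var5` from the description, while the function name `split_domain` is made up. It relies on `read` leaving trailing variables empty when there are fewer parts than variables.

```shell
#!/bin/bash
# Hypothetical reconstruction: split the domain of one address into
# up to five dot-separated parts.
split_domain() {
    local address=$1
    dmname=${address##*@}             # everything after the last "@"
    # Split on "." into var1..var5; unused trailing variables end up empty.
    # If there are more than five parts, var5 receives the remainder.
    IFS=. read -r var1 var2 var3 var4 var5 <<< "$dmname"
}

# Example use:
#   split_domain "user@mail.example.com"
#   echo "$dmname / $var1 / $var2 / $var3"
```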
Well, thanks for all the advice. I've found a way to do what I need to do. However, the code isn't the best and can crash out under certain circumstances, but here it is: