LinuxQuestions.org


dougp23 03-07-2008 07:37 AM

Need some form of grep
 
I am trying grep, egrep, fgrep, but am not getting the results I need.

I need to search our sendmail server, which archives a copy of every mail coming in and out. So it stores many thousands of emails! They are named according to a date stamp, but that's irrelevant. What I need is to find all emails between certain days, that have BOTH jsmith and twalker in them (we are trying to trace all emails these two have had with each other).

Right now, using the greps, I can get files returned that have EITHER, but I only want returns where BOTH are in the file.

Anyone?

stzein 03-07-2008 08:11 AM

I think you'll need a little script; something like this:

Code:

#!/bin/bash
# Print the name of every file that contains both "jsmith" and "twalker".
find /path/to/emails -type f -print0 | while IFS= read -r -d '' file; do
    # grep -q prints nothing and exits 0 as soon as a match is found
    grep -q "jsmith" "$file" && grep -q "twalker" "$file" && echo "$file"
done


Make sure to fill in the correct path to your mails and modify the options for find to narrow down the results.
Then save this in a text file "findmails.sh" and execute it with "bash findmails.sh". It will print the names of the files that contain both "jsmith" and "twalker".
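
As a rough sketch of what "modify the options for find" could look like for a date range: the example below assumes GNU find and touch, and assumes that each file's modification time matches the day the mail was archived (the thread only says the files are named by date stamp, so check that first). The sample directory, file names, and dates are made up for illustration.

```shell
#!/bin/bash
# Hypothetical sample archive so the sketch can run anywhere.
dir=$(mktemp -d)
printf 'From: jsmith\nCc: twalker\n' > "$dir/mail1"
printf 'From: jsmith\nTo: someone\n' > "$dir/mail2"
printf 'From: twalker\nTo: jsmith\n' > "$dir/mail3"

# Assumption: modification time reflects when each mail was archived.
touch -d '2008-02-10' "$dir/mail1" "$dir/mail2"
touch -d '2008-01-01' "$dir/mail3"

# Keep files modified after 1 Feb 2008 but not after 1 Mar 2008,
# then print only those that contain both names.
both=$(find "$dir" -type f -newermt '2008-02-01' ! -newermt '2008-03-01' -print0 |
  while IFS= read -r -d '' file; do
    if grep -q "jsmith" "$file" && grep -q "twalker" "$file"; then
      echo "$file"
    fi
  done)

echo "$both"
```

Here only mail1 is printed: mail2 lacks "twalker", and mail3 has both names but falls outside the date range.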

Uncle_Theodore 03-07-2008 08:13 AM

The thing you're trying to do seems like a little invasion of privacy... Nevertheless.

Why don't you try

grep jsmith /path/* | grep twalker
?

pixellany 03-07-2008 08:18 AM

grep works line by line, i.e., it returns the LINE that contains the keyword. In your case, the two keywords can be on different lines.

One crude way to do this is to have grep return enough lines of context that the address fields are all included. Judging from a typical header, about 10 lines should be enough, so something like this should work:

grep -C5 jsmith filename | grep twalker > newfilename

You may need to include more lines of context to get the date stamp, messageID, etc.

See man grep, in particular the -A, -B, and -C flags.
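
To illustrate the line-by-line point, here is a small sketch using a made-up message in which the two names sit on different header lines. The addresses and header layout are assumptions for the demo, not taken from the real archive.

```shell
#!/bin/bash
# Hypothetical mail with the two names on different header lines.
mail=$(mktemp)
printf 'From: jsmith@example.com\nTo: other@example.com\nCc: twalker@example.com\nSubject: hi\n' > "$mail"

# Plain pipeline: the second grep only sees lines that contain "jsmith",
# so it finds nothing. "|| true" keeps the non-zero exit status harmless.
same_line=$(grep jsmith "$mail" | grep twalker || true)

# With context: the 5 lines before and after each "jsmith" line are passed
# along too, so the Cc line reaches the second grep.
with_context=$(grep -C5 jsmith "$mail" | grep twalker || true)

echo "same line:    '$same_line'"
echo "with context: '$with_context'"
```

The first result is empty and the second is the Cc line, which is why the context flags help when the addresses are spread over several header lines.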

dougp23 03-07-2008 08:23 AM

Hey Uncle,

Well, being the IT administrator of a company, I am often asked to do things that I frown upon. However, this is just one of those "I never got an email saying this" type things. So both parties have asked me to find out whether the email was ever sent.

Your grep with the pipe is close to what I want, but wouldn't you know it, the second email address is often a "cc", which puts it on a separate line in the file, so I never get any matches for both...

I am going to try stzein's idea.

Thanks!

Nathanael 03-07-2008 08:31 AM

Why are you making this so hard for yourself?
Tell the one who claims to have the email to give you the message ID that is in the email's source.

Then simply grep for that ID :-)

Edit: some mail servers save the mails with that ID as the name of the file, so a locate or find will do the job.
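
A minimal sketch of the message ID search, assuming the archived files keep the full headers. The directory, file names, and IDs below are invented for the demo.

```shell
#!/bin/bash
# Hypothetical archive with two mails and made-up message IDs.
dir=$(mktemp -d)
printf 'Message-ID: <abc123@mail.example.com>\nFrom: jsmith\n' > "$dir/mail1"
printf 'Message-ID: <def456@mail.example.com>\nFrom: twalker\n' > "$dir/mail2"

# -r searches the directory recursively, -l prints only the file names,
# -F treats the ID as a literal string rather than a regular expression.
hit=$(grep -rlF 'abc123@mail.example.com' "$dir")

echo "$hit"
```

Using -F matters because message IDs usually contain dots, which would otherwise match any character.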

akhorus 03-07-2008 08:36 AM

One option... not an expert
 
Hi there, I'm no expert with scripts but think you could try something... until some expert replies ;-)

I would create a list with, let's say, the ids of the mails where ONE name appears. Then another list with the mails where the OTHER name appears...

Then I would create an easy little program (in python, perl, Haskell...) which returns the intersection of both lists...

That, of course, if you can program at least a bit...
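
For what it's worth, the intersection can probably be done without writing a separate program. The sketch below swaps in the standard sort and comm tools for the Python/Perl/Haskell step; the sample directory and file names are made up.

```shell
#!/bin/bash
# Hypothetical sample archive.
dir=$(mktemp -d)
lists=$(mktemp -d)   # keep the lists outside the directory being searched
printf 'From: jsmith\nCc: twalker\n' > "$dir/mail1"
printf 'From: jsmith\n' > "$dir/mail2"
printf 'From: twalker\n' > "$dir/mail3"

# One sorted list of file names per person.
grep -rl jsmith "$dir" | sort > "$lists/jsmith.list"
grep -rl twalker "$dir" | sort > "$lists/twalker.list"

# comm -12 suppresses lines unique to either list, leaving the intersection.
both=$(comm -12 "$lists/jsmith.list" "$lists/twalker.list")

echo "$both"
```

comm expects both inputs to be sorted the same way, hence the sort on each list.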

Good luck!!

dougp23 03-07-2008 09:00 AM

Thanks everyone! Pixellany's solution was the quickest and easiest, and it worked! Found the 'offending' email, lol!!

Thanks again!


All times are GMT -5.