LinuxQuestions.org
10-19-2011, 11:01 AM   #1
winairmvs
Member
 
Registered: Aug 2009
Posts: 42

Rep: Reputation: 16
search for list of usernames in syslog quickly


Hello, I am trying to find matching usernames from the passwd file in a syslog for dovecot. My current script loops through the passwd file and greps the syslog file for each username, stopping at the first match before moving on to the next username. The passwd file is about 6000 lines long, so you can imagine this takes forever to complete. Is there a way to search for all the usernames in one grep statement, or a more efficient way to do this?

Here is what I have so far:

#!/bin/zsh

base="/usr/local/admin/report"
passwd="/etc/passwd"

userlist=$(cat ${passwd} | cut -d":" -f1)

IFS="
"

echo "" > ${base}/tmpgrep

for user in `echo ${userlist}`
do
grep -m 1 "${user}" /syslog/dovecot/maillog >> ${base}/tmpgrep

done
 
10-19-2011, 11:40 AM   #2
jthill
Member
 
Registered: Mar 2010
Distribution: Arch
Posts: 211

Rep: Reputation: 67
Quote:
Originally Posted by winairmvs
Hello, I am trying to find matching usernames from the passwd file in a syslog for dovecot.
Code:
$ cut -f1 -d: /etc/passwd | grep -F -f- /syslog/dovecot/maillog > $base/tmpgrep
 
10-19-2011, 11:58 AM   #3
crts
Senior Member
 
Registered: Jan 2010
Posts: 1,606

Rep: Reputation: 448
Quote:
Originally Posted by jthill
Code:
$ cut -f1 -d: /etc/passwd | grep -F -f- /syslog/dovecot/maillog > $base/tmpgrep
Hi,

Basically a good idea, but it prints every matching line when a name matches more than once. From the OP's example I take it that he wants only the first matching line for each name. A small modification:
Code:
cut -f1 -d: /etc/passwd|xargs -I{} grep -m 1 '{}' /syslog/dovecot/maillog > $base/tmpgrep
 
10-19-2011, 12:10 PM   #4
winairmvs
Member
 
Registered: Aug 2009
Posts: 42

Original Poster
Rep: Reputation: 16
Quote:
Originally Posted by jthill
Code:
$ cut -f1 -d: /etc/passwd | grep -F -f- /syslog/dovecot/maillog > $base/tmpgrep
Cool, I didn't know I could do that with grep. Unfortunately, this grep matches every line in the maillog file. I was reading the man page for grep, and the entry for the -F option reads:

Treats each specified pattern as a string instead of a regular expression. A NULL string matches every line.

I am assuming it's getting back a null string and matching everything?
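
One possible cause, offered as a guess rather than a diagnosis: a blank line in the list of names. With GNU grep, a blank line in a pattern list is an empty pattern, and an empty pattern matches every line. The sketch below uses made-up file contents (the names and log lines are hypothetical) to show the effect and one way to filter blank lines out before they reach grep.

```shell
# Hypothetical sample data: a name list with a blank line, and a two-line log
tmp=$(mktemp -d)
printf 'alice\n\nbob\n' > "$tmp/names"
printf 'login alice\nlogin carol\n' > "$tmp/log"

# The blank line acts as an empty pattern, so every log line is counted
all_count=$(grep -c -F -f "$tmp/names" "$tmp/log")

# Dropping blank lines first leaves only the real names as patterns
filtered_count=$(grep -v '^$' "$tmp/names" | grep -c -F -f - "$tmp/log")

echo "with blank line: $all_count, without: $filtered_count"
rm -rf "$tmp"
```

If the name list turns out to have no blank lines, the cause lies elsewhere, for example in how a non-GNU grep handles `-f-`.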

Last edited by winairmvs; 10-19-2011 at 12:31 PM.
 
10-19-2011, 12:25 PM   #5
winairmvs
Member
 
Registered: Aug 2009
Posts: 42

Original Poster
Rep: Reputation: 16
Quote:
Originally Posted by crts
Hi,

Basically a good idea, but it prints every matching line when a name matches more than once. From the OP's example I take it that he wants only the first matching line for each name. A small modification:
Code:
cut -f1 -d: /etc/passwd|xargs -I{} grep -m 1 '{}' /syslog/dovecot/maillog > $base/tmpgrep
crts, thanks for the updated code. Unfortunately this script runs very slowly, maybe even slower than how I was originally doing it.
 
10-19-2011, 01:40 PM   #6
jthill
Member
 
Registered: Mar 2010
Distribution: Arch
Posts: 211

Rep: Reputation: 67
Yes, I missed the -m1 part, my apologies.

Here's an awk-builder:
Code:
$ cut -f1 -d: /etc/passwd| sed 's,.*,/\\<&\\>/ \&\& !saw["&"] { saw["&"]=1; print },' > findthem
$ awk -f findthem /syslog/dovecot/maillog > $base/tmpgrep
I added word-boundary testing (\< and \>) while fixing it up. Those operators are GNU regexp extensions, so this needs gawk.

I've tested this with 'awk -f findthem /etc/passwd /etc/passwd'. It works, and it also shows a weakness: some userids are also common words. It prints the root line twice, once because the line matches root and once because it matches bin. It'd be easy enough to fix it so that a line prints only once no matter how many hits it gets, but that won't help with the false matches in the real logs.

I don't have 6000 users. I tried it with apt-cache pkgnames output against the apt logs; awk took a few seconds and a few hundred MB compiling 35410 tests, but did the job just fine.
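
To see what the sed step above generates, it can help to run it on a single name. The sketch below feeds two sample names through the same sed expression; `root` is taken from the post, while `j.doe` is a hypothetical name added to illustrate a second weakness.

```shell
# Feed one name through the same sed expression used to build findthem
rule=$(printf 'root\n' | sed 's,.*,/\\<&\\>/ \&\& !saw["&"] { saw["&"]=1; print },')
printf '%s\n' "$rule"
# prints: /\<root\>/ && !saw["root"] { saw["root"]=1; print }

# A name containing a regex metacharacter is copied into the pattern unescaped
dotted=$(printf 'j.doe\n' | sed 's,.*,/\\<&\\>/ \&\& !saw["&"] { saw["&"]=1; print },')
printf '%s\n' "$dotted"
# prints: /\<j.doe\>/ && !saw["j.doe"] { saw["j.doe"]=1; print }
```

Each username becomes one awk rule, so awk tests every log line against every rule. In the second rule the dot is a regex wildcard, so the rule would also match a word such as jxdoe. A name containing a slash would end the regex literal early and break the generated program.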
 
10-19-2011, 01:47 PM   #7
jthill
Member
 
Registered: Mar 2010
Distribution: Arch
Posts: 211

Rep: Reputation: 67
... forgot to include the print-a-line-only-once alternative, haste makes waste, I knew that, really ...
Code:
$ echo '{ printit=0 }' > findthem
$ cut -f1 -d: /etc/passwd| sed 's,.*,/\\<&\\>/ \&\& !saw["&"] { saw["&"]=1; printit=1 },' >> findthem
$ echo 'printit { print }' >> findthem
$ awk -f findthem /syslog/dovecot/maillog > $base/tmpgrep
 
10-20-2011, 03:51 PM   #8
jthill
Member
 
Registered: Mar 2010
Distribution: Arch
Posts: 211

Rep: Reputation: 67
Here's an actually reasonable solution using GNU grep's --color=always.

Here's firstfind.awk:
Code:
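# This awk postprocesses `grep --color=always` output, eliminating duplicate hits.
# It relies on the escape sequences in the split() regex below: ESC[01;31m ESC[K
# before each match and ESC[m ESC[K after it.
BEGIN{FS="\0"}	# intended to stop awk splitting lines into fields
{
	# After the split, f[1], f[3], ... are plain text and f[2], f[4], ... are matches
	n=split($0,f,/\033\[(01;31)?m\033\[K/);
	printit=0
	# Print the line only if it contains at least one match not seen before
	for (i=2; i<n; i+=2) {
		if (!seen[f[i]]) {
			printit=1;
			break;
		}
	}
	if ( printit ) {
		# Rebuild the line, re-highlighting only the matches seen for the first time
		text=f[1]
		for (i=2; i<n; i+=2) {
			if (!seen[f[i]]) {
				seen[f[i]]=1;
				f[i]="\033[01;31m"f[i]"\033[m"
			}
			text=text""f[i]""f[i+1]
		}
		print text;
	}
}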
and you feed it like so:
Code:
$ cut -f1 -d: /etc/passwd >userids
$ grep -wFf userids --color=always your-logfile-here | awk -f firstfind.awk
This handles scanning /var/log/apt/* for the first hits on apt-cache pkgnames (35410 names) very nicely.
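
For comparison, here is a different technique, not the one in the post above: a single awk pass that loads the names into an array and looks up each word of each log line in it, with no grep colour codes involved. It is only a sketch under stated assumptions. The file contents are made up, the log format is hypothetical, and it assumes usernames consist only of letters, digits, underscore, dot and hyphen.

```shell
# Hypothetical sample data standing in for the userids file and the mail log
tmp=$(mktemp -d)
printf 'alice\nbob\ncarol\n' > "$tmp/userids"
cat > "$tmp/maillog" <<'EOF'
imap-login: Login: user=<alice>, method=PLAIN
imap-login: Login: user=<bob>, method=PLAIN
imap-login: Login: user=<alice>, method=PLAIN
pop3-login: Disconnected: user=<dave>
EOF

# First file: remember every name. Second file: split each line on characters
# that cannot be part of a name, and print the line if any field is a
# remembered name that has not been seen on an earlier line.
first_hits=$(awk -F'[^A-Za-z0-9_.-]+' '
    NR == FNR { want[$0] = 1; next }
    {
        hit = 0
        for (i = 1; i <= NF; i++)
            if (($i in want) && !seen[$i]++)
                hit = 1
        if (hit) print
    }' "$tmp/userids" "$tmp/maillog")

printf '%s\n' "$first_hits"
rm -rf "$tmp"
```

With this sample data it prints the first alice line and the bob line. Because the lookup is an exact comparison on whole fields, it shares the weakness noted earlier in the thread: a userid that is also a common word still matches wherever that word appears on its own.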
 
All times are GMT -5.
