search for list of usernames in syslog quickly
Hello, I am trying to find which usernames from the passwd file appear in a dovecot syslog. My current script loops through the passwd file and greps the syslog for each username, stops at the first match, then moves on to the next username. The passwd file is about 6000 lines long, so you can imagine this takes forever to complete. Is there a way to search for all the usernames in one grep statement, or a more efficient way to do this?
Here is what I have so far:
Code:
#!/bin/zsh
base="/usr/local/admin/report"
passwd="/etc/passwd"
userlist=$(cat ${passwd} | cut -d":" -f1)
IFS="
"
echo "" > ${base}/tmpgrep
for user in `echo ${userlist}`
do
    grep -m 1 "${user}" /syslog/dovecot/maillog >> ${base}/tmpgrep
done
Code:
$ cut -f1 -d: /etc/passwd | grep -F -f- /syslog/dovecot/maillog > $base/tmpgrep
Basically a good idea, but it returns every matching line when a pattern matches more than once. From the OP's example I take it that he wants each name printed only once. A small modification:
Code:
cut -f1 -d: /etc/passwd|xargs -I{} grep -m 1 '{}' /syslog/dovecot/maillog > $base/tmpgrep
Quote:
"Treats each specified pattern as a string instead of a regular expression. A NULL string matches every line."
I am assuming it's getting back a null string and matching everything?
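The behaviour quoted above is easy to reproduce: with -F, an empty pattern matches every line, so a single blank entry in the pattern list makes grep print the whole log. A minimal sketch, assuming that is what happened here; the helper name match_users and the extra -w (whole-word) flag are illustrative additions, not something from the posts above:

```shell
#!/bin/sh
# An empty fixed-string pattern matches every line:
printf 'alpha\nbeta\n' | grep -F -e ''    # prints both lines

# Hypothetical helper: drop empty names before handing the list to grep.
#   $1 = passwd-format file, $2 = log file
match_users() {
    cut -f1 -d: "$1" |
        grep -v '^$' |           # remove blank entries so no NULL pattern reaches grep
        grep -F -w -f - "$2"     # -f - reads the pattern list from stdin (GNU grep)
}
```

Called as `match_users /etc/passwd /syslog/dovecot/maillog`, this still prints every matching line, not just the first one per user.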
Yes, I missed the -m1 part, my apologies.
Here's an awk-builder:
Code:
$ cut -f1 -d: /etc/passwd| sed 's,.*,/\\<&\\>/ \&\& !saw["&"] { saw["&"]=1; print },' > findthem
I've tested this with 'awk -f findthem /etc/passwd /etc/passwd' and it works, and it also shows a weakness: some userids are also common words. It prints the root line twice because it matches root and also matches bin. It'd be easy enough to fix it so it prints a line only once no matter how many hits you get, but that won't help with the false matches in the real logs. I don't have 6000 users. I tried it with apt-cache pkgnames output against the apt logs; awk took a few seconds and a few hundred meg compiling 35410 tests but did the job just fine.
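One way around the common-word false matches is to stop testing every username against the whole line and instead compare only the field that holds the login name. A minimal sketch, assuming the dovecot log records logins as user=<name> (check your own log format before relying on this); first_hits is a hypothetical name, and the hash lookup replaces the thousands of generated regex tests:

```shell
#!/bin/sh
# Hypothetical helper: print the first log line seen for each known user.
#   $1 = passwd-format file, $2 = log file
# Assumption: login lines contain the name as user=<name>.
first_hits() {
    awk -F: '
        # First file: remember every non-empty username.
        NR == FNR { if ($1 != "") known[$1] = 1; next }
        # Second file: extract the name from user=<name> and look it up.
        match($0, /user=<[^>]*>/) {
            name = substr($0, RSTART + 6, RLENGTH - 7)   # strip "user=<" and ">"
            if ((name in known) && !(name in seen)) {
                seen[name] = 1
                print
            }
        }
    ' "$1" "$2"
}
```

Called as `first_hits /etc/passwd /syslog/dovecot/maillog > $base/tmpgrep`, it reads the log once and prints at most one line per user.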
... forgot to include the print-a-line-only-once alternative, haste makes waste, I knew that, really ...
Code:
$ echo '{ printit=0 }' > findthem
Here's an actually reasonable solution using GNU grep's --color=always.
Here's firstfind.awk:
Code:
# This awk postprocesses `grep --color=always` output, eliminating duplicate hits
Code:
$ cut -f1 -d: /etc/passwd >userids