I cannot figure out why "grep -vf" is not returning the full list of items that are in file2 but not in file1 in this case. It returns only 2 of the 4 items in file2. I checked and there are no duplicates, but some items partially match other items, e.g. (a made-up example) MINI and MINIMUM. I have several scripts that use "grep -vf" and would like to understand why it is not working here. I'd really appreciate it if someone could explain why.
$ awk 'FNR==NR {hash[$0]; next} !($0 in hash)' file1.txt file2.txt
KNIGHTKNM1LSET
PDQATSLSET1
PDQIOI
CITADELCDG1LSET
$
$
As you can see, with "grep -vf" two items (PDQATSLSET1 and CITADELCDG1LSET) are missing from the output. Why?
$ grep -vf file1.txt file2.txt
KNIGHTKNM1LSET
PDQIOI
$
After further testing, it looks like when I create a new file from the first 40 lines of file1.txt and run the same command against it (grep -vf file40.txt file2.txt), it returns all 4 items:
$ head -40 file1.txt > file40.txt
$ grep -vf file40.txt file2.txt |egrep "PDQATSLSET1|KNIGHTKNM1LSET|CITADELCDG1LSET|PDQIOI"
KNIGHTKNM1LSET
PDQATSLSET1
PDQIOI
CITADELCDG1LSET
$
Again, thanks to anybody trying to help me with this.
Using grep with the -f option means that it reads its patterns from the file. So if there are any periods, asterisks, or other regex metacharacters in the first file, they would throw off the results.
As noted above, -f is the short form of --file, and unless specified otherwise each pattern in that file is treated as a regex. The --fixed-strings option (short form -F) makes grep treat each pattern as a literal string instead.
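To illustrate the regex-versus-literal point, here is a minimal sketch. The file names and contents (patterns.txt, data.txt, A.C, ABC, XYZ) are made up for the demonstration and are not taken from the original poster's files.

```shell
# Hypothetical sample files, created in a throwaway directory.
tmp=$(mktemp -d)
printf '%s\n' 'A.C' > "$tmp/patterns.txt"
printf '%s\n' 'ABC' 'A.C' 'XYZ' > "$tmp/data.txt"

# Default: A.C is a regex, so the period matches any character.
# Both ABC and A.C are treated as matches and dropped by -v.
regex_result=$(grep -vf "$tmp/patterns.txt" "$tmp/data.txt")

# With -F the pattern is the literal string A.C, so only that line is dropped.
fixed_result=$(grep -vFf "$tmp/patterns.txt" "$tmp/data.txt")

printf 'regex patterns:\n%s\n\n' "$regex_result"
printf 'fixed strings:\n%s\n' "$fixed_result"
rm -rf "$tmp"
```

Note that -F on its own still matches a pattern anywhere in a line, so it does not address partial matches such as MINI inside MINIMUM.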
The mention of partial matches also suggests that either --word-regexp (-w) or --line-regexp (-x) would be a useful option. (Despite the poorly chosen names, these apply irrespective of whether the pattern is a regex or not.)
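A small sketch of how the partial-match problem shows up and how -w and -x change the outcome. It reuses the made-up MINI/MINIMUM items from the question; ALPHA, BETA, and MINI-2 are additional hypothetical entries, and the real file1.txt and file2.txt may well behave differently.

```shell
# Hypothetical sample files, created in a throwaway directory.
tmp=$(mktemp -d)
printf '%s\n' 'MINI' 'ALPHA' > "$tmp/file1.txt"
printf '%s\n' 'MINI' 'MINIMUM' 'MINI-2' 'BETA' > "$tmp/file2.txt"

# Default: a pattern may match anywhere in a line, so MINI also
# removes MINIMUM and MINI-2 from the output.
substring_result=$(grep -vFf "$tmp/file1.txt" "$tmp/file2.txt")

# -w: the match must be a whole word. MINIMUM survives, but MINI-2 is
# still removed because "-" is not a word character.
word_result=$(grep -vwFf "$tmp/file1.txt" "$tmp/file2.txt")

# -x: the match must cover the whole line, so only the exact line MINI is removed.
line_result=$(grep -vxFf "$tmp/file1.txt" "$tmp/file2.txt")

printf 'substring:\n%s\n\n' "$substring_result"
printf 'whole word (-w):\n%s\n\n' "$word_result"
printf 'whole line (-x):\n%s\n' "$line_result"
rm -rf "$tmp"
```

If every line in both files is a single item, as the posted output suggests, -x combined with -F is probably the closest equivalent to the awk command in the question, which compares whole lines.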
Obviously, what you posted is not enough to answer the question. It would be better to give a complete example rather than a few lines from here and there. You don't need to post everything; a minimal example containing file1 and file2, your commands, and your expected output would be helpful.
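One possible way to narrow the problem down to such a minimal example is to test each pattern in file1 against one of the missing items and print the ones that match. This is only a sketch: the file1.txt contents below (ALPHA, PDQATS, BETA) are invented, since the real file was not posted.

```shell
# Hypothetical stand-in for the real file1.txt.
tmp=$(mktemp -d)
printf '%s\n' 'ALPHA' 'PDQATS' 'BETA' > "$tmp/file1.txt"

# One of the items that "grep -vf" unexpectedly left out.
missing='PDQATSLSET1'

# Check the patterns one at a time and report "line-number:pattern"
# for every pattern that matches the missing item.
culprits=$(
  n=0
  while IFS= read -r pattern; do
    n=$((n + 1))
    if printf '%s\n' "$missing" | grep -q -e "$pattern"; then
      printf '%s:%s\n' "$n" "$pattern"
    fi
  done < "$tmp/file1.txt"
)

printf '%s\n' "$culprits"
rm -rf "$tmp"
```

Any pattern reported this way, together with the missing item, would make a two-line example that reproduces the behaviour. Since the first 40 lines of file1.txt gave the expected result, the pattern responsible is presumably somewhere after line 40.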