LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!

Old 10-15-2021, 05:33 PM   #1
amateurscripter
Member
 
Registered: Nov 2011
Posts: 41

Rep: Reputation: Disabled
grep -vf difference vs awk hash


I cannot figure out why, in this case, "grep -vf" is not returning the full list of items that are in file2 but not in file1. It returns only 2 of the 4 items. I checked and there are no duplicates, but some items are partial matches of other items, e.g. (a made-up pair) MINI and MINIMUM. I have several scripts that use "grep -vf" and would like to understand why it is not working here. I'd really appreciate it if someone could explain why.

As you can see, file2.txt has 4 more items (lines) than file1.txt:

$ wc -l file1.txt file2.txt
91 file1.txt
95 file2.txt
186 total
$

$ awk 'FNR==NR {hash[$0]; next} !($0 in hash)' file1.txt file2.txt
KNIGHTKNM1LSET
PDQATSLSET1
PDQIOI
CITADELCDG1LSET
$
$
As you can see, with "grep -vf" two items (PDQATSLSET1 and CITADELCDG1LSET) are missing from the output. Why?
$ grep -vf file1.txt file2.txt
KNIGHTKNM1LSET
PDQIOI
$
After further testing, it looks like when I create a new file from the first 40 items (lines) of file1.txt and run the same command against it (grep -vf file40.txt file2.txt), it returns all 4 items:

$ head -40 file1 > file40.txt
$ grep -vf file40.txt file2.txt |egrep "PDQATSLSET1|KNIGHTKNM1LSET|CITADELCDG1LSET|PDQIOI"
KNIGHTKNM1LSET
PDQATSLSET1
PDQIOI
CITADELCDG1LSET
$
Again, thanks to anybody trying to help me with this.
 
Old 10-15-2021, 06:00 PM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,128

Rep: Reputation: 4120
Reverse the order of the files in the grep run.
 
Old 10-15-2021, 09:05 PM   #3
amateurscripter
Member
 
Registered: Nov 2011
Posts: 41

Original Poster
Rep: Reputation: Disabled
Thanks, syg00, but I tried that before. It doesn't return anything.

Last edited by amateurscripter; 10-15-2021 at 09:53 PM.
 
Old 10-15-2021, 10:29 PM   #4
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,309
Blog Entries: 3

Rep: Reputation: 3721
Try diff also.

Code:
diff file1.txt file2.txt

diff <(sort file1.txt) <(sort file2.txt)
The basic form alone ought to do it, but be sure to check all the options for it as they can make it display a lot of useful material.
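
To pull out only the lines that diff reports as extra in the second file, the output can be filtered on its "> " prefix. This is just a rough sketch with made-up sample data (the file contents and the variable name only_in_file2 are assumptions, not the original poster's data). It sorts into temporary files instead of using <(...) so that it also runs in plain sh:

```shell
# Hypothetical sample data standing in for file1.txt and file2.txt.
tmp=$(mktemp -d)
printf '%s\n' 'BETA' 'ALPHA' > "$tmp/file1.txt"
printf '%s\n' 'GAMMA' 'ALPHA' 'BETA' 'DELTA' > "$tmp/file2.txt"

# diff compares line by line, so sort both inputs first.
sort "$tmp/file1.txt" > "$tmp/sorted1.txt"
sort "$tmp/file2.txt" > "$tmp/sorted2.txt"

# Lines present only in the second input are prefixed with "> ";
# keep those lines and strip the prefix.
only_in_file2=$(diff "$tmp/sorted1.txt" "$tmp/sorted2.txt" | sed -n 's/^> //p')

printf '%s\n' "$only_in_file2"
rm -rf "$tmp"
```

Because diff compares whole lines, this is not affected by one item being a substring of another.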

Last edited by Turbocapitalist; 10-15-2021 at 10:31 PM.
 
Old 10-15-2021, 10:51 PM   #5
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,309
Blog Entries: 3

Rep: Reputation: 3721
Using grep with the -f option means that it fetches patterns from the file. So if there are any periods, asterisks, or other relevant metacharacters in the first file, then that would throw off the results.
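
A small illustration of the metacharacter effect, using made-up data (patterns.txt, data.txt and the variable names are assumptions for the sketch only). A period in a pattern line matches any character, so a pattern can match more lines than intended unless -F is given:

```shell
# Hypothetical sample data, not the original poster's files.
tmp=$(mktemp -d)
printf '%s\n' 'A.C' > "$tmp/patterns.txt"
printf '%s\n' 'ABC' 'A.C' 'XYZ' > "$tmp/data.txt"

# Default: each pattern line is a basic regex, so "A.C" also matches "ABC".
as_regex=$(grep -vf "$tmp/patterns.txt" "$tmp/data.txt")

# With -F the period is literal, so only lines containing "A.C" are matched.
as_fixed=$(grep -vFf "$tmp/patterns.txt" "$tmp/data.txt")

printf 'regex:\n%s\n' "$as_regex"
printf 'fixed:\n%s\n' "$as_fixed"
rm -rf "$tmp"
```

With this data the regex run leaves only XYZ, while the -F run leaves ABC and XYZ.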
 
Old 10-16-2021, 10:29 AM   #6
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,599

Rep: Reputation: 2546

As noted above, -f is the shortcut for --file, and unless specified otherwise each pattern in that file will be treated as a regex. The --fixed-strings option (shortcut -F) can be used to treat the patterns as literal strings instead.

The mention of partial matches also suggests that either --word-regexp or --line-regexp would be a useful option. (Despite the poorly chosen names, these apply irrespective of whether the pattern is a regex or not.)
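
A minimal sketch of the partial-match case, built on the made-up MINI/MINIMUM pair from the first post (the rest of the sample data and the variable names are assumptions). It can't be confirmed without the real files, but this behaviour would be consistent with the missing items:

```shell
# Hypothetical sample data; only MINI and MINIMUM come from the thread.
tmp=$(mktemp -d)
printf '%s\n' 'MINI' 'ALPHA' > "$tmp/file1.txt"
printf '%s\n' 'MINI' 'ALPHA' 'MINIMUM' 'BETA' > "$tmp/file2.txt"

# Substring matching: "MINI" matches inside "MINIMUM", so that line is dropped.
partial=$(grep -vf "$tmp/file1.txt" "$tmp/file2.txt")

# -x (--line-regexp) requires the whole line to match;
# -F (--fixed-strings) takes each pattern literally.
whole=$(grep -vxFf "$tmp/file1.txt" "$tmp/file2.txt")

printf 'grep -vf:\n%s\n' "$partial"
printf 'grep -vxFf:\n%s\n' "$whole"
rm -rf "$tmp"
```

Here "grep -vf" returns only BETA, while "grep -vxFf" returns MINIMUM and BETA, which is what the awk hash approach (an exact whole-line lookup) would give.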


Last edited by boughtonp; 10-16-2021 at 10:32 AM.
 
Old 10-16-2021, 12:33 PM   #7
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,849

Rep: Reputation: 7309
Obviously, what you posted is not enough to answer the question. It would be better to give a complete example rather than a few lines from here and there. I don't want you to post everything; a minimal example containing file1 and file2, your commands, and your expectations would be helpful.
 
  



Tags
awk, grep



All times are GMT -5.
