Linux - Newbie This Linux forum is for members that are new to Linux.
Old 05-21-2012, 10:30 AM   #1
LQ Newbie
Registered: May 2012
Posts: 11

Rep: Reputation: Disabled
Diff 2 files (not line by line)

Hi guys,

I have 2 text files with very similar contents, but in jumbled order. Does anyone know a good way to compare the lines of one to (all) the lines in the other?

File1 contains:

File2 contains:

Because diff compares these 2 files line by line it sees them as different - this is true, but I only want to see words that are missing from the document altogether.

Thanks in advance!
Old 05-21-2012, 10:41 AM   #2
LQ Guru
Registered: May 2005
Location: boston, usa
Distribution: fedora-35
Posts: 5,309

Rep: Reputation: 918
Can you sort them and then compare them?

Edit: otherwise grep -f might work? Or something like:
# print every line of file-2.lst that contains a word from file-1.lst
for word in $(cat file-1.lst); do
    grep -- "$word" file-2.lst
done
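The "sort them, then compare them" idea could look roughly like the sketch below. It assumes one entry per line, and the function name missing_from is made up for illustration. comm needs sorted input, which is why both files are sorted into temporary copies first.

```shell
#!/bin/sh
# Hypothetical helper: print the lines that are in the first file
# but nowhere in the second. Line order in the inputs does not matter.
missing_from() {
    a_sorted=$(mktemp) || return 1
    b_sorted=$(mktemp) || return 1
    # Sort both files the same way (C locale) and drop duplicate lines.
    LC_ALL=C sort -u -- "$1" > "$a_sorted"
    LC_ALL=C sort -u -- "$2" > "$b_sorted"
    # comm -23 hides column 2 (only in the second file) and
    # column 3 (in both), leaving lines unique to the first file.
    LC_ALL=C comm -23 -- "$a_sorted" "$b_sorted"
    rm -f -- "$a_sorted" "$b_sorted"
}
```

Calling missing_from file-1.lst file-2.lst and then swapping the arguments would give the entries missing in each direction.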

Last edited by schneidz; 05-21-2012 at 10:43 AM.
Old 05-21-2012, 08:37 PM   #3
LQ Newbie
Registered: May 2012
Posts: 11

Original Poster
Rep: Reputation: Disabled
Thanks schneidz!

That finally got my brain off the diff command.
Looks pretty simple and sweet, I'll give it a shot.
Old 05-22-2012, 12:13 PM   #4
David the H.
Bash Guru
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Note that it's not generally a good idea to read lines of input from a file or command with a for loop. You should generally use a while+read loop instead.

In this particular case, though, the word splitting applied to the expansion breaks the file into individual words, which is the desired behavior. If the file were very large, however, it could use up a lot of memory, as the whole list gets expanded by the shell before the for loop is run.
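For reference, a while+read version might look something like this. It is only a sketch: it assumes one entry per line (so it reads lines, not words), the function name report_missing is made up, and the choice of -F and -x for exact whole-line matching is mine, not something from the loop posted earlier.

```shell
#!/bin/sh
# Hypothetical helper: print each line of the first file that does not
# appear as a whole line in the second file.
report_missing() {
    # IFS= and -r keep leading whitespace and backslashes intact.
    # The "|| [ -n ... ]" part also handles a last line with no newline.
    while IFS= read -r word || [ -n "$word" ]; do
        # Skip empty lines so they are never used as a search string.
        [ -n "$word" ] || continue
        # -F: fixed string, -x: whole line must match, -q: exit status only
        if ! grep -Fxq -- "$word" "$2"; then
            printf '%s\n' "$word"
        fi
    done < "$1"
}
```

The file is read one line at a time here, so nothing has to be expanded into a single long word list up front.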

As mentioned, grep can also be used to test one file against another (on a per-line basis). This command prints every line in file2 that does not exist in file1:

grep -v -f file1.txt file2.txt
Just run the command again with the files reversed to get all the unique lines in file1.
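One caveat worth knowing: with plain -f, grep treats every line of file1 as a regular expression and matches it anywhere in a line. If exact whole-line comparison is what you want, the standard -F (fixed strings) and -x (whole line) options can be added. A small sketch, with a made-up function name:

```shell
#!/bin/sh
# Hypothetical helper: print the lines of the second file that do not
# appear, as complete lines, anywhere in the first file.
only_in_second() {
    # -F: treat patterns as fixed strings, not regular expressions
    # -x: a pattern must match the whole line
    # -v: invert the match, -f: read the patterns from a file
    # Note: grep exits with status 1 when it prints nothing.
    grep -F -x -v -f "$1" -- "$2"
}
```

For example, if file1 contains "cat" and file2 contains "catalog", the version without -F -x hides "catalog" because it contains "cat", while this version still prints it.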

Last edited by David the H.; 05-22-2012 at 12:14 PM.
Old 05-22-2012, 07:19 PM   #5
LQ Newbie
Registered: May 2012
Posts: 11

Original Poster
Rep: Reputation: Disabled
Thanks David,

I ended up using that exact grep command yesterday, but the files I was using seemed to be too large for it to handle: an strace on the process showed no activity at all. When I split the files into smaller pieces it worked well and gave me exactly the type of output I'm after. That seems to be the best option for now, as the very large files were just a one-off and all the rest are smaller.

Thanks again.

