Old 07-03-2012, 10:42 AM   #1
udiubu
Member
 
Registered: Oct 2011
Posts: 54

Rep: Reputation: Disabled
unmatching strings between two files


Dear all,

I have two files like the ones below:

file_1.txt

alpha 3 5 eu rt
beta 4 5 ew sd
gamma 4 56 er df
delta 23 13 rt rt

file_2.txt

alpha 3 5 eu rt
pluto 2 1 rf gf
gamma 4 56 er df
mouse 23 13 rt rt

I would like to compare them and get a third file showing only those lines having strings appearing in file_2.txt column 1, but not in file_1.txt column 1:

file_output.txt

pluto 2 1 rf gf
mouse 23 13 rt rt


I'm using something like:
Code:
for i in `cat file_2.txt`; 
do 
echo $i|grep -v -f file_1.txt; 
done > file_output.txt
This seems to work. The only problem is that file_output is not showing lines, but rather a long one-column vector:

pluto
2
1
rf
gf
mouse
23
13
rt
rt

Any reason for that?

Any help/suggestion is highly appreciated!

Best,

Udiubu


P.S. I'm using a Mac Terminal right now
 
Old 07-03-2012, 12:06 PM   #2
Kustom42
Senior Member
 
Registered: Mar 2012
Distribution: Red Hat
Posts: 1,588

Rep: Reputation: 412
Are you looking for it to display line numbers? If so, that is a function of the program used to view/edit the file and not an issue with your command.

Your command is doing exactly what it was told to do: a for loop runs its body once for each argument it is given, and the output of `cat file_2.txt` is split into arguments on whitespace, not on line breaks. So in this case echo is run once per word rather than once per line, and every word that gets past grep is written to your file on its own line. If you are looking to reformat the output or add line numbers, you need to look at additional utilities like awk, or modify your for loop.

Let us know if you have something more specific you want assistance with but from what you've posted I don't see any problems.
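
To illustrate the point about the loop, here is a minimal sketch that counts how often each kind of loop runs. It assumes the sample file_2.txt from the first post, recreated in a scratch directory (the mktemp directory and the by_line.txt name are only there for the demo):

```shell
# Work in a scratch directory so nothing real is overwritten (demo assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample file_2.txt from the first post.
printf '%s\n' 'alpha 3 5 eu rt' 'pluto 2 1 rf gf' 'gamma 4 56 er df' 'mouse 23 13 rt rt' > file_2.txt

# for + cat: the shell splits on whitespace, so the body runs once per word.
words=0
for i in $(cat file_2.txt); do
    words=$((words + 1))
done

# while + read: the body runs once per line, and "$line" keeps all its fields.
lines=0
while IFS= read -r line; do
    lines=$((lines + 1))
    printf '%s\n' "$line"
done < file_2.txt > by_line.txt

echo "for loop iterations:   $words"
echo "while loop iterations: $lines"
```

With the four sample lines of five fields each, the for loop runs 20 times and the while loop 4 times, and by_line.txt is an unchanged copy of file_2.txt.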
 
Old 07-03-2012, 12:14 PM   #3
Alchemikos
Member
 
Registered: Jun 2012
Location: Porto Alegre-Brazil
Distribution: Slackware- 14, Debian Wheezy, Ubuntu Studio, Tails
Posts: 88

Rep: Reputation: 6
Hello

1- Sort the files:
File 1:
Code:
# sort file_1.txt > f1.txt
File 2:
Code:
# sort file_2.txt > f2.txt
2- Compare them, suppressing columns 1 and 3 of the output (that is what -13 does):
Code:
# comm -13  f1.txt f2.txt > fileout.txt
3- Verify:
Code:
# nano fileout.txt
Or put everything on one line, then view the result with nano or your favorite editor:

Code:
# sort file_1.txt > fa.txt ; sort file_2.txt > fb.txt ; comm -13  fa.txt fb.txt > out.txt ; nano out.txt
I hope that helps. Here I got:
mouse 23 13 rt rt
pluto 2 1 rf gf
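
For comparison, a sketch of an alternative that needs no sorting and keeps the lines in their file_2.txt order. It is not the comm approach above; it reuses the grep idea from the first post, but lets grep read file_2.txt directly instead of feeding it one word at a time. It assumes the sample files from the first post, recreated in a scratch directory:

```shell
# Work in a scratch directory so nothing real is overwritten (demo assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two sample files from the first post.
printf '%s\n' 'alpha 3 5 eu rt' 'beta 4 5 ew sd' 'gamma 4 56 er df' 'delta 23 13 rt rt' > file_1.txt
printf '%s\n' 'alpha 3 5 eu rt' 'pluto 2 1 rf gf' 'gamma 4 56 er df' 'mouse 23 13 rt rt' > file_2.txt

# -f file_1.txt  take the patterns from file_1.txt, one per line
# -F             treat them as fixed strings, not regular expressions
# -x             a pattern must match the whole line
# -v             print only the lines that match none of the patterns
grep -v -x -F -f file_1.txt file_2.txt > file_output.txt

cat file_output.txt
```

Like comm, this compares whole lines, so a line counts as "already in file_1.txt" only if every column is identical.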



Cheers

Alchemikos

Last edited by Alchemikos; 07-03-2012 at 12:33 PM.
 
Old 07-03-2012, 12:50 PM   #4
udiubu
Member
 
Registered: Oct 2011
Posts: 54

Original Poster
Rep: Reputation: Disabled
Hi Alchemikos,

Thanks for your reply.

I do not understand why you sort the files and, most importantly, why you say "(specify the output) in your case 1 and 3".
What do you mean by 1 and 3 exactly? If you mean "exclude 1 and 3", this is not good.
Note that the files to be compared are very long and of different length, so I cannot specify lines to be excluded.
I just want the script to look for all those strings in file_2 that are not present in file_1.

I thank you anyways!

Best,

Udiubu
 
Old 07-03-2012, 01:14 PM   #5
montel
Member
 
Registered: Jun 2012
Location: Canada
Distribution: Ubuntu/Debian/CentOS
Posts: 45

Rep: Reputation: 18
I have done something similar, but comparing file numbers and grabbing ones that were not matching in a second file and putting them into a new file. This might not be the best way to do things, but it worked in my situation.

The whole script I used is here

files:
firstFile
date file number name description
06-06-2012 0224115548 John Doe He is one stand up guy
06-07-2012 0224125743 Jane Doe A people person
06-08-2012 0224196541 Bob Awesome His last name is Awesome!

secFile
date file number name description
06-06-2012 0224115548 John Doe He is one stand up guy
06-07-2012 0224125743 Jane Doe A people person

newFile
date file number name description
06-08-2012 0224196541 Bob Awesome His last name is Awesome!

Code:
# Assumption: $firstFile and $secFile already hold the paths of the two files.
# Initialise the array and the counters before they are used.
myarray=()
i=0
count=0

# The first while loop goes through each line in the file "$secFile".
while IFS= read -r line ; do
        # In my files, I am looking for any word that contains 2241 with anything after it (till the end of the word).
        output=$(echo "$line" | grep -o "2241\w*")

        # If the string is not empty, store it in the array and add one to the index.
        if [[ -n "$output" ]]
        then
                myarray[$i]=$output
                # i keeps track of the number of $output values stored in the array.
                i=$((i + 1))
        fi
        # count keeps track of the number of lines read.
        count=$((count + 1))
# Give the while loop the secFile.
done < "$secFile"

# The second while loop goes through the lines of "$firstFile" and checks them against the newly created array.
while IFS= read -r line ; do
        # match records whether the line matches an array entry. If it does, match is set to 1 and nothing is done with that line.
        # If it does not match, the line is echoed into a new file.
        match=0
        compare=$(echo "$line" | grep -o "02241\w*")

        # If $compare is not empty, continue into the if statement.
        if [ -n "$compare" ]
        then
                # Loop through myarray; ${#myarray[@]} is the number of entries in the array.
                for (( x=0 ; x < ${#myarray[@]} ; x++ )) ; do
                        # If the entry in myarray matches $compare, with or without a 0 in front of it, set match to 1.
                        if [ "${myarray[$x]}" == "$compare" ] || [ "0${myarray[$x]}" == "$compare" ]
                        then
                                match=1
                        fi
                done
        fi

        if [ "$match" = 0 ]
        then
                # You can put the new file wherever you like; as an example, the script writes to newfile.txt in the directory missedLines.
                echo "$line" >> "/missedLines/newfile.txt"
        fi
done < "$firstFile"
I can go into more detail if it is something that you think will work in your situation. Also, I am extremely tired this morning, so if this doesn't make sense, I'm sorry.
 
Old 07-03-2012, 01:28 PM   #6
Alchemikos
Member
 
Registered: Jun 2012
Location: Porto Alegre-Brazil
Distribution: Slackware- 14, Debian Wheezy, Ubuntu Studio, Tails
Posts: 88

Rep: Reputation: 6
Hello Udiubu

The command 'comm' produces three-column output. Column one contains lines unique to FILE1,
column two contains lines unique to FILE2, and column three contains lines common to both files.


-1    suppress column 1 (lines unique to FILE1)
-2    suppress column 2 (lines unique to FILE2)
-3    suppress column 3 (lines that appear in both files)


comm -13 suppresses columns 1 and 3, so what is left is column 2: the lines unique to FILE2.

I sorted the files because comm needs its input to be sorted; otherwise it doesn't work correctly.

Did you run the commands?
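
One caveat: comm compares whole lines, while the original question asked about the strings in column 1 only. The two give the same result for the sample files, but not when a name appears in both files with different values in the other columns. Here is a sketch of a column-1 comparison with awk, assuming whitespace-separated fields. The sample files are recreated in a scratch directory, and the extra "beta 9 9 xx yy" line is a hypothetical addition of mine to show the difference:

```shell
# Work in a scratch directory so nothing real is overwritten (demo assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# file_1.txt as in the first post.
printf '%s\n' 'alpha 3 5 eu rt' 'beta 4 5 ew sd' 'gamma 4 56 er df' 'delta 23 13 rt rt' > file_1.txt
# file_2.txt as in the first post, plus one hypothetical line ("beta 9 9 xx yy")
# whose first column is in file_1.txt but whose other columns differ.
printf '%s\n' 'alpha 3 5 eu rt' 'pluto 2 1 rf gf' 'gamma 4 56 er df' 'mouse 23 13 rt rt' 'beta 9 9 xx yy' > file_2.txt

# While reading file_1.txt (NR == FNR), remember every column-1 value.
# While reading file_2.txt, print only lines whose column 1 was never seen.
awk 'NR == FNR { seen[$1]; next } !($1 in seen)' file_1.txt file_2.txt > file_output.txt

cat file_output.txt
```

The "beta 9 9 xx yy" line is left out because beta appears in column 1 of file_1.txt; comm -13 would have reported it as unique to file_2.txt.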
 
Old 07-03-2012, 01:28 PM   #7
udiubu
Member
 
Registered: Oct 2011
Posts: 54

Original Poster
Rep: Reputation: Disabled
The issue has been solved by using Alchemikos' suggestion.
I just forgot a step.

Thanks to everybody.
 
Old 07-03-2012, 02:03 PM   #8
Alchemikos
Member
 
Registered: Jun 2012
Location: Porto Alegre-Brazil
Distribution: Slackware- 14, Debian Wheezy, Ubuntu Studio, Tails
Posts: 88

Rep: Reputation: 6
Nice

Hey Udiubu click the (yes) button at the bottom I'm crazy for the light-green squares.. Hehee


Alchemikos
 