file_1.txt
alpha 3 5 eu rt
beta 4 5 ew sd
gamma 4 56 er df
delta 23 13 rt rt
file_2.txt
alpha 3 5 eu rt
pluto 2 1 rf gf
gamma 4 56 er df
mouse 23 13 rt rt
I would like to compare them and get a third file showing only those lines having strings appearing in file_2.txt column 1, but not in file_1.txt column 1:
file_output.txt
pluto 2 1 rf gf
mouse 23 13 rt rt
I am using something like:
Code:
for i in `cat file_2.txt`;
do
echo $i|grep -v -f file_1.txt;
done > file_output.txt
This seems to work. The only problem is that file_output is not showing lines, but rather a long one-column vector:
Are you looking for it to display line numbers? If so, that is a feature of the program used to view/edit the file and not an issue with your command.
Your command is doing exactly what it should be doing: a for loop runs its body once for each argument it is given, and `cat file_2.txt` expands to one argument per whitespace-separated word, not one per line. So in this case it runs the echo command once per word, and each word that survives the grep lands on its own line of the output file. If you are looking to reformat the output or add line numbers, you need to look at additional utilities like awk, or modify your for loop.
Let us know if you have something more specific you want assistance with, but from what you've posted I don't see any problems.
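To make the point about the for loop concrete, here is a minimal sketch, assuming the sample file_2.txt from the question and a scratch directory created with mktemp. The counters words and lines are names made up for this illustration.

```shell
#!/bin/bash
# Work in a scratch directory so no existing file is overwritten.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Recreate the sample file_2.txt from the question (4 lines, 5 fields each).
printf '%s\n' 'alpha 3 5 eu rt' 'pluto 2 1 rf gf' \
              'gamma 4 56 er df' 'mouse 23 13 rt rt' > file_2.txt

# for + cat: the shell splits the file on whitespace,
# so the body runs once per word.
words=0
for i in $(cat file_2.txt); do
    words=$((words + 1))
done

# while + read: the body runs once per line,
# and "$line" holds the whole line.
lines=0
while IFS= read -r line; do
    lines=$((lines + 1))
done < file_2.txt

echo "for loop iterations:   $words"
echo "while loop iterations: $lines"
```

With the sample data the for loop iterates over every field, while the while loop iterates over whole lines, which is why the original output came out as a single column.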
I have done something similar, but comparing file numbers and grabbing ones that were not matching in a second file and putting them into a new file. This might not be the best way to do things, but it worked in my situation.
files:
firstFile
date file number name description
06-06-2012 0224115548 John Doe He is one stand up guy
06-07-2012 0224125743 Jane Doe A people person
06-08-2012 0224196541 Bob Awesome His last name is Awesome!
secFile
date file number name description
06-06-2012 0224115548 John Doe He is one stand up guy
06-07-2012 0224125743 Jane Doe A people person
newFile
date file number name description
06-08-2012 0224196541 Bob Awesome His last name is Awesome!
Code:
#!/bin/bash
#Assumed here: the two input files are named in these variables (the original post did not show where they were set).
firstFile="firstFile"
secFile="secFile"
#Start both counters at 0 so the first array index is valid.
i=0
count=0
#The while loop will go through each line in the file "$secFile"
while read -r line ; do
    #In my files, I am looking for any line that contains 2241 with anything after it (till the end of the word)
    output=$(echo "$line" | grep -o "2241\w*")
    #If the string is not null, it will store it in the array, and add one onto the counter
    if [[ -n "$output" ]]
    then
        myarray[$i]=$output
        #i is keeping track of the number of $output stored into the array
        i=$((i + 1))
    fi
    #count is keeping track of the number of lines read
    count=$((count + 1))
#Give the while loop the secFile
done < "$secFile"
#This while loop goes through the lines of $firstFile, and checks them against the newly created array.
while read -r line ; do
    #match is to see if the line and the array match. If they do, it will trigger match to equal 1, and not do anything with that line.
    #If it does not match, it will echo the line into a new file.
    match=0
    compare=$(echo "$line" | grep -o "02241\w*")
    #If $compare is not null, then it will continue into the if statement.
    if [ -n "$compare" ]
    then
        #Looping through myarray, this gets the number of entries in the array ( ${#myarray[@]} ) and will only execute the loop till that number is met.
        for (( x=0 ; x < ${#myarray[@]} ; x++ )) do
            #If the entry in myarray matches $compare or matches with a 0 in front of the array entry, it will change match to 1.
            if [ "${myarray[$x]}" == "$compare" ] || [ "0${myarray[$x]}" == "$compare" ]
            then
                match=1
            fi
        done
    fi
    if [ "$match" = 0 ]
    then
        #You can put the new file wherever you like, I have just put an example here that the script will write to the file: newfile.txt in the directory: missedLines.
        echo "$line" >> "/missedLines/newfile.txt"
    fi
done < "$firstFile"
I can go into more detail if it is something that you think will work in your situation. Also, I am extremely tired this morning, so if this doesn't make sense, I'm sorry.
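The same idea (remember the keys seen in one file, then print only the lines of the other file whose key is missing) can also be sketched in a single awk command. This is only a sketch under assumptions: it uses the sample file_1.txt and file_2.txt from the original question, treats column 1 as the key, and the array name seen is made up for the illustration.

```shell
#!/bin/bash
# Work in a scratch directory so no existing file is overwritten.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Recreate the sample files from the question.
printf '%s\n' 'alpha 3 5 eu rt' 'beta 4 5 ew sd' \
              'gamma 4 56 er df' 'delta 23 13 rt rt' > file_1.txt
printf '%s\n' 'alpha 3 5 eu rt' 'pluto 2 1 rf gf' \
              'gamma 4 56 er df' 'mouse 23 13 rt rt' > file_2.txt

# First file (NR==FNR): remember every column-1 value, print nothing.
# Second file: print the line only if its column 1 was never seen.
awk 'NR==FNR { seen[$1]; next } !($1 in seen)' file_1.txt file_2.txt > file_output.txt

cat file_output.txt
```

Because only column 1 is compared, a line in file_2.txt is kept or dropped regardless of what its other columns contain, and the input files do not need to be sorted.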
I do not understand why you sort the files and, most important, why you say "(specify the output) in your case 1 and 3".
What do you mean by 1 and 3 exactly? If you mean "exclude 1 and 3" this is not good.
Note that the files to be compared are very long and of different length, so I cannot specify lines to be excluded.
I just want the script to look for all those strings in file_2 that are not present in file_1.
I thank you anyways!
Best,
Udiubu
Hello Udiubu
The command 'comm' produces three-column output. Column one contains lines unique to FILE1,
column two contains lines unique to FILE2, and column three contains lines common to both files.
-1
suppress column 1 (lines unique to FILE1)
-2
suppress column 2 (lines unique to FILE2)
-3
suppress column 3 (lines that appear in both files)
comm -13 suppresses column 1 and column 3, so only column 2 is left: the lines unique to FILE2. The 1 and 3 are column numbers, not line numbers.
I sorted the files because comm needs sorted input; otherwise it does not work correctly.
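As a rough illustration of the comm approach described above, here is a minimal sketch, assuming the sample file_1.txt and file_2.txt from the question. The .sorted file names and the scratch directory are assumptions made for the example.

```shell
#!/bin/bash
# Use one collation order for both sort and comm.
export LC_ALL=C

# Work in a scratch directory so no existing file is overwritten.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Recreate the sample files from the question.
printf '%s\n' 'alpha 3 5 eu rt' 'beta 4 5 ew sd' \
              'gamma 4 56 er df' 'delta 23 13 rt rt' > file_1.txt
printf '%s\n' 'alpha 3 5 eu rt' 'pluto 2 1 rf gf' \
              'gamma 4 56 er df' 'mouse 23 13 rt rt' > file_2.txt

# comm expects sorted input, so sort copies of both files first.
sort file_1.txt > file_1.sorted
sort file_2.txt > file_2.sorted

# -1 drops lines unique to the first file, -3 drops common lines,
# leaving only the lines unique to the second file.
comm -13 file_1.sorted file_2.sorted > file_output.txt

cat file_output.txt
```

Note that comm compares whole lines, not just column 1, and the output follows the sorted order rather than the original order of file_2.txt. If two lines share column 1 but differ elsewhere, comm will treat them as different lines.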