I want to compare two files and produce a new output that contains the differences. Each file contains 5 fields (matricule, first name, last name, age, profession).
file1 is the original file. file2 should be synchronized with file1: I want to look for any change in file1 and apply these changes to file2.
awk 'NR==FNR {f1[$0]=$0}
NR!=FNR {f2[$0]=$0}
END {
for(i in f1) if(!(i in f2)) print "Only in f1: " f1[i]
for(i in f2) if(!(i in f1)) print "Only in f2: " f2[i]
}' file1 file2
I get this result:
==============
Only in f1: 10001;georges;Hold;34;physician
Only in f1: 10003;marc;bob;46;techician
Only in f1: 10002;Catherina;Rick;36;doctor
Only in f2: 10004;Maria;Roch;39;nurse
Only in f2: 10003;marc;Robert;46;programmer
Only in f2: 10001;georges;Hold;40;physician
==============
But this is not the result I was hoping for.
I want to get a result like this:
Try writing a script with the 'diff' command. It takes two files as input and then reports the differences between them, if there are any. Read the man page for it.
Thanks for your suggestion. I know how to use diff, but I need to use awk: it is a simple command, and I expect it to run faster than diff/grep when producing the result that I need.
This seems very contrived and makes me think this is a homework question. The sample you posted doesn't look inside individual records at all, so it doesn't seem that you even wrote it yourself. While the first file is being read, 'NR==FNR' is true. The logic in the END section only tests whether a whole record saved from one file is missing from the other. You need to change what you do when they differ, and test which fields differ in that case.
file2 should be synchronized with file1: I want to look for any change in file1 and apply these changes to file2.
This suggests that you could just copy file1 to file2.
Perhaps what you meant to say is that, if file1 has data for a particular field which is different than the corresponding field in file2 (if it exists), then that field in file2 should be updated.
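One way to read that interpretation, as a rough sketch only: keep file2's set of records, but take the values from file1 whenever the same matricule exists there. The sample rows are taken from the output posted above (the complete files are not shown in the thread), and leaving records that exist only in file1 out of the result is an assumption.

```shell
#!/usr/bin/env bash
# Sample rows reconstructed from the posted output; the real files may hold more.
dir=$(mktemp -d)
printf '%s\n' '10001;georges;Hold;34;physician' \
              '10002;Catherina;Rick;36;doctor' \
              '10003;marc;bob;46;techician' > "$dir/file1"
printf '%s\n' '10001;georges;Hold;40;physician' \
              '10003;marc;Robert;46;programmer' \
              '10004;Maria;Roch;39;nurse' > "$dir/file2"

synced=$(awk -F';' '
    NR == FNR { f1[$1] = $0; next }    # first file: remember each record by matricule
    $1 in f1  { print f1[$1]; next }   # matricule also in file1: file1 values win
              { print }                # matricule only in file2: keep it unchanged
' "$dir/file1" "$dir/file2")

printf '%s\n' "$synced"
rm -rf "$dir"
```

Because every field of a matching record is taken from file1, replacing the whole record has the same effect as updating the differing fields one by one.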
And, yes, why does it have to be AWK?
Hi Pixellany,
I agree that I could simply copy file1 to file2.
But my goal is to track the changes that were made in file1. I have not found any page that clearly explains how to manipulate two files and their fields using awk.
Since a record in one file may be missing from the other file, you may want to create two arrays as you are doing, but use the first field as the index instead of the whole record. Life might be easier if both files are sorted by the first field as well. The sort command can guarantee that, in case the files are not already in order.
awk -f commands.awk <(sort -t';' -k1,1 file1) <(sort -t';' -k1,1 file2)
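As a minimal sketch of what indexing by the first field could look like (commands.awk itself is not shown in the thread, so the inline program below is only a hypothetical stand-in), the following uses sample rows taken from the output posted earlier. The final sort is there only because the order of a `for (k in array)` loop is unspecified.

```shell
#!/usr/bin/env bash
# Sample rows reconstructed from the posted output; the real files may hold more.
dir=$(mktemp -d)
printf '%s\n' '10001;georges;Hold;34;physician' \
              '10002;Catherina;Rick;36;doctor' \
              '10003;marc;bob;46;techician' > "$dir/file1"
printf '%s\n' '10001;georges;Hold;40;physician' \
              '10003;marc;Robert;46;programmer' \
              '10004;Maria;Roch;39;nurse' > "$dir/file2"

report=$(awk -F';' '
    NR == FNR { f1[$1] = $0; next }    # first file: index records by matricule
              { f2[$1] = $0 }          # second file: same index
    END {
        for (k in f1) if (!(k in f2)) print "Only in file1: " f1[k]
        for (k in f2) if (!(k in f1)) print "Only in file2: " f2[k]
        for (k in f1) if ((k in f2) && f1[k] != f2[k]) print "Changed: " k
    }' "$dir/file1" "$dir/file2" | sort)

printf '%s\n' "$report"
rm -rf "$dir"
```

With the matricule as the index, a record whose age or profession changed is reported once as changed instead of showing up as "only in" both files.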
Since your report is only concerned with the difference, you could use the "comm" command to filter out common lines:
comm -23 <(sort file1) <(sort file2) >temp1
comm -13 <(sort file1) <(sort file2) >temp2
awk -f commands.awk temp1 temp2 >report
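To illustrate the filtering step on its own: comm takes two sorted files as arguments, `-23` keeps the lines found only in the first and `-13` the lines found only in the second. The sketch below uses the rows from the output posted earlier plus one hypothetical record (10005) that is identical in both files, added only to show a common line being dropped.

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)
# The 10005 row is hypothetical; the other rows come from the posted output.
printf '%s\n' '10001;georges;Hold;34;physician' \
              '10002;Catherina;Rick;36;doctor' \
              '10003;marc;bob;46;techician' \
              '10005;Anna;Lee;29;teacher' > "$dir/file1"
printf '%s\n' '10001;georges;Hold;40;physician' \
              '10003;marc;Robert;46;programmer' \
              '10004;Maria;Roch;39;nurse' \
              '10005;Anna;Lee;29;teacher' > "$dir/file2"

# Both inputs must be sorted the same way for comm to work.
comm -23 <(sort "$dir/file1") <(sort "$dir/file2") > "$dir/temp1"   # only in file1
comm -13 <(sort "$dir/file1") <(sort "$dir/file2") > "$dir/temp2"   # only in file2

only1=$(cat "$dir/temp1")
only2=$(cat "$dir/temp2")
printf 'temp1:\n%s\ntemp2:\n%s\n' "$only1" "$only2"
rm -rf "$dir"
```

The two temporary files then contain only the records that differ, which is what the awk step needs to look at.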
Also, remember that awk arrays are one-dimensional, so you can't keep a true two-dimensional array of records and fields. You will either have to decompose each saved record manually in the END section logic (for example with split()) instead of using $1, $2, etc., or assign a saved record from file1 to $0, copy its fields into a temporary array, and then assign the corresponding saved record from file2 to $0 so that the two can be compared field by field.
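The first option might look roughly like this: a sketch that saves whole records by matricule and uses split() in the END section to compare them field by field. The sample rows come from the output posted earlier; the field labels and the report wording are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sample rows reconstructed from the posted output; the real files may hold more.
dir=$(mktemp -d)
printf '%s\n' '10001;georges;Hold;34;physician' \
              '10002;Catherina;Rick;36;doctor' \
              '10003;marc;bob;46;techician' > "$dir/file1"
printf '%s\n' '10001;georges;Hold;40;physician' \
              '10003;marc;Robert;46;programmer' \
              '10004;Maria;Roch;39;nurse' > "$dir/file2"

changes=$(awk -F';' '
    BEGIN { n = split("matricule;first name;last name;age;profession", label, ";") }
    NR == FNR { f1[$1] = $0; next }    # records of file1, indexed by matricule
              { f2[$1] = $0 }          # records of file2, same index
    END {
        for (k in f1) {
            if (!(k in f2)) continue   # present in file1 only: nothing to compare
            split(f1[k], a, ";")       # decompose both saved records into fields
            split(f2[k], b, ";")
            for (i = 2; i <= n; i++)
                if (a[i] != b[i])
                    print k ": " label[i] " differs (file1=" a[i] ", file2=" b[i] ")"
        }
    }' "$dir/file1" "$dir/file2" | sort)

printf '%s\n' "$changes"
rm -rf "$dir"
```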
Awk arrays are associative, so the index can be a word instead of an integer. That may help. The index could be lastname or profession. That will make your awk program easier to read.
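As a small illustration of word indexes, the sketch below stores each field of file1 under the pair (matricule, field name), so the comparison can refer to "profession" instead of a field number. The column names in the BEGIN block are made up for this example, and the sample rows come from the output posted earlier.

```shell
#!/usr/bin/env bash
# Sample rows reconstructed from the posted output; the real files may hold more.
dir=$(mktemp -d)
printf '%s\n' '10001;georges;Hold;34;physician' \
              '10002;Catherina;Rick;36;doctor' \
              '10003;marc;bob;46;techician' > "$dir/file1"
printf '%s\n' '10001;georges;Hold;40;physician' \
              '10003;marc;Robert;46;programmer' \
              '10004;Maria;Roch;39;nurse' > "$dir/file2"

named=$(awk -F';' '
    BEGIN { n = split("matricule firstname lastname age profession", col, " ") }
    NR == FNR {                        # file1: store every field under its name
        for (i = 1; i <= n; i++) f1[$1, col[i]] = $i
        next
    }
    {                                  # file2: look the saved value up by name
        if (($1, "profession") in f1 && f1[$1, "profession"] != $5)
            print $1 ": profession is " f1[$1, "profession"] " in file1, " $5 " in file2"
    }
' "$dir/file1" "$dir/file2")

printf '%s\n' "$named"
rm -rf "$dir"
```

The subscript `f1[$1, "profession"]` is still a single string key (awk joins the two parts with SUBSEP), so this stays within one-dimensional arrays.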
Often in Unix/Linux, your best approach is to use small tools like grep, sort and comm, each doing part of the job. comm only works on sorted files, so sorting first is a given. Working only with the entries that differ also means that the arrays in awk can be smaller.
Just reverse the order of the files given to cat if you want the file precedence the other way around. cat file1 file2 means that items in file2 will take the place of items in file1, which from your example looks like what you wanted.