Please take the initiative: follow the pattern in the script and work it out yourself.
Do you know how that awk script works? If so, you can easily find the way.
awk ' NR==FNR { a[$2]=$4; next } ($1 in a) { print a[$1]$0 } ' file1 file2
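For anyone unsure how that one-liner works, here is the same script spread over several lines with comments. This is only a sketch: it assumes file1 and file2 hold the InFile1 and InFile2 samples posted in this thread, and it works in a scratch directory so nothing is overwritten.

```shell
# Work in a throwaway directory (an assumption, just to keep the demo tidy).
cd "$(mktemp -d)" || exit 1

# file1/file2 are assumed to contain the sample data from this thread.
cat > file1 <<'EOF'
frank 101 4544444 glass
fahad 102 4547977 car
herman 103 454212 clock
charles 107 454822 television
alfred 115 454629 radio
david 117 454133 table
george 122 454009 desk
EOF

cat > file2 <<'EOF'
101 transfer 888
105 transfer 999
106 sold 123
111 stolen 345
115 destroyed 234
122 missing 666
EOF

awk '
  # NR==FNR is only true while reading the first file:
  # remember field 4 (the item), keyed by field 2 (the number).
  NR == FNR { a[$2] = $4; next }

  # Second file: if field 1 was stored as a key, print the stored
  # value directly followed by the whole line (no separator).
  ($1 in a) { print a[$1] $0 }
' file1 file2

# Output with the sample data:
# glass101 transfer 888
# radio115 destroyed 234
# desk122 missing 666
```

Note that the stored value and the line are glued together without a separator, which is presumably part of what still needs adjusting for the required output.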
The problem statement calls for matching two files on a key value.
The Linux command join is suitable for this task.
InFile1 ...
Code:
frank 101 4544444 glass
fahad 102 4547977 car
herman 103 454212 clock
charles 107 454822 television
alfred 115 454629 radio
david 117 454133 table
george 122 454009 desk
InFile2 ...
Code:
101 transfer 888
105 transfer 999
106 sold 123
111 stolen 345
115 destroyed 234
122 missing 666
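With these two files, a join invocation might look like the sketch below. It assumes the files are literally named InFile1 and InFile2, that the key is field 2 of the first file and field 1 of the second, and that both are already sorted on that key, as the samples are.

```shell
# Work in a throwaway directory (an assumption, just to keep the demo tidy).
cd "$(mktemp -d)" || exit 1

cat > InFile1 <<'EOF'
frank 101 4544444 glass
fahad 102 4547977 car
herman 103 454212 clock
charles 107 454822 television
alfred 115 454629 radio
david 117 454133 table
george 122 454009 desk
EOF

cat > InFile2 <<'EOF'
101 transfer 888
105 transfer 999
106 sold 123
111 stolen 345
115 destroyed 234
122 missing 666
EOF

# -1 2 : the key is field 2 of the first file
# -2 1 : the key is field 1 of the second file
join -1 2 -2 1 InFile1 InFile2

# Output with the sample data (key first, then the rest of each line):
# 101 frank 4544444 glass transfer 888
# 115 alfred 454629 radio destroyed 234
# 122 george 454009 desk missing 666
```

Only keys present in both files are printed. The -o option can be used to pick and reorder the output fields if a different layout is needed.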
The awk solution, which obviously needs a little tweak (I believe pan64 is hinting at the direction without giving the full solution, which I like, as it is clear the OP has not done enough investigation yet), does not require the data to be sorted, just that the value of the array index is unique.
The awk solution ... does not require the data to be sorted, just that the value of the array index is unique.
The short sample input files provided by OP have the key fields in sorted order. He did not explicitly state that his files are already sorted, and we cannot know the details of his application. If those files are already sorted there may be a performance advantage in using join.
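If the real files turn out not to be sorted, join needs them sorted on the key field first. A minimal sketch, using a made-up shuffled subset of the sample rows and hypothetical file names (unsorted1, unsorted2, sorted1, sorted2):

```shell
# Work in a throwaway directory (an assumption, just to keep the demo tidy).
cd "$(mktemp -d)" || exit 1

# Hypothetical unsorted input: a shuffled subset of the sample rows.
cat > unsorted1 <<'EOF'
george 122 454009 desk
frank 101 4544444 glass
alfred 115 454629 radio
EOF

cat > unsorted2 <<'EOF'
122 missing 666
115 destroyed 234
101 transfer 888
EOF

# Sort each file on its own key field; the b modifier ignores leading blanks.
sort -k 2b,2 unsorted1 > sorted1
sort -k 1b,1 unsorted2 > sorted2

# Same join as for pre-sorted files: key is field 2 of file 1, field 1 of file 2.
join -1 2 -2 1 sorted1 sorted2

# Output:
# 101 frank 4544444 glass transfer 888
# 115 alfred 454629 radio destroyed 234
# 122 george 454009 desk missing 666
```

sort and join should run under the same locale so that they agree on the ordering. The extra sort passes are the cost that has to be weighed against any speed advantage of join.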
I don't know the internals of awk, but technical intuition suggests that the ($1 in a) part of
Code:
awk ' NR==FNR { a[$2]=$4; next } ($1 in a) { print a[$1]$0 } ' file1 file2
results in a serial search. If the input files are large, a serial search can be painfully slow.
Hmm ... I'm not sure about the serial search idea (is a lookup in an array indexed by numbers really a serial search when you ask whether N is in the array?), but a sorted system will of course return results faster.
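On the serial search question: as far as I know, the common awk implementations (gawk, mawk, the one true awk) keep arrays in hash tables, so (key in array) should normally be a hashed lookup rather than a scan, although POSIX does not require any particular implementation. A rough way to check on your own machine is a sketch like the one below; the array size of 200000 is an arbitrary choice for illustration.

```shell
# Build an array with 200000 numeric keys, then run 200000 membership
# tests against it using the same (key in array) form as the one-liner.
hits=$(awk 'BEGIN {
  n = 200000
  for (i = 1; i <= n; i++) a[i * 2] = 1         # keys 2, 4, ..., 400000
  for (i = 1; i <= n; i++) if (i in a) hits++   # test 1 .. 200000
  print hits
}')

# Only the even numbers up to 200000 are present as keys.
echo "$hits"
# 100000
```

If each test were a serial scan, this would need on the order of billions of comparisons. Running the script under time gives a feel for how the lookup behaves with the awk you have installed.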