Hi, is there a way to read multiple files in a single awk command?
For example:
Code:
awk -f file1 file2 file3
I've searched for it on Google, and most results suggest using FNR, but I don't understand how it works. It would be a great help if someone could explain it in simple terms with some examples.
I'm sorry, I missed a part. The command is supposed to be like this:
Code:
awk -f awk_script file1 file2 file3
What I'm trying to do is write an awk script that reads multiple files and processes them. The problem is that I don't understand how to do it.
Awk can handle multiple input files (file1 file2 file3 in your command) by default. In the example given, awk starts reading file1 one line at a time; when no more lines are present it continues with file2 and then file3.
The -f awk_script part tells awk to read its commands from a file called awk_script. The programming logic ("commands") lives inside that file.
I'm not sure if and why you need the FNR variable. It holds the number of the line currently being processed within the current input file (with multiple input files it starts at 1 again whenever a new input file begins).
Here's a very basic example:
Content of the input files:
Code:
$ cat file1
a
b
c
$ cat file2
1
2
3
$ cat file3
A
B
C
Content of the awk_script file:
Code:
$ cat awk_script
BEGIN { print "Start awk script" }
{
    print "File line number:", FNR, "- Line content:", $0
}
END { print "End awk script" }
And if you execute the above you will get this:
Code:
$ awk -f awk_script file1 file2 file3
Start awk script
File line number: 1 - Line content: a
File line number: 2 - Line content: b
File line number: 3 - Line content: c
File line number: 1 - Line content: 1
File line number: 2 - Line content: 2
File line number: 3 - Line content: 3
File line number: 1 - Line content: A
File line number: 2 - Line content: B
File line number: 3 - Line content: C
End awk script
FNR combined with NR and FILENAME is the way to go.
NR holds the record number counted across all input files, and FNR the record number within the current file.
E.g. assume that file1 has 2000 records, file2 has 1921 and file3 has 4000. While file1 is being read, NR and FNR both run from 1 to 2000; on the first record of file2, FNR restarts at 1 while NR moves on to 2001.
So as long as FNR==NR, you are on the first file.
FILENAME holds the name of the current file, so you can decide what to do based on it.
Actually this usage is not uncommon.
The first file could have a different format or content, for example, so you need to differentiate between the files.
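To make the FNR==NR and FILENAME tests concrete, here is a minimal sketch that uses the small file1/file2/file3 samples (a b c, 1 2 3, A B C) from earlier in the thread. The scratch directory, the `result` variable and the label texts are assumptions made only for illustration.

```shell
# Recreate the three small sample files in a scratch directory
# (the directory is an assumption, any empty directory will do).
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'a\nb\nc\n' > file1
printf '1\n2\n3\n' > file2
printf 'A\nB\nC\n' > file3

result=$(awk '
    # NR counts records across all files, FNR restarts at 1 for each file,
    # so the two are equal only while the first file is being read.
    FNR == NR           { print "first file:", FILENAME, "NR=" NR, "FNR=" FNR, $0; next }
    # FILENAME holds the name of the file currently being read.
    FILENAME == "file3" { print "third file:", FILENAME, "NR=" NR, "FNR=" FNR, $0; next }
    # Everything else (here: file2) falls through to this rule.
                        { print "other file:", FILENAME, "NR=" NR, "FNR=" FNR, $0 }
' file1 file2 file3)
printf '%s\n' "$result"
```

The first line printed is `first file: file1 NR=1 FNR=1 a` and the fourth is `other file: file2 NR=4 FNR=1 1`. One caveat: if the first file is empty, FNR==NR is true for the second file instead, so the test is only safe when the first file is known to contain data.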
IMO, FNR is best used with only two files, and using it gets more complex when you get into three or more. It also came in with nawk, so the original pre-nawk awk does not have it, although it is part of POSIX awk and current implementations such as gawk and mawk support it.
A more robust solution would probably require testing the ARGC/ARGV values that keep track of the input arguments.
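For three or more files, one simple way to know which input file is being read is to count files as they start. This is a sketch of a counter idiom, not the only way to inspect ARGC/ARGV: `fileno` is a hypothetical variable name, and the sample files are the a b c / 1 2 3 / A B C ones from earlier in the thread.

```shell
# Scratch directory and sample files are assumptions for illustration.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'a\nb\nc\n' > file1
printf '1\n2\n3\n' > file2
printf 'A\nB\nC\n' > file3

result=$(awk '
    # FNR is 1 on the first record of every input file, so bump a counter there.
    # fileno is a hypothetical variable name, not an awk built-in.
    FNR == 1 { fileno++ }
    # ARGC also counts the program name, so ARGC - 1 is the number of arguments.
    { print "file", fileno, "of", ARGC - 1, "(" FILENAME "):", $0 }
' file1 file2 file3)
printf '%s\n' "$result"
```

This assumes every command-line argument is a non-empty file: an empty file produces no records, so the counter never advances for it, and `var=value` assignments on the command line are counted in ARGC as well.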
Now, as for your specific question, we really need more than some vaguely-worded half-explanations about what you want to do. Please explain your exact goals in more detail, along with some examples of both input and output, and perhaps a bit of the overall coding context, so that we can understand you better. The exact methods to use often depend very much on the particulars of the coding situation, and without proper background knowledge we can only give you guesses and general suggestions.
I have a few files in this format:
File1
Code:
Red Apple 8 3
Orange 10 4
Tomatoes 10 5
File2
Code:
Orange 5 5
Red Apple 10 4
Tomatoes 11 3
File3
Code:
Tomatoes 5 4
Orange 5
Red Apple 3
$2 is the quantity and $3 is the price. I'm required to process these 3 files with an awk script. I've looked at several sites and they recommend using FNR==NR or FILENAME. I'm not sure how to use them or which is the better option. It would be a great help if you could show me an example and explain the usage (I'm the kind of guy who picks things up slowly).
As you might have noticed we are willing to help you, but.....
You still haven't told us what it is that needs to be done with the content of these files.
- Add the similar entries (quantities and/or price) per file?
- Add the similar entries (quantities and/or price) for all files?
- Calculate the cost for each entry per file?
- Calculate the total cost for each file?
- Calculate the total cost for all the files?
- ???
- ???
Please tell us what needs to be done so we can point you in the correct direction (which might or might not need FNR).
I'm really sorry! I'm not very good at explaining things.
What I'm trying to do is calculate the total cost of each fruit/vegetable across all the files and output the results sorted by the total. For example, the total cost of Red Apple across the files is 8*3 + 10*4 + 3*3=81, Tomatoes is 103, Orange is 80. In the output it will be like this
I believe I understand what you are after, but I need some more info:
- Are the given examples correct? file3 seems to be missing some information.
- Are all fields separated by spaces? One fruit (Red Apple) has a space in its name, which makes this more challenging.
- Your math is also not correct: 8*3 + 10*4 + 3*3 is 73, not 81.
Please be careful with the examples you post; they need to reflect the actual input used.
EDIT: You mention that the output needs to be sorted: On what field? The name or the highest/lowest total price?
I'm sorry I'm too clumsy
file3:
Code:
Tomatoes 5 4
Orange 5 4
Red Apple 3 4
Yes, the fields are separated by spaces, and you're right about the calculation too.
The Red Apple entry makes it more challenging because of the extra field it creates (the Red Apple lines have 4 fields and the rest have 3). But it is possible. Here's one way:
The script uses an array called fruits to store each fruit and its total cost.
The first rule matches lines that do not start with Red (the ! negates the match). For those lines the name of the fruit ($1) is used as the unique index, and fields 2 and 3 are multiplied and added to the value already stored.
The second rule matches lines that start with Red, but now the index consists of 2 fields ($1 = Red and $2 = Apple). The amount and price are now $3 and $4.
Once all the files are processed, the END block is executed. This prints all the entries in the array (index and value).
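The script itself is not reproduced in this copy of the thread. What follows is a minimal reconstruction based only on the description above, so the exact patterns and the loop variable `name` are assumptions; only the array name fruits comes from the post. The input files use the corrected file3.

```shell
# Scratch directory is an assumption; file contents are the ones posted above.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'Red Apple 8 3\nOrange 10 4\nTomatoes 10 5\n' > file1
printf 'Orange 5 5\nRed Apple 10 4\nTomatoes 11 3\n' > file2
printf 'Tomatoes 5 4\nOrange 5 4\nRed Apple 3 4\n' > file3

cat > awk_script <<'EOF'
# Lines that do not start with "Red": name is $1, quantity $2, price $3.
!/^Red/ { fruits[$1] += $2 * $3 }
# Lines that start with "Red": name is "$1 $2", quantity $3, price $4.
/^Red/  { fruits[$1 " " $2] += $3 * $4 }
# After all files are read, print every name with its total cost.
END     { for (name in fruits) print name, fruits[name] }
EOF

result=$(awk -f awk_script file1 file2 file3)
printf '%s\n' "$result"
```

The order in which `for (name in fruits)` visits the entries is not defined by awk, so the three lines may come out in a different order from one implementation to another.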
A sample run with the 3 files you posted as input:
Code:
$ awk -f awk_script file1 file2 file3
Tomatoes 103
Red Apple 76
Orange 85
And you might have noticed that FNR isn't needed.
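If more names with spaces turn up later, a variant that does not hard-code Red may be easier to maintain. The sketch below is not the approach described above: it assumes that every line ends with a quantity and a price and treats everything before those two fields as the name. The names `total`, `name` and `i` are hypothetical.

```shell
# Scratch directory is an assumption; file contents are the ones posted above.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'Red Apple 8 3\nOrange 10 4\nTomatoes 10 5\n' > file1
printf 'Orange 5 5\nRed Apple 10 4\nTomatoes 11 3\n' > file2
printf 'Tomatoes 5 4\nOrange 5 4\nRed Apple 3 4\n' > file3

result=$(awk '
    # Skip lines too short to hold a name, a quantity and a price.
    NF >= 3 {
        # Rebuild the name from every field except the last two.
        name = $1
        for (i = 2; i <= NF - 2; i++) name = name " " $i
        # The last two fields are quantity and price.
        total[name] += $(NF - 1) * $NF
    }
    END { for (name in total) print name, total[name] }
' file1 file2 file3)
printf '%s\n' "$result"
```

With the three files above this gives the same totals as the sample run (Tomatoes 103, Red Apple 76, Orange 85), again in no guaranteed order.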
BTW: You did not answer the question about sorting the output. If that is needed use the sort command. Sorting in (g)awk is possible but cumbersome.
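As a sketch of the sort suggestion: because Red Apple contains a space, it helps to print the name and the total separated by a tab, so that sort sees exactly two fields. The two awk rules are an inline version of the ones described above; the tab-separated output format and the `tab` and `result` variables are assumptions.

```shell
# Scratch directory is an assumption; file contents are the ones posted above.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'Red Apple 8 3\nOrange 10 4\nTomatoes 10 5\n' > file1
printf 'Orange 5 5\nRed Apple 10 4\nTomatoes 11 3\n' > file2
printf 'Tomatoes 5 4\nOrange 5 4\nRed Apple 3 4\n' > file3

# A literal tab character to use as the field separator for sort.
tab=$(printf '\t')

result=$(awk '
    !/^Red/ { fruits[$1] += $2 * $3 }
    /^Red/  { fruits[$1 " " $2] += $3 * $4 }
    # Separate name and total with a tab so a space inside the name is harmless.
    END     { for (name in fruits) printf "%s\t%d\n", name, fruits[name] }
' file1 file2 file3 | sort -t "$tab" -k2,2n)
printf '%s\n' "$result"
```

This lists Red Apple (76), Orange (85) and Tomatoes (103) from lowest to highest total. Use `-k2,2nr` instead to put the highest total first, or drop the `-k` option to sort by name.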
Thanks for the explanation and example! It's helpful!! <3