LinuxQuestions.org

sal_x_sal 08-12-2012 02:00 PM

Extracting second line from multiple txt files and append to a file
 
Hi,

I am trying to use an awk command to extract the second line from a bunch of txt files (tab delimited) and append them to a new file.

I use the command below to extract a second line from a single txt file, and it works fine.
awk 'NR==2' old_file1.txt > new_file1.txt

I know I need some kind of loop to go over all the input files, but I am not very familiar with awk commands. Any help will be much appreciated.

Thanks!

Didier Spaier 08-12-2012 02:21 PM

"man bash" + Bash Guide for Beginners and others at http://tldp.org

i_joh 08-12-2012 02:30 PM

Code:

for FILE in *.txt; do sed -n '2p' "$FILE"; done > output
Just change *.txt to whatever. Could be a list of files separated by spaces too.

sal_x_sal 08-12-2012 02:57 PM

Thanks for the response. I tried that and it worked perfectly except for one thing: the order is not maintained. Maybe I should ask this: how are the input files processed? Isn't it by file name? I wanted my output to be in the same order as the input files so that I can tell which output line comes from which file.

i_joh 08-12-2012 03:10 PM

I get alphabetical sorting. What happens is that *.txt is replaced with the names of all the .txt files in the current directory, sorted by name. Then the first file name is put in $FILE and the command is run. Then the second file name is put in $FILE and the command is run, and so on until all files are processed. An alternative would be to list the files manually in the order you want them, or sort them with the ls command (see man ls for options):

Code:

for FILE in $(ls -r *.txt); do sed -n '2p' "$FILE"; done
That would sort them in reverse. It relies on word splitting, so it breaks on file names that contain spaces.
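
One possible reason the order looks wrong is that the shell sorts glob matches as strings, so a name like file10.txt comes before file2.txt. The sketch below assumes hypothetical files named file1.txt, file2.txt and file10.txt in a scratch directory (these names are not from the thread), and it forces the C locale so the glob order is predictable. It prints the glob order next to a numeric order produced with sort.

```shell
# Force a predictable collation order for the glob (assumption: C locale is acceptable here).
LC_ALL=C
export LC_ALL

# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical tab-delimited input files; line 2 is the one we want.
for n in 1 2 10; do
    printf 'header\tcolumn\nline2-of-file%s\tvalue\n' "$n" > "file$n.txt"
done

# Glob order: names are compared as strings, so file10.txt sorts before file2.txt.
glob_order=$(for f in *.txt; do sed -n '2p' "$f" | cut -f1; done)

# Numeric order: sort on the digits that start at character 5 (right after "file").
numeric_order=$(printf '%s\n' *.txt | sort -k1.5n |
    while IFS= read -r f; do sed -n '2p' "$f" | cut -f1; done)

echo "glob order:";    echo "$glob_order"
echo "numeric order:"; echo "$numeric_order"
```

The sort key (-k1.5n) only fits names with a 4-character prefix like the ones assumed here; it would need adjusting for other naming schemes.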

sal_x_sal 08-12-2012 03:13 PM

Quote:

Originally Posted by i_joh (Post 4752530)
Code:

for FILE in *.txt; do sed -n '2p' "$FILE"; done > output
Just change *.txt to whatever. Could be a list of files separated by spaces too.

Or is there a way to include a column in the output file containing the name of the input file? So every time a second line is read from an input file, it is appended to the output file with an extra column containing the file name. This would be equally useful if it is impossible to guarantee that the input files will be processed in the same order. Thanks!

i_joh 08-12-2012 03:16 PM

Code:

for FILE in *.txt; do echo "$FILE: $(sed -n '2p' "$FILE")"; done > output

ntubski 08-12-2012 03:24 PM

Code:

awk 'FNR==2{print FILENAME, $0}' *.txt > output
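
The one-liner above uses FNR where the first post used NR. NR counts records across all input files, while FNR restarts at 1 for each file, so only FNR==2 matches the second line of every file. A minimal sketch of the difference, assuming two hypothetical files a.txt and b.txt created in a scratch directory:

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two hypothetical tab-delimited files, each with a header and one data line.
printf 'name\tvalue\nalpha\t1\n' > a.txt
printf 'name\tvalue\nbeta\t2\n'  > b.txt

# NR keeps counting across files, so NR==2 is true only once (line 2 of a.txt).
nr_result=$(awk 'NR==2{print FILENAME, $0}' a.txt b.txt)

# FNR resets for each file, so FNR==2 is true once per file.
fnr_result=$(awk 'FNR==2{print FILENAME, $0}' a.txt b.txt)

echo "NR==2:";  echo "$nr_result"
echo "FNR==2:"; echo "$fnr_result"
```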

sal_x_sal 08-12-2012 03:40 PM

Thank you both for the responses. Both work perfectly and produce the same result. One little thing, if it is easy to do in awk; otherwise I will have to do it the hard way. Basically, the file names are now run together with the first column of the input file. Is there a way to put a tab after the file name, so that the input file name is in a separate column? Thank you very much once again!

i_joh 08-12-2012 03:45 PM

Code:

for FILE in *.txt; do echo -e "$FILE\t$(sed -n '2p' "$FILE")"; done > output
I'd recommend you learn shell programming. It's kind of what makes Linux/BSD so much better than Windows.
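
A note on portability: echo -e is not handled the same way by every shell (dash, for example, prints the -e literally), and it would also interpret backslashes inside the file name or the line itself. A sketch of the same loop using printf instead, assuming a hypothetical input file named "sample one.txt" in a scratch directory:

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical tab-delimited input; the space in the name shows why "$f" is quoted.
printf 'id\tscore\ns1\t42\n' > 'sample one.txt'

# printf puts a literal tab between the file name and its second line.
# The %s arguments are printed as-is, so backslashes in the data are not interpreted.
for f in *.txt; do
    printf '%s\t%s\n' "$f" "$(sed -n '2p' "$f")"
done > output

cat output
```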

ntubski 08-12-2012 03:48 PM

Code:

awk -v OFS='\t' 'FNR==2{print FILENAME, $0}' *.txt > output

sal_x_sal 08-12-2012 04:00 PM

Both commands work perfectly. Thank you very much, i_joh and ntubski, for your time in helping me with this. I appreciate it.

@i_joh - I am pretty much stuck working with Windows in my daily life. Once in a while, when I have to use Linux, I realize how powerful it is. I try to learn Linux on my own, but since I am not using it much I end up forgetting a lot of stuff. Thanks once again and have a good day!


All times are GMT -5.