LinuxQuestions.org

danielbmartin 01-30-2012 09:36 PM

Selecting lines based on position
 
I want to select lines based on their position (i.e. line number).

Sample of File1...
Quote:

Alabama
Alaska
Arizona
Arkansas
California
Colorado
Connecticut
Delaware
Florida
Georgia

Sample of File2...
Quote:

2
3
5
7

Desired output file...
Quote:

Alaska
Arizona
California
Connecticut

This can be done by using "nl" to number the lines in File1 and then using "grep -f", but that seems like a brute-force technique. Is there a clever and efficient way to do this?

Daniel B. Martin

firstfire 01-30-2012 11:01 PM

Hi.

Code:

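# save to select.awk
BEGIN{
        # Read the first file (the line numbers) by hand. FILENAME changes
        # as soon as getline crosses into the second file.
        if ((getline) <= 0) exit
        file=FILENAME
        while(file == FILENAME){
                nums[$0]=1;
                if ((getline) <= 0) exit
                }
        # getline has already consumed line 1 of the second file,
        # so test it here; the main rule only sees lines 2 onward.
        if (FNR in nums) print
}
(FNR in nums)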

Pass the file containing the line numbers as the first argument:
Code:

$ awk -f select.awk numbers.txt file1.txt
Alaska
Arizona
California
Connecticut


grail 01-31-2012 08:02 AM

Can be a little simpler:
Code:

awk 'FNR == NR{nums[$0];next}FNR in nums' numbers.txt file1.txt

danielbmartin 01-31-2012 12:43 PM

Quote:

Originally Posted by grail (Post 4589372)
Code:

awk 'FNR == NR{nums[$0];next}FNR in nums' numbers.txt file1.txt

Thank you, grail, for this crisp solution. However, when tested with the sample data shown in the original post, the output contained only three state names:
Quote:

Alaska
Arizona
California

Daniel B. Martin

colucix 01-31-2012 02:30 PM

Apparently your samples above contain a trailing blank space at the end of the last line (after Georgia and 7 respectively). Therefore the last number is not treated as such, but as a literal string of two characters.

A slight modification to grail's code should do the trick:
Code:

awk 'FNR == NR{nums[$0+0];next}FNR in nums' numbers file

The +0 forces $0 to be treated as a number (the trailing blank is ignored), so the array key matches FNR and the subsequent expression works as expected.
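
To see the difference side by side, here is a minimal sketch. The temporary directory and the variable names (tmp, plain, coerced) are made up for the demonstration, and the data is a shortened copy of the sample with a trailing blank after the 7.

```shell
# Hypothetical demo files; the last number is followed by a trailing blank.
tmp=$(mktemp -d)
printf '%s\n' Alabama Alaska Arizona Arkansas California Colorado Connecticut > "$tmp/file1.txt"
printf '2\n3\n5\n7 \n' > "$tmp/numbers.txt"

# String keys: the key "7 " (with the blank) never equals FNR, so line 7 is missed.
plain=$(awk 'FNR == NR{nums[$0];next}FNR in nums' "$tmp/numbers.txt" "$tmp/file1.txt")

# Numeric keys: "7 "+0 evaluates to 7, so line 7 is selected.
coerced=$(awk 'FNR == NR{nums[$0+0];next}FNR in nums' "$tmp/numbers.txt" "$tmp/file1.txt")

printf 'without +0:\n%s\n\nwith +0:\n%s\n' "$plain" "$coerced"
rm -r "$tmp"
```

The first list should stop at California, while the second should also include Connecticut.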

grail 02-01-2012 01:50 AM

Thanks for the fix colucix :) I often forget there can be other dodginess around ;)

colucix 02-01-2012 02:06 AM

Hi grail! :) That wasn't so obvious. I'd have never thought about that if I hadn't tried to copy/paste the sample input in a terminal.

Nominal Animal 02-01-2012 06:34 AM

@colucix: Since the key is a line number, and thus integer, wouldn't nums[int($1)] be cleaner?

BTW, I did not know that merely referencing an array element was enough to create it. The GNU Awk manual does mention it ("A reference to an element that does not exist automatically creates that array element, with the null string as its value"), and not as something specific to GNU awk. I just hadn't encountered that before. Useful; thanks!
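
A quick way to check this behaviour, as a small sketch (the variable name "created" and the keys 7 and 8 are arbitrary choices for the demonstration):

```shell
created=$(awk 'BEGIN {
    nums[7]                                      # bare reference, no assignment
    if (7 in nums) print "7 is in nums"
    if (!(8 in nums)) print "8 is not in nums"   # the "in" test itself creates nothing
    n = 0
    for (k in nums) n++                          # count the elements that now exist
    print n " element(s)"
}')
printf '%s\n' "$created"
```

This is why nums[$0] on its own, with no assignment, is enough in grail's one-liner. Note that testing with "in" does not create the element, so the membership test stays safe.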

colucix 02-01-2012 07:20 AM

Quote:

Originally Posted by Nominal Animal (Post 4590363)
@colucix: Since the key is a line number, and thus integer, wouldn't nums[int($1)] be cleaner?

Indeed. It would be cleaner and more readable, especially for awk newbies. Thanks for the tip.
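
Putting the suggestion together, the complete command might look like the sketch below. The temporary files are made up for the demonstration; the numbers file deliberately contains leading and trailing blanks, which field splitting removes before int() is applied to $1.

```shell
# Hypothetical demo files with stray whitespace around some of the numbers.
tmp=$(mktemp -d)
printf '%s\n' Alabama Alaska Arizona Arkansas California Colorado Connecticut Delaware Florida Georgia > "$tmp/file1.txt"
printf ' 2\n3\n\t5\n7 \n' > "$tmp/numbers.txt"

# First file: remember each line number. Second file: print lines whose number was remembered.
selected=$(awk 'FNR == NR { nums[int($1)]; next } FNR in nums' "$tmp/numbers.txt" "$tmp/file1.txt")

printf '%s\n' "$selected"
rm -r "$tmp"
```

A blank line in the numbers file would only add the key 0, which never matches a line number, so it is harmless here.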

danielbmartin 02-01-2012 09:40 AM

Quote:

Originally Posted by colucix (Post 4589638)
Apparently your samples above contain a trailing blank space at the end of the last line ...

Quite right, and I was unaware of this.

1) Thank you to everyone who contributed ideas, code, corrections, and comments to this thread. It is SOLVED!

2) I use gnome-terminal to browse a file. In this instance I was tripped up by trailing blanks, which are invisible to the unaided eye. I'd like to use a browser which distinguishes white space with a colored dot or some such. Is there a user setting for gnome-terminal which does this? Is there a different utility which can do this?

Daniel B. Martin

David the H. 02-01-2012 09:58 AM

The same thing can be done directly in bash too. Version 4's new mapfile built-in makes it particularly easy.

Code:

mapfile -t -O 1 states <file.txt

for i in 2 3 5 7; do
        echo "${states[i]}"
done

The -O option lets you set the starting index number, so you can store the first line as index 1 instead of the default 0, for one-to-one mapping.
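
If the line numbers live in a second file, as in the original question, the loop can read them from there instead of a hard-coded list. This is only a sketch: the temporary files and the names tmp, n, and picked are invented for the demonstration, and it needs bash 4 or later.

```shell
# Requires bash 4 or later (mapfile). Hypothetical demo files.
tmp=$(mktemp -d)
printf '%s\n' Alabama Alaska Arizona Arkansas California Colorado Connecticut > "$tmp/file.txt"
printf '2\n3\n5\n7 \n' > "$tmp/numbers.txt"

# Store line 1 at index 1, line 2 at index 2, and so on.
mapfile -t -O 1 states < "$tmp/file.txt"

picked=()
while read -r n; do                      # default IFS trims leading/trailing blanks
    [[ $n =~ ^[0-9]+$ ]] || continue     # skip blank or non-numeric lines
    picked+=("${states[10#$n]}")         # 10# forces base 10, so "08" is not read as octal
done < "$tmp/numbers.txt"

printf '%s\n' "${picked[@]}"
rm -r "$tmp"
```

Because read strips surrounding whitespace with the default IFS, the trailing blank after the 7 causes no trouble here.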

For older bash versions, use a read loop to populate the array:
Code:

i=0     # reset the counter so the first line lands at index 1
while IFS= read -r line; do
        states[++i]=$line
done < file.txt

Edit: One more solution using sed and a loop.

Code:

for i in 2 3 5 7; do
        sed -n "${i}p" file.txt
done

It's a bit more processor-intensive though, as it runs sed once per line number and rereads the file each time.

Edit 2: Oh, but for just a simple list:

Code:

sed -n "2p; 3p; 5p; 7p" file.txt
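
The same single-pass idea can be extended to a file of line numbers by generating the sed script from it. A rough sketch, with made-up temporary files and variable names (tmp, script, sed_out):

```shell
# Hypothetical demo files; the last number has a trailing blank.
tmp=$(mktemp -d)
printf '%s\n' Alabama Alaska Arizona Arkansas California Colorado Connecticut > "$tmp/file.txt"
printf '2\n3\n5\n7 \n' > "$tmp/numbers.txt"

# Turn each numeric line "N" into the sed command "Np", dropping surrounding
# blanks; lines that are not plain numbers produce nothing.
script=$(sed -n 's/^[[:space:]]*\([0-9][0-9]*\)[[:space:]]*$/\1p/p' "$tmp/numbers.txt")

# Run the generated script (one command per line) in a single pass over the data.
sed_out=$(sed -n "$script" "$tmp/file.txt")

printf '%s\n' "$sed_out"
rm -r "$tmp"
```

Here the generated script is simply 2p, 3p, 5p, and 7p on separate lines, so the data file is read only once.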


All times are GMT -5.