LinuxQuestions.org - [SOLVED] Selecting lines according to position in a file

danielbmartin

09-17-2013 08:57 AM

Selecting lines according to position in a file

Have: a file called numbs consisting of positive integers, one per line ...

Code:

3

6

2

8

8

2

8

The numbers are not sorted and not necessarily unique.

Have: a file called words consisting of text ...

Code:

one

two

three

four

five

six

seven

eight

nine

The text could just as well be the names of colors or states or lines from a poem.

The desired output is a file consisting of lines selected from words according to the numbers in numbs ...

Code:

three

six

two

eight

eight

two

eight

As a matter of personal style I strive to write code which does not have explicit loops. I've tried several approaches which don't work. If you can do this without loops, please show how. Solutions which use commands such as sed, grep, and awk are preferred. I haven't learned perl or ruby.

These are failed attempts:

Code:

sed 's/$/p/' $Numbs >$Work

sed -nf $Work $Words >$OutFile

Code:

awk 'FNR==NR{a[$0];next}FNR in a' $Numbs $Words >$OutFile

Daniel B. Martin

druuna

09-17-2013 09:10 AM

Is this what you are looking for (bash only solution):

Code:

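#!/bin/bash

no=1 # line numbers in words start at one, bash arrays at zero

# Load every line of words into an array indexed by its line number.

while IFS= read -r TXTNO

do

  numcheck[$no]="$TXTNO"

  (( no++ ))

done < words

# For each number in numbs, print the line stored under that index.

while IFS= read -r NUMBER

do

  printf '%s\n' "${numcheck[$NUMBER]}"

done < numbs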

danielbmartin

09-17-2013 09:19 AM

Quote:

Originally Posted by druuna (Post 5029192)

Is this what you are looking for ...

Thank you for working code... but... I am seeking a loop-less solution. Your bash solution uses two while loops.

Daniel B. Martin

druuna

09-17-2013 09:23 AM

From a technical point of view those are 2 loops, but....

The way they are implemented in my solution they work like sed or awk (read each file once, one line after another). If that is considered a loop then both sed and awk cannot be used either ;)

YankeePride13

09-17-2013 09:36 AM

Quote:

Originally Posted by danielbmartin (Post 5029197)

Thank you for working code... but... I am seeking a loop-less solution. Your bash solution uses two while loops.

Daniel B. Martin

Why? Seems like a pretty silly requirement.

ntubski

09-17-2013 09:55 AM

Let's start with the obvious loop:

Code:

while read n ; do sed -n ${n}p words ; done < numbs

We can use xargs to make the loop implicit:

Code:

xargs -I{} sed -n {}p words < numbs

Both cases use O(nw) time and O(1) space (n is the size of numbs, w is the size of words).

awk solution (equivalent to druuna's bash solution):

Code:

awk 'NR==FNR { w[NR] = $0; next } { print w[$0] }' words numbs

This trades O(w) space for O(n + w) time.
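For anyone who wants to try the awk one-liner without touching real files, here is a minimal sketch. The temp directory and the variable name result are my own assumptions for the demo; the sample data is the data from the first post.

```shell
#!/bin/sh
# Throwaway copies of the two sample files from the first post.
tmp=$(mktemp -d)
printf '%s\n' one two three four five six seven eight nine > "$tmp/words"
printf '%s\n' 3 6 2 8 8 2 8 > "$tmp/numbs"

# While reading the first file (NR==FNR), store each line under its line number.
# While reading the second file, use each number as a key and print the stored line.
result=$(awk 'NR==FNR { w[NR] = $0; next } { print w[$0] }' "$tmp/words" "$tmp/numbs")

printf '%s\n' "$result"
rm -rf "$tmp"
```

The order of the file arguments matters: words must come first so the array is complete before any number is looked up. A number larger than the line count of words just prints an empty line.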

danielbmartin

09-17-2013 10:18 AM

Quote:

Originally Posted by YankeePride13 (Post 5029212)

Why? Seems like a pretty silly requirement.

I am retired and do programming as recreation, hoping to stave off the inevitable Old Age Brain Rot. I dream up silly problems with silly requirements for amusement. This is one which I couldn't solve on my own so I called on the LQ experts.

Daniel B. Martin

YankeePride13

09-17-2013 10:32 AM

Quote:

Originally Posted by danielbmartin (Post 5029243)

Good idea. I would suggest donating your time to open source projects as that not only will help stave off the Old Age Brain Rot, but it will also produce something that others can use and work off of. Plus you'll learn a whole lot. That will, of course, force you to learn a programming language and not rely on Linux commands. :)

danielbmartin

09-17-2013 10:37 AM

Quote:

Originally Posted by ntubski (Post 5029226)

awk solution (equivalent to druuna's bash solution):

Code:

awk 'NR==FNR { w[NR] = $0; next } { print w[$0] }' words numbs

Superb!

For the sake of learning: Please explain why my attempt with sed ...

Code:

sed 's/$/p/' $Numbs >$Work

sed -nf $Work $Words >$OutFile

... generates this OutFile ...

Code:

two

two

three

six

eight

eight

eight

All the desired words are there, but not in the desired order.

Can my sed be changed to make it work, and still be loop-less?

Daniel B. Martin

firstfire

09-17-2013 11:07 AM

Hi, Daniel.

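Your sed solution does not work because the second command runs the whole script in $Work against each line of $Words, in the order the lines appear in $Words. For example, the script "3p;6p;2p;8p;8p;2p;8p;" is applied to input line number 2. The script contains '2p' twice, so sed prints the second line twice, and it does so before line 3 is even read. The output therefore follows the order of $Words, not the order of $Numbs.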
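The explanation above can be checked with a small sketch. The temp files and the variable names sed_out and awk_out are assumptions for the demo. The awk attempt from the first post is included as well; it was not discussed in the thread, but as far as I can tell it fails for a related reason: it walks words in file order and prints each matching line at most once.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '%s\n' one two three four five six seven eight nine > "$tmp/words"
printf '%s\n' 3 6 2 8 8 2 8 > "$tmp/numbs"

# Failed sed attempt: build the script 3p 6p 2p 8p 8p 2p 8p, one command per line.
sed 's/$/p/' "$tmp/numbs" > "$tmp/work"
# sed reads words top to bottom and runs the whole script on every line,
# so line 2 is printed twice before line 3 is even read.
sed_out=$(sed -nf "$tmp/work" "$tmp/words")

# Failed awk attempt: numbs only becomes a set of keys, so order and repeats are lost.
awk_out=$(awk 'FNR==NR{a[$0];next}FNR in a' "$tmp/numbs" "$tmp/words")

printf 'sed: %s\n' "$(printf '%s' "$sed_out" | tr '\n' ' ')"
printf 'awk: %s\n' "$(printf '%s' "$awk_out" | tr '\n' ' ')"
rm -rf "$tmp"
```

The sed line of the output lists two two three six eight eight eight, which matches the OutFile Daniel posted.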

EDIT: Wow, it's my 600-th post here :)

ntubski

09-17-2013 11:21 AM

Quote:

Can my sed be changed to make it work, and still be loop-less?

The basic idea of using sed to write a sed program can be used like so:

Code:

sed = words | sed '1~2s:^.*$:/&/i\\:' | sed -nf - numbs

Unfortunately, unless sed does some heroic optimization, I think this solution is O(nw) in time and O(w) in space.
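Here is a sketch of the same sed-writes-sed idea with one change of my own: the generated patterns are anchored as /^N$/ so that, for example, the pattern for line 1 cannot also match 10 or 21 once words has more than nine lines. It assumes GNU sed (the 1~2 address is a GNU extension) and that no line of words ends in a backslash. The temp files and the name gen_out are made up for the demo.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '%s\n' one two three four five six seven eight nine > "$tmp/words"
printf '%s\n' 3 6 2 8 8 2 8 > "$tmp/numbs"

# "sed =" prints each line number on its own line before the line itself.
# The second sed rewrites every number line N (lines 1, 3, 5, ...) into an
# insert command whose text is the word on the following line.
sed = "$tmp/words" | sed '1~2s:^.*$:/^&$/i\\:' > "$tmp/script.sed"

# The script starts with the address /^1$/ plus an i command, then the text "one".
head -n 2 "$tmp/script.sed"

# Run the generated script over numbs; -n hides the numbers themselves,
# while the inserted text is still written.
gen_out=$(sed -n -f "$tmp/script.sed" "$tmp/numbs")
printf '%s\n' "$gen_out"
rm -rf "$tmp"
```

Every line of numbs is still tested against every generated pattern, so the O(nw) time estimate is unchanged.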

firstfire

09-17-2013 11:30 AM

Hi again!

Maybe try another tool?:

Code:

$ ed words < numbs

45

three

six

two

eight

eight

two

eight

(45 goes to standard error)

This works because ed treats a bare number on stdin as a line address, and an address given without a command prints that line.

Read info ed for more info.
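If the byte count is unwanted, the -s option of ed suppresses it. A minimal sketch, assuming ed is installed (the check, the temp files and the name ed_out are my additions for the demo):

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '%s\n' one two three four five six seven eight nine > "$tmp/words"
printf '%s\n' 3 6 2 8 8 2 8 > "$tmp/numbs"

if command -v ed >/dev/null 2>&1; then
  # Each line of numbs is read as an ed command; a bare address prints that line.
  # -s keeps ed from reporting the size of the file it loaded.
  ed_out=$(ed -s "$tmp/words" < "$tmp/numbs")
  printf '%s\n' "$ed_out"
else
  echo "ed is not installed on this system" >&2
fi
rm -rf "$tmp"
```

ed stops when it reaches the end of numbs, and since the buffer was never modified nothing is written back to words.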

danielbmartin

09-17-2013 12:16 PM

Quote:

Originally Posted by firstfire (Post 5029293)

Wow, it's my 600-th post here

Congratulations on achieving this milestone and garnering 336 reputation points. That's impressive!

Quote:

Maybe try another tool?:

Code:

$ ed words < numbs

This wins a prize for being concise.

Daniel B. Martin

druuna

09-17-2013 12:59 PM

Quote:

Originally Posted by firstfire (Post 5029293)

Maybe try another tool?:

Code:

$ ed words < numbs

Nice solution!

firstfire

09-18-2013 08:57 AM

Hi.

Quote:

Originally Posted by danielbmartin (Post 5029322)

Congratulations on achieving this milestone and garnering 336 reputation points. That's impressive!

Hmm.. After your post my reputation suddenly jumped 12 points up. That's impressive! :) Thanks guys!

Quote:

Originally Posted by YankeePride13

Why? Seems like a pretty silly requirement.

Daniel, please keep asking your questions -- they are very good for limbering up my brain, and I enjoy solving them. And your English reminds me of Shakespeare :)

All times are GMT -5.