LinuxQuestions.org

skuz_ball 10-27-2008 11:33 PM

extracting particular lines from a text file
 
Hello everyone,

I am writing a shell script (c-shell) and want to extract particular lines from a text file. For example I want to extract the following lines

1
3
9
15
16
25
49
etc. etc.

There is no pattern to the line numbers and I have them stored in a single column in another file.

Is there a way for the line numbers to be read from the file that has them stored in a column and then extract those lines from another file?

Any help would be greatly appreciated.

chrism01 10-28-2008 12:10 AM

One way is to use awk with the NR variable: http://www.grymoire.com/Unix/Awk.html
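
NR is awk's running count of input lines, so it can be tested against any set of wanted numbers, not only a regular pattern. A minimal Python sketch of the same idea follows. The helper name extract_lines and the in-memory sample data are assumptions made for illustration and do not come from the thread.

```python
def extract_lines(lines, wanted):
    """Return the lines whose 1-based position is listed in `wanted`."""
    wanted = set(wanted)  # order and duplicates in the list do not matter
    return [line for number, line in enumerate(lines, start=1) if number in wanted]


# In-memory stand-ins for the two files described in the question.
numbers = [1, 3, 4, 7]
text = ["line %d" % n for n in range(1, 11)]

for line in extract_lines(text, numbers):
    print(line)
```

Here enumerate(..., start=1) plays the role of NR: it numbers the lines as they are read, and the set holds the arbitrary line numbers to keep.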

skuz_ball 10-28-2008 12:32 AM

The NR variable is helpful if there is a pattern in the lines that I need to extract, for example every second or third line. The lines I need extracted will vary from case to case and there is no particular pattern.

I have the line numbers stored in a single column in one file,
e.g.

1
3
4
7

And I have another file with text, with the line number printed in front of each line (I used the cat -n command to add the line numbers),
e.g.

1: blah blah blah
2: blah blah blah
3: blah blah blah
4: blah blah blah
5: blah blah blah
6: blah blah blah
7: blah blah blah
8: blah blah blah
9: blah blah blah
10: blah blah blah

How can I extract just the lines that I need?

Sergei Steshenko 10-28-2008 12:52 AM

If you do not insist on the script being written in shell, it's a trivial task in Perl.

Let me know if you need sample code.

skuz_ball 10-28-2008 12:59 AM

Perl will do, I would love to see a sample

Sergei Steshenko 10-28-2008 01:47 AM

Quote:

Originally Posted by skuz_ball (Post 3323730)
Perl will do, I would love to see a sample

Well, something like this:

Code:

#!/usr/bin/perl -w

use strict;

my %line_numbers_of_interest = # keys are the lines numbers to be extracted
  (
  25 => '',
  37 => '',
  43 => ''
  );

my $input_file = "input.txt";
my $output_file = "output.txt";

# $! holds the operating system's reason for the failure
open(my $input_file_fh, '<', $input_file) or die "ERROR cannot open '$input_file' file for reading: $!";

open(my $output_file_fh, '>', $output_file) or die "ERROR cannot open '$output_file' file for writing: $!";

my $input_file_line_number = 1;

while(defined(my $line = <$input_file_fh>))
  {
  if(exists $line_numbers_of_interest{$input_file_line_number})
    {
    print $output_file_fh $line;
    }

  $input_file_line_number++;
  }

# close the file handles, not the file name strings
close($input_file_fh);
close($output_file_fh);

exit(0);

I haven't tried the code, but it's pretty straightforward. Let me know if you have problems with it.

abolishtheun 10-28-2008 02:11 AM

Code:

#!/usr/bin/perl -wnl
BEGIN{
  $nfile = shift or warn "usage: $0 numbers {files}\n" and die 255;
  open(DAT, $nfile) or warn "Could not open file $nfile!" and die 255;
  @nums = <DAT>;  chomp @nums;  close(DAT);
}
/^(\d+)\b/ and grep {$_ eq $1} @nums and print;

Code:

% cat numbers.txt
1
3
4
7
% cat input.txt
1: blah blah blah
2: blah blah blah
3: blah blah blah
4: blah blah blah
5: blah blah blah
6: blah blah blah
7: blah blah blah
8: blah blah blah
9: blah blah blah
10: blah blah blah
% perl filter.pl numbers.txt input.txt
1: blah blah blah
3: blah blah blah
4: blah blah blah
7: blah blah blah


burschik 10-28-2008 02:31 AM

Slightly kludgy:

Code:

sed 's/$/ p;/' < lines_file | sed -nf - input_file
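
The first sed appends " p;" to every line number, which turns the number list into a small sed script. The second sed reads that script from standard input and, because of -n, prints only the addressed lines. A Python sketch of the generated script text follows. The function name build_sed_script is hypothetical and only illustrates the transformation.

```python
def build_sed_script(numbers):
    """Mimic sed 's/$/ p;/': append ' p;' to every line number."""
    return "\n".join("%s p;" % n for n in numbers)


# Line numbers as they would appear in lines_file, one per line.
print(build_sed_script(["1", "3", "4", "7"]))
```

Each generated command, such as "3 p;", addresses a line by its position in input_file, so this approach does not depend on the numbers printed in front of the text.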

Sergei Steshenko 10-28-2008 02:39 AM

Quote:

Originally Posted by abolishtheun (Post 3323772)
Code:

#!/usr/bin/perl -wnl
BEGIN{
  $nfile = shift or warn "usage: $0 numbers {files}\n" and die 255;
  open(DAT, $nfile) or warn "Could not open file $nfile!" and die 255;
  @nums = <DAT>;  chomp @nums;  close(DAT);
}
/^(\d+)\b/ and grep {$_ eq $1} @nums and print;

Everyone is free to write hardly readable Perl code; "grep {$_ eq $1}" search is slow compared to hash key lookup, though in this particular case it won't be significant.

The whole
"
open(DAT, $nfile) or warn "Could not open file $nfile!" and die 255;
@nums = <DAT>;
"

part is not pretty; if we are in Perl, we had better import the data in Perl format directly.
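
The point about grep versus hash key lookup can be sketched in Python, where a list is scanned element by element and a set is hashed. The sizes and repeat count below are arbitrary assumptions chosen for illustration. The timings depend on the machine, so no expected figures are given.

```python
import timeit

# 10000 line numbers stored as strings, once as a list and once as a set.
numbers = [str(n) for n in range(10000)]
as_list = numbers
as_set = set(numbers)


def found_in(container, key="9999"):
    """Membership test: a linear scan for a list, a hash lookup for a set."""
    return key in container


list_time = timeit.timeit(lambda: found_in(as_list), number=200)
set_time = timeit.timeit(lambda: found_in(as_set), number=200)
print("list: %.4fs  set: %.4fs" % (list_time, set_time))
```

With only a handful of line numbers the difference is negligible, as the post says. It only starts to matter when the number list grows large.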

abolishtheun 10-28-2008 02:45 AM

Quote:

Originally Posted by Sergei Steshenko (Post 3323786)
Everyone is free to write hardly readable Perl code

tmtowtdi. (whats not readable about it?) go rewrite it in machine code if it keeps you up at night.

Sergei Steshenko 10-28-2008 03:07 AM

Quote:

Originally Posted by abolishtheun (Post 3323788)
tmtowtdi. (whats not readable about it?) go rewrite it in machine code if it keeps you up at night.

For example, DAT is a senseless name; $nfile is not very clear either.

ghostdog74 10-28-2008 06:03 AM

please don't argue. If you want to talk about readability, don't even use Perl. There's Python, but that one aside,

Code:

# awk -F":" 'FNR==NR{_[$1]=$0;next}{print _[$0]}' file_with_blah file_with_number
1: blah blah blah
3: blah blah blah
4: blah blah blah
7: blah blah blah

@OP, there is usually also a way to do this with grep and its -f option, which I will leave to you to explore.
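
The post does not spell out the grep -f approach, so the following is only a guess at what is meant. grep -f reads one pattern per line from a file, and a bare "1" would also match "10:" or "21:", so the numbers would first need anchoring, for example as "^1:". A Python sketch of that idea, with hypothetical helper names and in-memory sample data:

```python
import re


def anchored_patterns(numbers):
    """Turn '1' into '^1:' so it cannot match '10:' or '21:'."""
    return ["^%s:" % re.escape(n) for n in numbers]


def grep_lines(lines, patterns):
    """Keep the lines matched by at least one pattern, as grep -f does."""
    compiled = [re.compile(p) for p in patterns]
    return [line for line in lines if any(c.search(line) for c in compiled)]


numbers = ["1", "3", "4", "7"]
text = ["%d: blah blah blah" % n for n in range(1, 11)]

for line in grep_lines(text, anchored_patterns(numbers)):
    print(line)
```

This relies on the "number:" prefix shown in the sample input. A file numbered differently would need a different anchor.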

Jacky Quah 10-28-2008 06:48 AM

If you want to use a pure bash script:
Code:

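#!/bin/bash
#
# script
# first argument: content file name
# second argument: file name of the list of line numbers

TIFS=$IFS; IFS=$'\n'     # split the file contents on newlines only
set -f                   # do not glob-expand characters such as * in the files
LINENUMBER=(`cat "${2}"`)
CONTENT=(`cat "${1}"`)   # caveat: blank lines are dropped, which shifts later indices
set +f
IFS=$TIFS
for ((a=0; a<${#LINENUMBER[@]}; a++)); do
  # bash arrays are 0-based but line numbers are 1-based, hence the -1
  printf '%s\n' "${CONTENT[${LINENUMBER[$a]}-1]}"
done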

Edit: oops, it is not pure bash after all, since it uses cat too... never mind.

Sergei Steshenko 10-28-2008 10:38 AM

Quote:

Originally Posted by ghostdog74 (Post 3323884)
please don't argue. If you want to talk about readability, don't even use Perl. There's Python, but that one aside,

I gave an example of readable and self-documented Perl code; Python as a language is not more readable than Perl; it's people who make code unreadable, not languages.

ghostdog74 10-28-2008 11:59 AM

Quote:

Originally Posted by Sergei Steshenko (Post 3324077)
Python as a language is not more readable than Perl;

Code:

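# Collect the wanted line numbers in a set for fast membership tests.
linenumbers = set(i.strip() for i in open("file_num"))
for line in open("file_data"):
    # Split on the first ":" only, so colons in the content are harmless.
    num = line.split(":", 1)[0].strip()
    if num in linenumbers:
        print(line.strip())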

you judge for yourself.

Quote:

it's people who make code unreadable, not languages.

That's quite true (although I don't quite like the way you put your braces). However, with the help of language syntax and features, it CAN be even clearer.


All times are GMT -5.