LinuxQuestions.org
10-27-2008, 11:33 PM   #1
skuz_ball
extracting particular lines from a text file


Hello everyone,

I am writing a shell script (c-shell) and want to extract particular lines from a text file. For example I want to extract the following lines

1
3
9
15
16
25
49
etc. etc.

There is no pattern to the line numbers and I have them stored in a single column in another file.

Is there a way for the line numbers to be read from the file that has them stored in a column and then extract those lines from another file?

Any help would be greatly appreciated.

Last edited by skuz_ball; 10-27-2008 at 11:35 PM.
 
10-28-2008, 12:10 AM   #2
chrism01
One way is to use awk with the NR variable: http://www.grymoire.com/Unix/Awk.html
 
10-28-2008, 12:32 AM   #3
skuz_ball
Original Poster
The NR variable is helpful if there is a pattern in the lines that I need to extract, for example every second or third line. The lines I need extracted will vary from case to case and there is no particular pattern.

I have the line numbers stored in one file, in a single column,
e.g.

1
3
4
7

And I have another file with text, with the line number printed in front of each line (I used the cat -n command to add the numbers),
e.g.

1: blah blah blah
2: blah blah blah
3: blah blah blah
4: blah blah blah
5: blah blah blah
6: blah blah blah
7: blah blah blah
8: blah blah blah
9: blah blah blah
10: blah blah blah

How can I extract just the lines that I need?
 
10-28-2008, 12:52 AM   #4
Sergei Steshenko
If you do not insist on the script being written in shell, it's a trivial task in Perl.

Let me know if you need sample code.
 
10-28-2008, 12:59 AM   #5
skuz_ball
Original Poster
Perl will do, I would love to see a sample
 
10-28-2008, 01:47 AM   #6
Sergei Steshenko
Quote:
Originally Posted by skuz_ball
Perl will do, I would love to see a sample
Well, something like this:

Code:
#!/usr/bin/perl -w

use strict;

my %line_numbers_of_interest = # keys are the line numbers to be extracted
  (
  25 => '',
  37 => '',
  43 => ''
  );

my $input_file = "input.txt";
my $output_file = "output.txt";

open(my $input_file_fh, '<', $input_file) or die "ERROR cannot open '$input_file' file for reading: $!";

open(my $output_file_fh, '>', $output_file) or die "ERROR cannot open '$output_file' file for writing: $!";

my $input_file_line_number = 1;

while(defined(my $line = <$input_file_fh>))
  {
  if(exists $line_numbers_of_interest{$input_file_line_number})
    {
    print $output_file_fh $line;
    }

  $input_file_line_number++;
  }

close($input_file_fh);
close($output_file_fh);

exit(0);
I haven't tried the code, but it's pretty straightforward. Let me know if you have problems with it.
 
10-28-2008, 02:11 AM   #7
abolishtheun
Code:
#!/usr/bin/perl -wnl
BEGIN{
  $nfile = shift or warn "usage: $0 numbers {files}\n" and die 255;
  open(DAT, $nfile) or warn "Could not open file $nfile!" and die 255;
  @nums = <DAT>;  chomp @nums;  close(DAT);
}
/^(\d+)\b/ and grep {$_ eq $1} @nums and print;
Code:
% cat numbers.txt 
1
3
4
7
% cat input.txt 
1: blah blah blah
2: blah blah blah
3: blah blah blah
4: blah blah blah
5: blah blah blah
6: blah blah blah
7: blah blah blah
8: blah blah blah
9: blah blah blah
10: blah blah blah
% perl filter.pl numbers.txt input.txt 
1: blah blah blah
3: blah blah blah
4: blah blah blah
7: blah blah blah
 
10-28-2008, 02:31 AM   #8
burschik
Slightly kludgy:

Code:
sed 's/$/ p;/' < lines_file | sed -nf - input_file
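
For anyone wondering what the first sed hands to the second: it appends " p;" to every line number, so the numbers file turns into a tiny sed script with one print command per wanted line. A rough Python sketch of that same text transformation, purely for illustration (the name build_sed_script is made up here, it is not part of sed):

```python
def build_sed_script(number_lines):
    """Mimic sed 's/$/ p;/': append ' p;' to the end of every input line."""
    return "".join(line.rstrip("\n") + " p;\n" for line in number_lines)


# Hypothetical contents of lines_file, one line number per line.
numbers = ["1\n", "3\n", "4\n", "7\n"]
print(build_sed_script(numbers), end="")
```

Fed to sed -n, each "N p;" command prints line N and nothing else, because -n switches off the default output.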
 
10-28-2008, 02:39 AM   #9
Sergei Steshenko
Quote:
Originally Posted by abolishtheun
Code:
#!/usr/bin/perl -wnl
BEGIN{
  $nfile = shift or warn "usage: $0 numbers {files}\n" and die 255;
  open(DAT, $nfile) or warn "Could not open file $nfile!" and die 255;
  @nums = <DAT>;  chomp @nums;  close(DAT);
}
/^(\d+)\b/ and grep {$_ eq $1} @nums and print;
Code:
% cat numbers.txt 
1
3
4
7
% cat input.txt 
1: blah blah blah
2: blah blah blah
3: blah blah blah
4: blah blah blah
5: blah blah blah
6: blah blah blah
7: blah blah blah
8: blah blah blah
9: blah blah blah
10: blah blah blah
% perl filter.pl numbers.txt input.txt 
1: blah blah blah
3: blah blah blah
4: blah blah blah
7: blah blah blah
Everyone is free to write hardly readable Perl code; "grep {$_ eq $1}" search is slow compared to hash key lookup, though in this particular case it won't be significant.

The whole
"
open(DAT, $nfile) or warn "Could not open file $nfile!" and die 255;
@nums = <DAT>;
"

part is not pretty: if we are in Perl, we had better import the data in Perl format directly.
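
To put the lookup-cost remark in concrete terms: a hash (a set in Python) answers "is this line number wanted?" in roughly constant time, while a list has to be scanned again for every input line. A minimal sketch in Python rather than Perl, assuming the numbers arrive one per line as in numbers.txt; load_wanted and extract_lines are illustrative names, not anything from the scripts above:

```python
def load_wanted(number_lines):
    """Collect the wanted line numbers into a set, skipping blank lines."""
    return {int(text) for text in number_lines if text.strip()}


def extract_lines(input_lines, wanted):
    """Yield every line whose 1-based position is in `wanted`."""
    for line_number, line in enumerate(input_lines, start=1):
        if line_number in wanted:  # hash lookup, no scan over all the numbers
            yield line


# In-memory stand-ins for numbers.txt and the input file.
wanted = load_wanted(["1\n", "3\n", "4\n", "7\n"])
sample = ["line %d\n" % n for n in range(1, 11)]
for line in extract_lines(sample, wanted):
    print(line, end="")
```

With real files, open file objects can be passed in place of the lists, since both functions only iterate over lines.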
 
10-28-2008, 02:45 AM   #10
abolishtheun
Quote:
Originally Posted by Sergei Steshenko
Everyone is free to write hardly readable Perl code
TMTOWTDI. (What's not readable about it?) Go rewrite it in machine code if it keeps you up at night.

Last edited by abolishtheun; 10-28-2008 at 02:53 AM.
 
10-28-2008, 03:07 AM   #11
Sergei Steshenko
Quote:
Originally Posted by abolishtheun
TMTOWTDI. (What's not readable about it?) Go rewrite it in machine code if it keeps you up at night.
For example, DAT is a senseless name; $nfile is not very clear either.
 
10-28-2008, 06:03 AM   #12
ghostdog74
Please don't argue. If you want to talk about readability, don't even use Perl. There's Python, but that aside,

Code:
# awk -F":" 'FNR==NR{_[$1]=$0;next}{print _[$0]}' file_with_blah file_with_number
1: blah blah blah
3: blah blah blah
4: blah blah blah
7: blah blah blah
@OP, there is usually also a way to do this with grep and its -f option, which I will leave to you to explore.
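
In case the awk one-liner looks cryptic: FNR==NR is only true while the first file is being read, so every data line gets stored under the text in front of the colon; for the second file, each number is then looked up in that table. Roughly the same two-pass logic as a Python sketch, where index_by_prefix and lookup are hypothetical names chosen for this example:

```python
def index_by_prefix(data_lines, separator=":"):
    """First pass: store each line under the text before the separator."""
    table = {}
    for line in data_lines:
        key = line.split(separator, 1)[0].strip()
        table[key] = line.rstrip("\n")
    return table


def lookup(number_lines, table):
    """Second pass: return the stored line for every requested number.

    Unknown numbers are skipped here; the awk one-liner prints an empty
    line for them instead.
    """
    keys = (text.strip() for text in number_lines)
    return [table[key] for key in keys if key in table]


# In-memory stand-ins for file_with_blah and file_with_number.
data = ["%d: blah blah blah\n" % n for n in range(1, 11)]
for line in lookup(["1\n", "3\n", "4\n", "7\n"], index_by_prefix(data)):
    print(line)
```

Unlike a streaming filter, this prints the lines in the order the numbers are listed, not in the order they occur in the data file.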
 
10-28-2008, 06:48 AM   #13
Jacky Quah
If you want to use a pure bash script:
Code:
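#!/bin/bash
#
# script
# first argument: content file name
# second argument: name of the file listing the line numbers
# note: word splitting drops empty lines, so the content file must not contain any

TIFS=$IFS;IFS=$'\n';
LINENUMBER=(`cat "${2}"`);
CONTENT=(`cat "${1}"`);
IFS=$TIFS;
for ((a=0;a<${#LINENUMBER[@]};a++));do
  # bash arrays are 0-based, line numbers are 1-based
  echo "${CONTENT[${LINENUMBER[$a]}-1]}";
done;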
Edit: oops, not pure bash after all, since it uses cat too... never mind.
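
The script above loads the whole content file into an array and indexes it directly. The thing to watch is that array positions start at 0 while line numbers start at 1. A small Python sketch of the same idea, with a made-up name (pick_lines) and with out-of-range numbers simply ignored, which is an assumption of this sketch rather than anything the bash version does:

```python
def pick_lines(content_lines, numbers):
    """Return the lines at the given 1-based positions, in the order requested."""
    picked = []
    for number in numbers:
        index = number - 1  # list positions are 0-based, line numbers 1-based
        if 0 <= index < len(content_lines):
            picked.append(content_lines[index])
    return picked


# Hypothetical in-memory content instead of a real file.
content = ["alpha", "beta", "gamma", "delta"]
print(pick_lines(content, [1, 3, 4]))
```

This trades memory for random access: the whole file sits in memory, but the numbers may come in any order and may repeat.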

Last edited by Jacky Quah; 10-28-2008 at 10:54 AM.
 
10-28-2008, 10:38 AM   #14
Sergei Steshenko
Quote:
Originally Posted by ghostdog74
Please don't argue. If you want to talk about readability, don't even use Perl. There's Python, but that aside,

Code:
# awk -F":" 'FNR==NR{_[$1]=$0;next}{print _[$0]}' file_with_blah file_with_number
1: blah blah blah
3: blah blah blah
4: blah blah blah
7: blah blah blah
@OP, there is usually also a way to do this with grep and its -f option, which I will leave to you to explore.
I gave an example of readable and self-documenting Perl code; Python as a language is not more readable than Perl; it's people who make code unreadable, not languages.
 
10-28-2008, 11:59 AM   #15
ghostdog74
Quote:
Originally Posted by Sergei Steshenko
Python as a language is not more readable than Perl;
Code:
linenumbers = open("file_num").readlines()
linenumbers = [i.strip() for i in linenumbers]
for line in open("file_data"):
    # split on the first colon only, so text containing ":" does not break this
    num = line.split(":", 1)[0].strip()
    if num in linenumbers:
        print(line.strip())
You judge for yourself.

Quote:
it's people who make code unreadable, not languages.
That's quite true (although I don't quite like the way you place your braces). However, with the help of language syntax and features, it CAN be even clearer.
 
  

