LinuxQuestions.org
11-04-2008, 12:45 AM   #1
cgcamal
Extract lines containing some strings without affecting sequential order


Hi to all,

My first question is about awk.

Maybe someone can help me with this.


I have a large text file that contains 2 columns of info in blocks. The first
column holds 122 different strings, repeated throughout the file.

I'd like to extract all lines that contain any of 23 of those 122 strings,
either by deleting all lines containing the 99 strings that I don't need or by
extracting all lines containing the other 23 strings.

Whichever method is used must preserve the original sequential order
of the lines.

I was trying with


Code:
awk '/word1//word2/' source.txt > filtered.txt
But this method doesn't let me do the "OR" from word1 to word23.

How can I do this filtering without affecting the order of the
output lines?

Thanks in advance for any suggestion.

Best regards.

 
11-04-2008, 01:29 AM   #2
burschik
awk does not reorder its input unless you tell it to. It reads the file(s) and applies all rules to each line of input in sequence. Thus, the order always stays the same. This means you can just write:

Code:
awk '/pattern1/||/pattern2/ { print; }'
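With many words, the same filter can also be written as one regular expression with alternation. This is only a minimal sketch: it assumes the words contain no regex metacharacters, and word1, word2, word3 and the sample file are placeholders, not data from the thread.

```shell
# Hypothetical two-column input; the first column is the string of interest.
printf '%s\n' 'word1 10' 'other 20' 'word3 30' 'word2 40' > sample.txt

# A single regex with "|" alternation. awk reads the file top to bottom
# and prints matching lines in their original order.
awk '/word1|word2|word3/' sample.txt > filtered.txt
cat filtered.txt
```

Lines are printed in the order they appear in the input, not in the order the words are listed in the pattern.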
 
11-04-2008, 09:05 AM   #3
cgcamal (Original Poster)
Hi burschik,

Thanks for the answer, but in addition to that,

How can I extract all lines that contain any of 23 strings in the file with awk?

The analogy from your example for me would be:


awk '/word1//word2//.....//word22//word23/' source.txt > filtered.txt


But I think this command is too long and would not be accepted by the command line.


Thanks for any suggestion.
 
11-05-2008, 03:28 PM   #4
jan61
Moin,

You want to output every line containing at least one of the 23 words you look for?

Code:
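VAR="word1
word2
...
word23"
# Pass the shell variable VAR to awk as the awk variable "var".
# split() with the default separator breaks on any whitespace, newlines included.
awk -v var="$VAR" ' BEGIN { split(var, my_arr); }
{ for (a in my_arr) {                  # a is the array index, not the word
    if (index($0, my_arr[a]) > 0) {    # plain substring test, no regex
      print $0;
      break;                           # print each matching line only once
    }
  }
} ' file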
Jan
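If the 23 strings have to match the first column exactly, rather than appear anywhere in the line, a lookup array is one alternative. This is a sketch under assumptions: columns are whitespace-separated, and words.txt, source.txt and their contents are hypothetical stand-ins.

```shell
# Hypothetical word list, one wanted string per line.
printf '%s\n' 'word1' 'word2' > words.txt
# Hypothetical two-column input.
printf '%s\n' 'word1 a' 'word9 b' 'word2 c' 'word1 d' 'word12 e' > source.txt

# While reading the first file (NR == FNR), remember each wanted word.
# While reading the second file, print a line only if its first column
# is one of the remembered words. Input order is preserved.
awk 'NR == FNR { keep[$1]; next } $1 in keep' words.txt source.txt > filtered.txt
cat filtered.txt
```

Because the comparison is on the whole first field, "word12" is not selected just because it starts with "word1", which a substring test would do.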
 
11-06-2008, 01:47 AM   #5
burschik
Quote:
Originally Posted by cgcamal
Hi burschik,

Thanks for the answer, but in addition to that,

How can I extract all lines that contain any of 23 strings in the file with awk?

The analogy from your example for me would be:


awk '/word1//word2//.....//word22//word23/' source.txt > filtered.txt


But I think this command is too long and would not be accepted by the command line.


Thanks for any suggestion.

I'd be very surprised if 23 words were to exceed the maximum length of the command line. And if I am not mistaken, the fixed limit was abolished in kernel 2.6.23.
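The limit that applies on a given system can be checked rather than guessed. A small sketch, assuming getconf is available; the 23 placeholder patterns /word1/ to /word23/ stand in for the real strings.

```shell
# Upper bound, in bytes, on the combined size of arguments and environment.
limit=$(getconf ARG_MAX)

# Build a 23-way pattern of placeholder words joined by "||" and measure it.
pattern=$(awk 'BEGIN { for (i = 1; i <= 23; i++) printf "%s/word%d/", (i > 1 ? "||" : ""), i }')

echo "ARG_MAX is $limit bytes, the pattern is ${#pattern} bytes"
```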
 
11-06-2008, 11:08 AM   #6
cgcamal (Original Poster)
Hi guys, thanks for your answers,

burschik,

Thanks, I've put all the words in a text file (script.txt) using the syntax you gave me,

Code:
awk '/pattern1/||/pattern2/'
then I ran the script and it worked nicely.
Code:
awk -f script.txt source_file.txt > result_file.txt
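For reference, a sketch of what such a program file would hold: only the awk program itself, without the surrounding awk '...' command. The patterns and the sample input are placeholders, not the actual 23 words.

```shell
# The program file contains just the pattern expression. A line may end
# with "||" and continue on the next one, which keeps long lists readable.
cat > script.txt <<'EOF'
/pattern1/ || /pattern2/ ||
/pattern3/
EOF

# Hypothetical input file.
printf '%s\n' 'pattern3 x' 'nothing y' 'pattern1 z' > source_file.txt

awk -f script.txt source_file.txt > result_file.txt
cat result_file.txt
```

Since the program is read from a file, its length is not subject to the command-line limit at all.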

jan61,

How can I run the code you've written? Which command do I have to use?
I imagine the syntax is similar to how I ran the script with awk.

Thanks in advance.
 
11-06-2008, 01:31 PM   #7
jan61
Moin,

Quote:
Originally Posted by cgcamal
...jan61,

How can I run the code you've written? Which command do I have to use?
I imagine the syntax is similar to how I ran the script with awk.

It's a simple shell script, but you can also type it directly at the shell prompt.

Jan
 
11-06-2008, 11:57 PM   #8
cgcamal (Original Poster)
Ok jan61, I will try that.

Many thanks to all.
 
  


All times are GMT -5.
