LinuxQuestions.org
Old 03-11-2012, 02:23 PM   #1
paradeboy
LQ Newbie
 
Registered: Mar 2012
Posts: 4

Rep: Reputation: Disabled
Extracting certain lines from a text and outputting to new text files?


Dear forum,

I know about the awk command but I am having a hard time putting it to use on a text file I have.

This text file has the following format:

Quote:
Name Test Score
Jennifer 1 60
Jennifer 2 79
Jennifer 3 30
Jennifer 4 50
Jennifer 5 70
Bob 1 30
Bob 2 60
Bob 3 20
Bob 4 90
Bob 5 80
Joe 1 80
Joe 2 60
Joe 3 60
Joe 4 70
Joe 5 70
...
I would like to make a new text file for each of the tests (5 total), such that the new text files look like the following:

Text file 1 for Test 1 scores
Quote:
Name Test Score
Jennifer 1 60
Bob 1 30
Joe 1 80
...
Text file 2 for Test 2 scores
Quote:
Name Test Score
Jennifer 2 79
Bob 2 60
Joe 2 60
...
etc.

Is there a way to do this using just awk, or would something else be needed? Can awk do sequence extraction? I'm not sure that's the right wording; what I mean is that every nth line is output (lines 1, 6, 11, 16... to text file 1; lines 2, 7, 12, 17... to text file 2, etc.) rather than searching for the value 2 in the second column and outputting the matching lines to text file 2. I hope this makes some sense.

Thanks for your help! =)
 
Old 03-11-2012, 03:42 PM   #2
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
Hi and welcome to LinuxQuestions!

Checking the value in the second field would be the most straightforward method, but anyway, here we go:
Code:
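BEGIN {

  # Read the header line into $0 (this sets NR to 1)
  getline

  # Write the header to each of the five output files
  for ( i = 1; i <= 5; i++ )
    print > (i ".txt")

}

NR > 1 {

  # Records 2, 7, 12, ... go to 1.txt; records 3, 8, 13, ... go to 2.txt; etc.
  file = (NR - 2) % 5 + 1 ".txt"

  print > file

}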
The BEGIN section "initializes" the files by printing the header to each of them. The condition NR > 1 makes sure the header itself is not processed again; after that, a simple calculation gives the file name from the current record number. You can easily adapt this code for any number of tests, provided the input file keeps the same layout (every name followed by the same number of tests, in the same order). Hope this helps.
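As a rough sketch of how the same record-number approach could be adapted to a different number of tests, the count can be passed in as an awk variable. The file name scores.txt, the sample rows and the variable name n are made up for illustration; the output files are written to the current directory.

```shell
#!/bin/sh
# Sketch only: scores.txt, its rows and the variable n are illustrative assumptions.
cat > scores.txt <<'EOF'
Name Test Score
Jennifer 1 60
Jennifer 2 79
Jennifer 3 30
Bob 1 30
Bob 2 60
Bob 3 20
EOF

awk -v n=3 '
NR == 1 {                       # header: copy it to every output file
    for (i = 1; i <= n; i++)
        print > (i ".txt")
    next
}
{                               # data: record 2 -> 1.txt, record 3 -> 2.txt, ...
    print > ((NR - 2) % n + 1 ".txt")
}' scores.txt

cat 2.txt
```

With this sample input, 2.txt should hold the header followed by the "Jennifer 2 79" and "Bob 2 60" lines. The parentheses around the file name expression avoid the ambiguity some awk implementations have with an unparenthesized concatenation after ">".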
 
Old 03-11-2012, 04:26 PM   #3
paradeboy
LQ Newbie
 
Registered: Mar 2012
Posts: 4

Original Poster
Rep: Reputation: Disabled
Thank you very much for your help colucix!
 
Old 03-13-2012, 07:30 PM   #4
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Here's another take on solving this "data distribution" problem.

This code assumes your input file has a name of the form
"/home/daniel/Desktop/LQfiles/dbm266inp.txt"
Here the script file is dbm266.bin and the input file is dbm266inp.txt.
Use your own path and program name, but keep the "inp.txt" suffix: the script replaces it with "out" to build the output file names.

Most of the code (below) is setup and comments.
The real work is all on one line, the awk.

Code:
#   Daniel B. Martin   Mar12
#
#   To execute this program, launch a terminal session and enter:
#   bash /home/daniel/Desktop/LQfiles/dbm266.bin
#
#   This program was inspired by:
#   http://www.linuxquestions.org/questions/linux-newbie-8/
#    extracting-certain-lines-from-a-text-and-outputting-to
#    -new-text-files-933921/


# Input file identification  
InFile='/home/daniel/Desktop/LQfiles/dbm266inp.txt'
echo
echo "The input file is:"
echo "$InFile"

# Output file identification 
# PF = Prefix
PF=$(echo "$InFile" | sed -e 's/inp\.txt$//')'out'
echo; echo "Output will be written to files with names of the form:"
echo "${PF}x.txt where x is any alphanumeric."


# This awk deals out the input records to one or more output files
# according to the value in field 2. The header line (field 2 = "Test")
# ends up in its own file, the one ending in outTest.txt.
awk -v pf="$PF" '{print > (pf $2 ".txt")}' "$InFile"

echo; echo "Normal end of job."; echo
exit
Daniel B. Martin
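For what it's worth, the output prefix can probably also be built without sed, using the shell's own suffix-removal expansion. This is only a sketch of that alternative, not the method used in the script above; it reuses the path from the post and touches no files.

```shell
#!/bin/sh
# Sketch: the path is the one from the post; nothing is read or written here.
InFile='/home/daniel/Desktop/LQfiles/dbm266inp.txt'

# ${var%pattern} removes the shortest match of pattern from the end of $var.
PF="${InFile%inp.txt}out"

echo "$PF"          # /home/daniel/Desktop/LQfiles/dbm266out
echo "${PF}1.txt"   # /home/daniel/Desktop/LQfiles/dbm266out1.txt
```

If the input name does not end in "inp.txt", the expansion leaves it unchanged and "out" is simply appended.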
 
Old 03-14-2012, 12:02 AM   #5
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
Here is the same idea:
Code:
awk 'NR > 1 {print > ($2 ".txt")}' file
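The one-liner skips the header, so the output files contain data lines only. If the header is wanted at the top of each file, as in the original question, something along these lines might do. It is a sketch under assumptions: scores.txt and its rows are made up, and the names header, out and seen are just illustrative.

```shell
#!/bin/sh
# Sketch only: scores.txt and its rows are illustrative assumptions.
cat > scores.txt <<'EOF'
Name Test Score
Jennifer 1 60
Jennifer 2 79
Bob 1 30
Bob 2 60
EOF

awk '
NR == 1 { header = $0; next }        # remember the header, do not file it
{
    out = $2 ".txt"
    if (!(out in seen)) {            # first record for this test number
        print header > out
        seen[out] = 1
    }
    print > out
}' scores.txt

cat 1.txt
```

Because the file name comes from field 2 rather than from the record number, this variant does not depend on every name having the same number of tests in the same order.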
 
  

