Old 03-11-2012, 03:23 PM   #1
paradeboy
LQ Newbie
 
Registered: Mar 2012
Posts: 4

Rep: Reputation: Disabled
Extracting certain lines from a text and outputting to new text files?


Dear forum,

I know about the awk command but I am having a hard time putting it to use on a text file I have.

This text file has the following format:

Quote:
Name Test Score
Jennifer 1 60
Jennifer 2 79
Jennifer 3 30
Jennifer 4 50
Jennifer 5 70
Bob 1 30
Bob 2 60
Bob 3 20
Bob 4 90
Bob 5 80
Joe 1 80
Joe 2 60
Joe 3 60
Joe 4 70
Joe 5 70
...
I would like to make new text files for each of the test scores (5 total), such that the new text files look like the following:

Text file 1 for Test 1 scores
Quote:
Name Test Score
Jennifer 1 60
Bob 1 30
Joe 1 80
...
Text file 2 for Test 2 scores
Quote:
Name Test Score
Jennifer 2 79
Bob 2 60
Joe 2 60
...
etc.

Is there a way to do this using just awk, or would something else be needed? Can awk extract lines by position? I'm not sure that's the right wording; what I mean is that every nth line is output (lines 1, 6, 11, 16... to text file 1; lines 2, 7, 12, 17... to text file 2, etc.) rather than searching for the value 2 in the second column and writing the matching lines to text file 2. I hope this makes some sense.

Thanks for your help! =)
 
Old 03-11-2012, 04:42 PM   #2
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1976
Hi and welcome to LinuxQuestions!

Checking the value in the second field would be the most straightforward method, but here is a way to do it by line position:
Code:
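BEGIN {

  # Read the header line into $0 (this sets NR to 1)
  getline

  # Write the header to each of the five output files
  for ( i = 1; i <= 5; i++ )
    print > (i ".txt")

}

NR > 1 {

  # Deal the data lines out in turn: line 2 goes to 1.txt, line 3 to 2.txt,
  # ..., line 6 to 5.txt, line 7 back to 1.txt, and so on
  file = ((NR - 2) % 5 + 1) ".txt"

  print > file

}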
The BEGIN section initializes the files by printing the header to each of them. The NR > 1 condition makes sure the header itself is not treated as data (getline has already consumed it), so a simple calculation on the current record number gives the output file name. You can easily adapt this code for any number of tests, provided the input file has the same format. Hope this helps.
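The adaptation to a different number of tests mentioned above might look like the following. This is only a sketch: the sample file scores.txt, the scratch directory and the awk variable n (the number of tests per person) are made up for illustration and are not from the original post. It also handles the header with an NR == 1 rule instead of getline in BEGIN, which should give the same result.

```shell
# Sketch only: work in a scratch directory with a small sample file.
# scores.txt and n are illustrative names, not part of the original post.
workdir=$(mktemp -d) && cd "$workdir"

cat > scores.txt <<'EOF'
Name Test Score
Jennifer 1 60
Jennifer 2 79
Jennifer 3 30
Bob 1 30
Bob 2 60
Bob 3 20
EOF

awk -v n=3 '
NR == 1 {                    # header: copy it into every output file
  for (i = 1; i <= n; i++)
    print > (i ".txt")
  next
}
{                            # data lines are dealt out in turn to 1.txt ... n.txt
  print > ((NR - 2) % n + 1 ".txt")
}' scores.txt
```

With the sample above, 1.txt should end up holding the header plus the two "Test 1" lines. As in the original script, this relies on every person having exactly n lines in the same order.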
 
Old 03-11-2012, 05:26 PM   #3
paradeboy
LQ Newbie
 
Registered: Mar 2012
Posts: 4

Original Poster
Rep: Reputation: Disabled
Thank you very much for your help colucix!
 
Old 03-13-2012, 08:30 PM   #4
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,482

Rep: Reputation: 411
Here's another take on solving this "data distribution" problem.

This code assumes your input file has a name of the form
"/home/daniel/Desktop/LQfiles/dbm266inp.txt"
The script file is dbm266.bin and the input file is dbm266inp.txt.
Use your own path name and program name but the "inp.txt" is important.

Most of the code (below) is setup and comments.
The real work is all on one line, the awk.

Code:
#   Daniel B. Martin   Mar12
#
#   To execute this program, launch a terminal session and enter:
#   bash /home/daniel/Desktop/LQfiles/dbm266.bin
#
#   This program was inspired by:
#   http://www.linuxquestions.org/questions/linux-newbie-8/
#    extracting-certain-lines-from-a-text-and-outputting-to
#    -new-text-files-933921/


# Input file identification
InFile='/home/daniel/Desktop/LQfiles/dbm266inp.txt'
echo
echo "The input file is:"
echo "$InFile"

# Output file identification
# PF = Prefix (the input path with the trailing "inp.txt" replaced by "out")
PF=$(echo "$InFile" | sed -e 's/inp\.txt$//')'out'
echo; echo "Output will be written to files with names of the form:"
echo "${PF}x.txt where x is any alphanumeric."


# This awk deals out the input records to one or more output files
# according to the value of field 2. Note that the header line
# (field 2 = "Test") therefore lands in a file of its own.
awk -v pf="$PF" '{print > (pf $2 ".txt")}' "$InFile"

echo; echo "Normal end of job."; echo
exit
Daniel B. Martin
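
The way the output prefix is derived from an input name ending in "inp.txt" can also be written with the shell's own suffix removal instead of sed. This is a minimal sketch, not the author's method, and the path in it is a made-up example rather than a real file.

```shell
# Sketch only: the path below is a made-up example, not a real file.
InFile='/tmp/demo/dbm266inp.txt'

# ${var%pattern} removes the shortest match of pattern from the end of $var,
# so the trailing "inp.txt" is dropped before "out" is appended.
PF="${InFile%inp.txt}out"

echo "$PF"          # prefix shared by all output files
echo "${PF}1.txt"   # file that would receive the lines whose field 2 is 1
```

If the input name does not end in "inp.txt", nothing is removed and "out" is simply appended to the full name, which is why the naming convention matters.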
 
Old 03-14-2012, 01:02 AM   #5
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,243

Rep: Reputation: 2684
Here is the same idea as a one-liner (note that it skips the header line instead of copying it):
Code:
awk 'NR > 1 {print > ($2 ".txt")}' file
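
The one-liner above routes lines by the value in field 2 but leaves the header out of the output files, while the original question wanted the header at the top of each one. One possible way to combine the two is sketched below; the sample file scores.txt, the scratch directory and the names header, out and seen are assumptions made for illustration, not something posted in the thread.

```shell
# Sketch only: work in a scratch directory with a small sample file.
# scores.txt, header, out and seen are illustrative names.
workdir=$(mktemp -d) && cd "$workdir"

cat > scores.txt <<'EOF'
Name Test Score
Jennifer 1 60
Bob 2 60
Joe 1 80
Jennifer 2 79
EOF

awk '
NR == 1 { header = $0; next }   # remember the header, do not route it
{
  out = $2 ".txt"               # file name taken from the test number
  if (!(out in seen)) {         # first line for this test: write the header
    print header > out
    seen[out] = 1
  }
  print > out
}' scores.txt
```

Because the file is chosen from field 2 rather than from the line position, this version does not depend on the lines arriving in a fixed order.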
 
  

