LinuxQuestions.org
Old 10-03-2007, 11:38 AM   #1
Ransak
Member
 
Registered: Nov 2005
Posts: 35

Rep: Reputation: 15
Awk line method?


Hi,

I'm trying to use Awk to parse and reformat a report. The report uses pipes as delimiters.

Report lines consist of:
Code:
name|age|notes|risk
For example:
Code:
John Smith|53|Benign Paroxysmal Positional Vertigo (BPPV)|High
Jane Doe|48|Benign Paroxysmal Positional Vertigo (BPPV)|High
Paul Doe|52|Superior Canal Dehiscence Syndrome|Medium
What I'd like to do is regroup the report. For example, I'd like to list each note once (filtering out any duplicates), followed by all of the names associated with that note in the report. Example:

Code:
Benign Paroxysmal Positional Vertigo (BPPV)
John Smith
Jane Doe
Thus far, I've written a script that accomplishes some of what I'm trying to do.

Code:
#!/bin/bash

if [ $# -ne 1 ]; then
        echo "Please supply a file."
        exit 127
fi

grep High "$1" | awk -F \| '{print $3}' | sort -u
The problem is that this simply parses the file and pulls out the notes for all of the High-risk entries; I'm not certain how to get the names (field 1) to display under the note they are associated with. I could pull out the names for a specific note by awk-ing {print $1} as a separate step, but I'd like to figure out how to automate the two together.

Ideally I'd like to process a single line, pull out the notes, and place them in a variable but I'm not sure of what the method is for this, or even if it's the best way to tackle this problem.

Any thoughts on how best to accomplish this?

Thanks in advance...

Last edited by Ransak; 10-03-2007 at 11:49 AM.
 
Old 10-03-2007, 11:51 AM   #2
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2371
Hi,

Something like this should work:

awk -F'|' '/High/ { print $1 }' infile.

Using your 'infile':
Code:
$ cat infile 
John Smith|53|Benign Paroxysmal Positional Vertigo (BPPV)|High
Jane Doe|48|Benign Paroxysmal Positional Vertigo (BPPV)|High
Paul Doe|52|Superior Canal Dehiscence Syndrome|Medium

$ awk -F'|' '/High/ { print $1 }' infile 
John Smith
Jane Doe

$ awk -F'|' '/Benign Paroxysmal Positional Vertigo/ { print $1 }' infile | sort -u
Jane Doe
John Smith
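One caveat with a bare /High/ pattern: it is tested against the whole line, so it would also fire on a name or note that happens to contain "High". A small sketch that tests only the risk field instead; the "Hugh Highsmith" line is invented purely for the demonstration:

```shell
#!/bin/bash
# Compare a whole-line regex match with an exact test on field 4 (risk).
sample='John Smith|53|Benign Paroxysmal Positional Vertigo (BPPV)|High
Hugh Highsmith|40|Superior Canal Dehiscence Syndrome|Medium'

# /High/ matches anywhere on the line, so the Medium-risk Highsmith slips in.
loose=$(printf '%s\n' "$sample" | awk -F'|' '/High/ { print $1 }')

# $4 == "High" only looks at the risk column.
strict=$(printf '%s\n' "$sample" | awk -F'|' '$4 == "High" { print $1 }')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

For the sample report in this thread both forms give the same names; the difference only shows up once "High" appears outside the risk column.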
BTW, the | in the -F'|' part has to be quoted or escaped (single quotes as here, or a backslash as in your -F \|), otherwise the shell (probably bash) will see it as a pipe and things will go wrong.
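A quick way to check the quoting: the shell only has to be stopped from reading the | as a pipe, so single quotes, double quotes, and a backslash should all hand awk the same one-character separator. A minimal sketch (the variable names are just illustrative):

```shell
#!/bin/bash
# All three invocations pass the same literal | to awk as the field separator.
line='John Smith|53|Benign Paroxysmal Positional Vertigo (BPPV)|High'

single=$(printf '%s\n' "$line" | awk -F'|' '{ print $1 }')   # single quotes
double=$(printf '%s\n' "$line" | awk -F"|" '{ print $1 }')   # double quotes
escaped=$(printf '%s\n' "$line" | awk -F \| '{ print $1 }')  # backslash escape

printf '%s\n' "$single" "$double" "$escaped"
```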

Hope this helps.
 
Old 10-03-2007, 12:09 PM   #3
Ransak
Member
 
Registered: Nov 2005
Posts: 35

Original Poster
Rep: Reputation: 15
Thanks for the reply.

As I mentioned, I can pull out the names with no problem, much as you've shown. What I'd like, however, is to automate the two steps together: pull out the risks, eliminate the duplicates, and list the names underneath each individual risk (similar to the third code example in my post).

A method of taking each individual risk and sticking them in a variable would work to this end, but I'm not sure how to accomplish it. When I've attempted this, it just pulls all of the risks into one variable.
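For what it's worth, a command substitution such as var=$(...) captures every output line at once, which is why everything lands in one variable. One possible way around that is to let the shell read the de-duplicated list one line at a time. A rough sketch, assuming bash; `list_by_note` and `note` are names made up here for illustration:

```shell
#!/bin/bash
# Sketch: print each distinct High-risk note, then the names filed under it.
list_by_note() {
    local file=$1 note
    # First pass: distinct notes for High-risk lines, one per line.
    awk -F'|' '$4 == "High" { print $3 }' "$file" | sort -u |
    while IFS= read -r note; do
        printf '%s\n' "$note"
        # Second pass: names whose note matches exactly.
        # (-v interprets backslash escapes, which is fine for this data.)
        awk -F'|' -v note="$note" '$4 == "High" && $3 == note { print $1 }' "$file"
    done
}
```

Called as `list_by_note infile`, this rereads the file once per distinct note, which is wasteful on large reports but easy to follow on small ones.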

Oddly enough, the code I've listed works without the single quotes around the pipe. Probably not great form, however.


Last edited by Ransak; 10-03-2007 at 12:10 PM.
 
Old 10-03-2007, 01:07 PM   #4
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2371
Hi again,

Maybe this will help:

Code:
#!/bin/bash

if [ $# -ne 1 ]; then
        echo "Please supply a file."
        exit 127
fi

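# Sort on the notes field only (field 3) so identical notes end up adjacent.
sort -t'|' -k3,3 "$1" |
awk '
BEGIN { FS = "|" ; previousOne = "" }
{
  if ( $4 == "High" )
  {
    # Print the note as a heading only when it differs from the previous one.
    if ( $3 != previousOne )
    { printf("%s\n", $3) }
    print $1
    previousOne = $3
  }
}'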
An example run with a more elaborate infile:
Code:
$ cat infile
John Smith|53|Benign Paroxysmal Positional Vertigo (BPPV)|High
Xara Zane|31|A Different Term Here (ADTH)|High
Paul Doe|52|Superior Canal Dehiscence Syndrome|Medium
Sara Waltz|28|A Different Term Here (ADTH)|High
Zeta Jones|37|This Is No Good (TING)|Medium
Jane Doe|48|Benign Paroxysmal Positional Vertigo (BPPV)|High

 $ ./awk.prog.sh infile 
A Different Term Here (ADTH)
Sara Waltz
Xara Zane
Benign Paroxysmal Positional Vertigo (BPPV)
Jane Doe
John Smith
For this (the awk part) to work, the infile has to be sorted on the third field, because the awk code only compares each note with the one on the previous line. This can probably be done from within awk, but the sort command was easier to come up with.
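Grouping inside awk is possible with associative arrays, which removes the need for sorted input. A sketch under that assumption, not a drop-in replacement: notes come out in order of first appearance rather than alphabetically, and the names `group_by_note`, `names`, `order`, and `n` are just illustrative:

```shell
#!/bin/bash
# Group names under each High-risk note using awk arrays (no sort needed).
group_by_note() {
    awk -F'|' '
    $4 == "High" {
        if (!($3 in names)) order[++n] = $3      # remember first-seen order
        names[$3] = names[$3] $1 "\n"            # append this name to its note
    }
    END {
        for (i = 1; i <= n; i++) {
            print order[i]
            printf "%s", names[order[i]]         # names already end in newlines
        }
    }' "$1"
}
```

Run as `group_by_note infile`. If alphabetical headings are wanted, keeping the external sort is probably still the simplest route.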

Hope this helps.
 
Old 10-03-2007, 01:56 PM   #5
Ransak
Member
 
Registered: Nov 2005
Posts: 35

Original Poster
Rep: Reputation: 15
Yes! That's exactly what I was looking for. I had no idea (and Google wasn't much help) how to run an if within an awk statement. I knew it had to be possible, but couldn't piece it together myself.

Thanks druuna, I really appreciate the hand!
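For reference, an awk if goes inside the braces of an action, while a condition written in front of the braces filters the whole action. A tiny sketch showing the two forms side by side on two lines from the report:

```shell
#!/bin/bash
sample='John Smith|53|Benign Paroxysmal Positional Vertigo (BPPV)|High
Paul Doe|52|Superior Canal Dehiscence Syndrome|Medium'

# Form 1: explicit if statement inside the action block.
with_if=$(printf '%s\n' "$sample" | awk -F'|' '{ if ($4 == "High") print $1 }')

# Form 2: the same test written as a pattern in front of the action.
with_pattern=$(printf '%s\n' "$sample" | awk -F'|' '$4 == "High" { print $1 }')

printf '%s\n%s\n' "$with_if" "$with_pattern"
```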

 
Old 10-03-2007, 02:17 PM   #6
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2371
Hi,

You're welcome.

Here's a link to The GNU Awk User's Guide; it tells you everything you always wanted to know.
 
  


Tags
awk, script, shell

