LinuxQuestions.org

Ransak 10-03-2007 11:38 AM

Awk line method?
 
Hi,

I'm trying to use Awk to parse and reformat a report. The report uses pipes as delimiters.

Report lines consist of:
Code:

name|age|notes|risk

For example:
Code:

John Smith|53|Benign Paroxysmal Positional Vertigo (BPPV)|High
Jane Doe|48|Benign Paroxysmal Positional Vertigo (BPPV)|High
Paul Doe|52|Superior Canal Dehiscence Syndrome|Medium

What I'd like to do is regroup the report: list each note once (no duplicates), followed by all the names associated with that note in the report. Example:

Code:

Benign Paroxysmal Positional Vertigo (BPPV)
John Smith
Jane Doe

Thus far, I've written a script that accomplishes some of what I'm trying to do.

Code:

#!/bin/bash

if [ $# -ne 1 ]; then
        echo "Please supply a file."
        exit 127
fi

grep High "$1" | awk -F \| '{print $3}' | sort -u

The problem is that this only parses the file and pulls out the unique notes (field 3) of the High-risk lines; I'm not certain how to get the names (field 1) to display under the note they are associated with. I could pull out the names for a specific note with a separate awk '{print $1}', but I'd like to figure out how to automate the two together.

Ideally I'd like to process a single line, pull out the notes, and place them in a variable but I'm not sure of what the method is for this, or even if it's the best way to tackle this problem.

Any thoughts on how best to accomplish this?

Thanks in advance...

druuna 10-03-2007 11:51 AM

Hi,

Something like this should work:

awk -F'|' '/High/ { print $1 }' infile

Using your 'infile':
Code:

$ cat infile
John Smith|53|Benign Paroxysmal Positional Vertigo (BPPV)|High
Jane Doe|48|Benign Paroxysmal Positional Vertigo (BPPV)|High
Paul Doe|52|Superior Canal Dehiscence Syndrome|Medium

$ awk -F'|' '/High/ { print $1 }' infile
John Smith
Jane Doe

$ awk -F'|' '/Benign Paroxysmal Positional Vertigo/ { print $1 }' infile | sort -u
Jane Doe
John Smith

BTW, you do need to quote or escape the | (as in the -F'|' part), otherwise the shell (probably bash) will see it as a pipe and things will go wrong.
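
On the quoting point: single quotes are one way to keep the shell from treating | as a pipe, and a backslash or double quotes should work as well, since the shell strips the quoting before awk ever sees the argument. A minimal sketch, with function names made up for the illustration:

```shell
#!/bin/sh
# Three ways to pass "|" to awk as the field separator.
# The function names are invented for this sketch.
names_backslash() { awk -F \| '{ print $1 }' "$1"; }   # backslash-escaped
names_single()    { awk -F'|' '{ print $1 }' "$1"; }   # single-quoted
names_double()    { awk -F"|" '{ print $1 }' "$1"; }   # double-quoted

# One sample line copied from the thread, written to a temporary file
sample=$(mktemp)
printf '%s\n' 'John Smith|53|Benign Paroxysmal Positional Vertigo (BPPV)|High' > "$sample"

names_backslash "$sample"
names_single "$sample"
names_double "$sample"

rm -f "$sample"
```

Each of the three calls prints the name field of the sample line.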

Hope this helps.

Ransak 10-03-2007 12:09 PM

Thanks for the reply ;)

As I mentioned, I can pull out the names with no problem, similarly to how you've listed. I'd like to automate the two however to pull out the risks, eliminate the duplicates, and list the names underneath the individual risks (similar to the third code example in my post).

A method of taking each individual risk and sticking them in a variable would work to this end, but I'm not sure how to accomplish it. When I've attempted this, it just pulls all of the risks into one variable.
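
One possible way to get each note into its own variable is to read the de-duplicated list line by line instead of capturing it in a single variable. This is only a sketch: the helper name list_by_note is hypothetical, and it assumes the notes contain no backslashes (awk's -v option would interpret them).

```shell
#!/bin/sh
# list_by_note FILE (hypothetical helper name, not from the thread):
# print each High-risk note once, followed by the names that carry it.
list_by_note() {
    # Unique notes of the High-risk lines, one per line
    awk -F'|' '$4 == "High" { print $3 }' "$1" | sort -u |
    while IFS= read -r note; do
        printf '%s\n' "$note"
        # Hand the note to awk with -v and compare it as a plain string
        awk -F'|' -v note="$note" '$4 == "High" && $3 == note { print $1 }' "$1"
    done
}

# Try it on the sample report from the first post
report=$(mktemp)
cat > "$report" <<'EOF'
John Smith|53|Benign Paroxysmal Positional Vertigo (BPPV)|High
Jane Doe|48|Benign Paroxysmal Positional Vertigo (BPPV)|High
Paul Doe|52|Superior Canal Dehiscence Syndrome|Medium
EOF
list_by_note "$report"
rm -f "$report"
```

The loop rereads the report once per note, which is fine for small files but slower than a single pass on large ones.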

Oddly enough, the code I've listed works without the single quotes around the pipe. Probably not great form, however.


druuna 10-03-2007 01:07 PM

Hi again,

Maybe this will help:

Code:

#!/bin/bash

if [ $# -ne 1 ]; then
        echo "Please supply a file."
        exit 127
fi

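# Sort on the notes field (3) so identical notes end up on adjacent lines.
sort -t'|' -k3,3 "$1" | \
awk '
BEGIN { FS = "|" ; previousOne = "" }
{
  if ( $4 == "High" )
  {
    # Print the note only when it differs from the previous High-risk line.
    if ( $3 != previousOne )
    { printf("%s\n", $3) }
    print $1
    previousOne = $3
  }
}'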

An example run with a more elaborate infile:
Code:

$ cat infile
John Smith|53|Benign Paroxysmal Positional Vertigo (BPPV)|High
Xara Zane|31|A Different Term Here (ADTH)|High
Paul Doe|52|Superior Canal Dehiscence Syndrome|Medium
Sara Waltz|28|A Different Term Here (ADTH)|High
Zeta Jones|37|This Is No Good (TING)|Medium
Jane Doe|48|Benign Paroxysmal Positional Vertigo (BPPV)|High

$ ./awk.prog.sh infile
A Different Term Here (ADTH)
Sara Waltz
Xara Zane
Benign Paroxysmal Positional Vertigo (BPPV)
Jane Doe
John Smith

For this (the awk part) to work you need to sort the infile on the third field. This probably can be done from within awk, but the sort command was easier to come up with :)
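
For the "from within awk" variant, one option is to collect the names in an array keyed by the note, so no pre-sorting is needed. The sketch below is an assumption about how that could look, not the script from this thread; group_high is a made-up name, and it lists notes in first-seen order rather than sorted order.

```shell
#!/bin/sh
# group_high FILE (hypothetical helper): group names under each note in awk alone.
group_high() {
    awk -F'|' '
    $4 == "High" {
        if (!($3 in names)) {        # first time this note is seen
            order[++n] = $3          # remember the first-seen order
            names[$3] = $1
        } else {
            names[$3] = names[$3] "\n" $1
        }
    }
    END {
        for (i = 1; i <= n; i++) {
            print order[i]
            print names[order[i]]
        }
    }' "$1"
}

# Try it on the more elaborate infile from this post
report=$(mktemp)
cat > "$report" <<'EOF'
John Smith|53|Benign Paroxysmal Positional Vertigo (BPPV)|High
Xara Zane|31|A Different Term Here (ADTH)|High
Paul Doe|52|Superior Canal Dehiscence Syndrome|Medium
Sara Waltz|28|A Different Term Here (ADTH)|High
Zeta Jones|37|This Is No Good (TING)|Medium
Jane Doe|48|Benign Paroxysmal Positional Vertigo (BPPV)|High
EOF
group_high "$report"
rm -f "$report"
```

With this input the BPPV group comes first (John Smith, Jane Doe), then the ADTH group (Xara Zane, Sara Waltz), because both notes and names keep the order in which they appear in the file.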

Hope this helps.

Ransak 10-03-2007 01:56 PM

Yes! That's exactly what I was looking for. I had no idea (and Google wasn't much help) how to run an if within an awk statement. I knew it had to be possible, but couldn't piece it together myself.

Thanks druuna, I really appreciate the hand!
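
As a side note on the if: awk also lets the condition sit in front of the action as a pattern, which should behave the same as an if wrapped around the whole action. A small sketch with invented function names:

```shell
#!/bin/sh
# Both functions print the name field of High-risk lines.
# The function names are made up for this sketch.
high_with_if()      { awk -F'|' '{ if ($4 == "High") print $1 }' "$1"; }
high_with_pattern() { awk -F'|' '$4 == "High" { print $1 }' "$1"; }

# Two sample lines copied from the thread
report=$(mktemp)
cat > "$report" <<'EOF'
John Smith|53|Benign Paroxysmal Positional Vertigo (BPPV)|High
Paul Doe|52|Superior Canal Dehiscence Syndrome|Medium
EOF
high_with_if "$report"
high_with_pattern "$report"
rm -f "$report"
```

Comparing $4 against "High" also avoids matching the word High elsewhere in the line, which a plain /High/ pattern would do.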

druuna 10-03-2007 02:17 PM

Hi,

You're welcome.

Here's a link to The GNU Awk User's Guide; it tells you everything you always wanted to know ;)


All times are GMT -5.