LinuxQuestions.org

smturner1 11-15-2012 11:14 AM

Using awk to cut fields and reduce dupes
 
First, let me provide some sample data.
Code:

Sam,,"Sam","I am"
Sam,,"Sam","I don't like"
Sam,,"Sam","Green eggs and ham"
Snitches,,"Snitches","I have a star"
Snitches,,"Snitches","You wish you had a star"
Snitches,,"Snitches","Lets pay the 3 dollars"

Basically, I want to keep only the first and fourth fields of the data, printing each first-field value only once instead of on every line. This is what the output should look like (formatting is not an issue):
Sam "I am"
"I don't like"
"Green eggs and ham"

This is what I wrote, but all I am getting is field 4. I think I need to add an 'else' to the 'if' statement, but I don't know how to go about it.

Code:

sort -d <file> | awk -F"," '{ if ($1 != last_name_seen) {print $4; last_name_seen=$1}}'
Any input would be appreciated.

S

smturner1 11-15-2012 12:15 PM

I tried this:

Quote:

sort -d <file> | awk -F"," '{ if ( $1 != last_name_seen ) { print $1, $4; last_name_seen=$1 } else { print $4 }}'
It did not work. It printed:
Quote:

Sam "I am"
Sam "I don't like"
Sam "Green eggs and ham"
As previously stated, I want the data to look like this:
Quote:

Sam "I am"
"I don't like"
"Green eggs and ham"
Remember, the formatting is not the focus, only the output.

ntubski 11-15-2012 03:40 PM

You don't need an else; you just need to separate printing $1 (which you only want sometimes) from printing $4 (which you always want):

Code:

awk -F, '($1 != prev_name){printf("%s ", $1)} {print $4; prev_name = $1}'

## which is short for:
awk -F, '{
    if ($1 != prev_name)
      printf("%s ", $1);
    print $4;
    prev_name = $1;
  }'
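
For anyone who wants to try the one-liner against the sample data from the first post, here is a minimal sketch. The scratch file created with mktemp and the variable names tmp and result are assumptions made for this example, not part of the original commands. The sample data is already grouped by name, so the sort step is left out here.

```shell
# Scratch file holding the sample data from the first post
# (mktemp and the variable names are choices made for this sketch).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Sam,,"Sam","I am"
Sam,,"Sam","I don't like"
Sam,,"Sam","Green eggs and ham"
Snitches,,"Snitches","I have a star"
Snitches,,"Snitches","You wish you had a star"
Snitches,,"Snitches","Lets pay the 3 dollars"
EOF

# First rule: when the name changes, print it followed by a space, no newline.
# Second rule (no condition): always print field 4 and remember the name.
result=$(awk -F, '($1 != prev_name){printf("%s ", $1)} {print $4; prev_name = $1}' "$tmp")
rm -f "$tmp"

printf '%s\n' "$result"
```

The name should appear only on the first line of each group (Sam "I am", then "I don't like" on its own line, and so on). One caveat: -F, splits on every comma, so a quoted field that itself contains a comma would be split in two. The sample data has none, so it is not a problem here.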


smturner1 11-16-2012 09:11 AM

ntubski,

It works! Thank you.

