LinuxQuestions.org


jv61 01-14-2013 10:34 AM

How to convert newline to tab for values in duplicate rows
 
I have a file which looks like this:
Code:

Reference Value
Con1      BC1:10
Con1      BC2:2
Con2      BC1:80
Con2      BC2:40
Con3      BC1:3
Con4      BC1:1
Con4      BC2:12

I would like to print each duplicate value in the first column only once, with its corresponding values joined on one line, tab-delimited. I am expecting my results to look like this:

Code:

Reference Value
Con1      BC1:10 BC2:2
Con2      BC1:80 BC2:40
Con3      BC1:3
Con4      BC1:1  BC2:12

Any ideas on how to get this using awk/sed/perl?

Thanks in advance,

rmacd 01-14-2013 10:59 AM

Code:

awk '{if($1 in a) {a[$1]=a[$1] " " $NF} else {a[$1]=$0}} END {for(x in a) print a[x]}' <filename> | sort

HTH
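
For anyone following along, here is a commented sketch of the same associative-array idea. It is a variant and not the command posted above: it remembers the order in which keys first appear, so no sort is needed and the header line stays first, and it joins values with a tab as the original question asked. The function name merge_by_key is made up for this example.

```shell
# merge_by_key is a hypothetical helper name, not something from the thread.
merge_by_key() {
    awk '
    !($1 in a) {                 # first time this key is seen
        order[++n] = $1          # remember the order in which keys appeared
        a[$1] = $0               # start with the whole input line
        next
    }
    {
        a[$1] = a[$1] "\t" $NF   # later lines: append the last field after a tab
    }
    END {
        for (i = 1; i <= n; i++) print a[order[i]]
    }' "$@"
}

# Sample run on a few lines in the format from the first post.
printf '%s\n' 'Reference Value' 'Con1      BC1:10' 'Con1      BC2:2' 'Con3      BC1:3' | merge_by_key
```

Like any version that stores every key in an array, this keeps the whole result in memory until the END block runs.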

jpollard 01-14-2013 05:29 PM

Depending on the file size you could run out of memory.

As long as the input is sorted, the following will do the work without saving the entire file in memory:

Code:

#!/bin/bash
awk '
BEGIN  {v=""; line=""}
        {
          if (v == $1) {
            line = line " " $2;
          } else {
            if (line != "") {print line};
            line = $0;
            v = $1;
          }
        }
END    { print line }'

Make the script executable and you can redirect input to it (or use a pipe from sort) and redirect output to another file.
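
A rough sketch of that workflow, assuming the script above is saved as merge.sh and the data is in sample.txt (both names are made up for this example, as is the temporary directory). Since a plain sort would move the "Reference Value" header in among the data lines, this sketch passes the header through untouched and sorts only the rest.

```shell
# merge.sh, sample.txt and merged.txt are hypothetical names used only for this sketch.
workdir=$(mktemp -d)

# Save the script from the post above.
cat > "$workdir/merge.sh" <<'EOF'
#!/bin/bash
awk '
BEGIN  {v=""; line=""}
        {
          if (v == $1) {
            line = line " " $2;
          } else {
            if (line != "") {print line};
            line = $0;
            v = $1;
          }
        }
END    { print line }'
EOF
chmod +x "$workdir/merge.sh"

# Unsorted sample input with a header line.
printf '%s\n' 'Reference Value' 'Con2 BC1:80' 'Con1 BC1:10' 'Con2 BC2:40' 'Con1 BC2:2' > "$workdir/sample.txt"

# Keep the header first, sort only the data lines, then pipe into the script.
{ head -n 1 "$workdir/sample.txt"; tail -n +2 "$workdir/sample.txt" | sort; } \
    | "$workdir/merge.sh" > "$workdir/merged.txt"

cat "$workdir/merged.txt"
```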

jv61 01-15-2013 03:44 AM

Many thanks for the replies. Both answers solved my problem. Could you please also explain what each section of the code does? That would help my understanding, as I am a beginner learning to program in awk. Thank you.

grail 01-15-2013 03:53 AM

Maybe something like:
Code:

awk '{k=$1} $0=(k==a)?"\t"$2:"\n"$0; {a=k}' ORS="" file
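
Spelled out long-hand, the idea behind the one-liner might look like the sketch below. It is one reading of it, not a drop-in copy: the key is saved in a variable before anything is printed, because assigning to $0 makes awk split the fields again. As with ORS="", nothing is written automatically at the end of a record, so the output begins with a newline and has none after the last line. The names join_dups, key and prev are made up for this example.

```shell
# join_dups is a hypothetical name for this sketch.
join_dups() {
    awk '
    {
        key = $1                  # the key is the first field
        if (key == prev) {
            printf "\t%s", $2     # same key: stay on the line, add a tab and the value
        } else {
            printf "\n%s", $0     # new key: start a new line with the whole record
        }
        prev = key                # remember the key for the next record
    }' "$@"
}

printf '%s\n' 'Con1 BC1:10' 'Con1 BC2:2' 'Con2 BC1:80' | join_dups
```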

jpollard 01-15-2013 05:11 AM

Code:

awk '
BEGIN   {v=""; line=""}                   # initialization: v is the saved key (empty), line is empty
        {                                 # for every input record:
          if (v == $1) {                  # if field 1 (the key) matches the saved key
            line = line " " $2;           # then append field 2 to the current line
          } else {                        # else a new key is seen
            if (line != "") {print line}; # print the current line if it is not empty
            line = $0;                    # save the new line
            v = $1;                       # and the new key
          }
        }
END     { print line }'                   # at end of file print the last line


jv61 01-15-2013 12:45 PM

Thanks for the explanation, that's helpful.

syg00 01-15-2013 03:34 PM

I always enjoy (and learn from) grail's attempts to turn awk into a (pale) imitation of Perl's minimalist approach .... :p

In this case an extra newline at the end might be appropriate.
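
Something along these lines, as a sketch: an END rule that prints the final newline, since ORS="" means awk never adds one itself. It also keeps the key in k before $0 is reassigned, because the assignment re-splits the fields. The function name join_dups_nl is made up for this example.

```shell
# join_dups_nl is a hypothetical name; the END rule supplies the trailing newline.
join_dups_nl() {
    awk '{k=$1} $0=(k==a)?"\t"$2:"\n"$0; {a=k} END{printf "\n"}' ORS="" "$@"
}

printf '%s\n' 'Con1 BC1:10' 'Con1 BC2:2' 'Con2 BC1:80' | join_dups_nl
```

The output still starts with an empty line, because the first record is also printed with a leading "\n".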

jpollard 01-15-2013 07:50 PM

It is also nearly incomprehensible.

grail 01-16-2013 01:23 AM

I just like to play :)


All times are GMT -5.