LinuxQuestions.org
Old 07-10-2012, 12:38 AM   #1
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Rep: Reputation: Disabled
Issue for going back to previous line under conditions


I have this input:
Code:
Joe|info.1
Bob|info.1
Bob|info.2
I would like to merge the different info for the same person onto one line, like this:
Code:
Joe|info.1
Bob|info.1|info.2
I tried:
Code:
awk 'BEGIN{FS=OFS="|"} {if(a[$1]++ == 0) {print; stored = $0}; else if(a[$1]++ > 0) print stored FS $2}'
But I get the duplicate original info:
Code:
Joe|info.1
Bob|info.1
Bob|info.1|info.2
That's because the first branch of the if has already printed the line, but if I don't print there, I lose the first line entirely...
Any advice?

Thanks in advance
 
Old 07-10-2012, 12:52 AM   #2
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,508

Rep: Reputation: 2890
As a quick alternative:
Code:
awk -F"|" 'NR==1{printf "%s",$0}x{if(x!=$1)printf "\n%s",$0;else printf "|%s",$2}{x=$1}END{if(NR)print ""}' file
 
Old 07-10-2012, 01:09 AM   #3
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
Thanks for your help, but this alternative is too quick ! :-)

It doesn't work for me...

The point here is: if a[$1] occurs only once, print the entire line; if a[$1] occurs more than once, go back to the first occurrence and append the extra fields from the later occurrences.

Last edited by Trd300; 07-10-2012 at 01:32 AM.
 
Old 07-10-2012, 02:14 AM   #4
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1959
How about something like this?

Code:
#example file input
$ cat file.txt
Joe|info.1
Bob|info.1
Bob|info.2
David|info.1
Bob|info.3
Grail|info.1
David|info.2
Trd|info.1
Foo|info.1

$ awk 'BEGIN{ FS="|" } { a[$1]=(a[$1]?a[$1]:$1) FS $2 } END{ for (i in a){ print a[i] } }' file.txt
Foo|info.1
Grail|info.1
David|info.1|info.2
Bob|info.1|info.2|info.3
Trd|info.1
Joe|info.1
Caveats are that it assumes there are only two fields per line, and that the output is (as you can see) not in the original order, since awk's "for (i in a)" loop visits array indices in an implementation-defined order.
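On the two-fields caveat: a rough sketch of how the same idea could be extended to lines with more than two fields, by appending every field after the first. The function name merge_all_fields and the sample input are made up for illustration, and the output order is still unspecified.

```shell
# Hypothetical helper: merge lines by first field, keeping ALL remaining fields.
merge_all_fields() {
    awk 'BEGIN { FS = "|" }
    {
        # Start the stored line with the name the first time it is seen.
        if (!($1 in a)) a[$1] = $1
        # Append every remaining field, not just $2.
        for (i = 2; i <= NF; i++) a[$1] = a[$1] FS $i
    }
    END { for (k in a) print a[k] }' "$@"
}

# Demo on made-up input (line order of the result is not guaranteed).
printf '%s\n' 'Joe|info.1|x' 'Bob|info.1' 'Bob|info.2|y|z' | merge_all_fields
```

It can also be given a file name, e.g. merge_all_fields file.txt.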

Last edited by David the H.; 07-10-2012 at 02:19 AM. Reason: formatting clean-up
 
Old 07-10-2012, 03:10 AM   #5
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,508

Rep: Reputation: 2890
I guess from the initial data I was of the understanding the data was sorted by column 1 (hence my suggested solution).

Your current approach cannot work: once print has written a line, it is out, and there is no way to go back and append further entries to it.
Therefore, for an unsorted list, David's solution is the way to go, since everything has to be stored before printing (it will of course work for a sorted list as well).
Further to David's solution, if name order is important you could use asorti in the END block.
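For illustration, the asorti idea might look something like the sketch below. It assumes gawk is installed (asorti is a gawk extension, not POSIX awk), and the function name merge_sorted_by_name is only a placeholder.

```shell
# Hypothetical helper; requires gawk because of asorti().
merge_sorted_by_name() {
    gawk 'BEGIN { FS = "|" }
    { a[$1] = (($1 in a) ? a[$1] : $1) FS $2 }
    END {
        # asorti() copies the indices of a into names[1..n], sorted as strings.
        n = asorti(a, names)
        for (i = 1; i <= n; i++) print a[names[i]]
    }' "$@"
}
# usage: merge_sorted_by_name file.txt
```

The array a itself is left untouched; only the second array (names) holds the sorted index list.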
 
Old 07-10-2012, 08:53 AM   #6
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1959
Yeah, maybe I should've added that to my caveats. If the list is unsorted, then you're going to have to store every line in memory and print everything out at the end. That's not a problem for small amounts of input text, but it won't work if there's more than the system memory can handle.

As for controlling array sorting, see here:
http://www.gnu.org/software/gawk/man...y-Sorting.html

Rather than using asorti though, if you want the output sorted alphabetically, for example, you can simply add a PROCINFO setting to the BEGIN section:

Code:
BEGIN{ PROCINFO["sorted_in"]="@ind_str_asc" ; FS=OFS="|" }
Note that only recent versions of gawk (4.0 and later) can do this. Older gawk versions are limited to asort/asorti, and other awk implementations generally have no built-in sorting at all, so there you'd have to manually roll your own index tracking. You'll also have to do so if you need the output order to be identical to the input, and it isn't already in one of the pre-set sorting types.

(And wouldn't it be nice if the gawk developers added a setting or two for "input order"?)
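As a minimal sketch of the "roll your own index tracking" mentioned above, a second array can record the order in which each name first appears. This should work in any POSIX awk; the names merge_keep_input_order, order and n are just illustrative choices.

```shell
# Hypothetical helper: merge by first field, output in order of first appearance.
merge_keep_input_order() {
    awk 'BEGIN { FS = "|" }
    {
        if (!($1 in a)) {       # first time this name appears
            order[++n] = $1     # remember its position in the input
            a[$1] = $1
        }
        a[$1] = a[$1] FS $2
    }
    END { for (i = 1; i <= n; i++) print a[order[i]] }' "$@"
}

# Demo with the original poster's input:
printf '%s\n' 'Joe|info.1' 'Bob|info.1' 'Bob|info.2' | merge_keep_input_order
# Joe|info.1
# Bob|info.1|info.2
```

Like the array version, this still holds the whole merged result in memory until END.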
 
Old 07-10-2012, 07:14 PM   #7
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
Thanks David & grail !

The order it returns the output doesn't really matter.

I didn't know this syntax:
Code:
{ a[$i]=(a[$i]?a[$i]:$i) FS $j }
it's very handy, and asorti and PROCINFO as well.

Thanks guys !

Last edited by Trd300; 07-10-2012 at 09:48 PM.
 
Old 07-10-2012, 09:47 PM   #8
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,508

Rep: Reputation: 2890
If the input is sorted, my solution avoids having to store the data at all.
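If the input is not sorted to begin with, one possible way to still use a streaming approach is to sort by the first field and pipe the result into awk. This is only a sketch: the function name merge_via_sort is made up, and the -s (stable) option, which keeps the info entries of each name in their original relative order, is an assumption about your sort (GNU and BSD sort have it, but it is not in POSIX).

```shell
# Hypothetical helper: sort by name first, then merge adjacent lines on the fly.
merge_via_sort() {
    sort -t'|' -k1,1 -s "$@" |
    awk -F'|' '
        # A new name starts a new output line.
        NR == 1 || $1 != prev {
            if (NR > 1) printf "\n"
            printf "%s", $0
            prev = $1
            next
        }
        # Same name as the previous line: append its second field.
        { printf "|%s", $2 }
        # Terminate the last output line.
        END { if (NR) print "" }'
}

# Demo on made-up unsorted input:
printf '%s\n' 'Joe|info.1' 'Bob|info.1' 'David|info.1' 'Bob|info.2' | merge_via_sort
# Bob|info.1|info.2
# David|info.1
# Joe|info.1
```

Here awk only ever holds the current line, and the output comes out ordered by name rather than in input order.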
 
Old 07-11-2012, 10:42 AM   #9
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1959
Quote:
Originally Posted by Trd300
I didn't know this syntax:
Code:
{ a[$i]=(a[$i]?a[$i]:$i) FS $j }
it's very handy, and asorti and PROCINFO as well.
Yeah, "condition?value1:value2" is the ternary operator, a kind of short form of if/then/else.

In this case, if array entry "a[$i]" already holds a value (strictly speaking, a non-empty, non-zero one), then use it, otherwise use "$i".
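To make the equivalence concrete, here is roughly the same statement spelled out with if/else instead of the ternary. The function name merge_if_else and the variable prefix are only for illustration.

```shell
# Hypothetical helper: same merge as the ternary version, written with if/else.
merge_if_else() {
    awk 'BEGIN { FS = "|" }
    {
        if (a[$1]) {
            prefix = a[$1]    # a line for this name has already been started
        } else {
            prefix = $1       # first occurrence: start the line with the name
        }
        a[$1] = prefix FS $2
    }
    END { for (k in a) print a[k] }' "$@"
}

# Demo:
printf '%s\n' 'Bob|info.1' 'Bob|info.2' | merge_if_else
# Bob|info.1|info.2
```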
 
  

