LinuxQuestions.org
Old 11-27-2018, 05:44 PM   #1
maddyfreaks
Member
 
Registered: May 2011
Posts: 85

Rep: Reputation: 0
Help with awk


Hi Team

I have a file with the following contents:
Code:
 COL1  | COL2 | COL3
----------------------
A1     | 98   | P
A1     | 98   | P
A1     | 98   | P
B1     | 98   | P
B1     | 98   | P
B1     | 98   | P
C1     | 98   | P
C1     | 98   | P
C1     | 98   | P

I need to convert it to the format below. The awk/sed command should be applied to COL1 only:


 COL1  | COL2 | COL3
----------------------
A1     | 98   | P
       | 98   | P
       | 98   | P
B1     | 98   | P
       | 98   | P
       | 98   | P
C1     | 98   | P
       | 98   | P
       | 98   | P

I tried this:

awk '!x[$1]++' file <-- but it removes the whole line
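
A note on why this happens, shown on a shortened copy of the sample data (the shortening is an assumption made for brevity): !x[$1]++ is a pattern with no action, so awk falls back to its default action, which prints the entire record, and only when the pattern is true. The pattern is true just for the first line carrying each value of $1, so later lines are skipped outright instead of having their first column blanked.

```shell
# Shortened sample input (assumed): two A1 rows and two B1 rows.
input='A1     | 98   | P
A1     | 98   | P
B1     | 98   | P
B1     | 98   | P'

# x[$1]++ evaluates to 0 (false) the first time a col1 value is seen,
# so !x[$1]++ is true once per value and only those lines are printed.
result=$(printf '%s\n' "$input" | awk '!x[$1]++')
printf '%s\n' "$result"
```

Only one A1 line and one B1 line survive, which matches the "removing whole line" behaviour described above.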
 
Old 11-27-2018, 06:58 PM   #2
berndbausch
Senior Member
 
Registered: Nov 2013
Location: Tokyo
Distribution: Redhat/Centos, Ubuntu, Raspbian, Fedora, Alpine, Cirros, OpenSuse/SLES
Posts: 3,472

Rep: Reputation: 916
You don't provide a very thorough description of your task, so I have to make a few assumptions.

1. Input: Entries are collated, perhaps also sorted according to col1. That is, all A1 entries are kept together, all B1 entries etc.
2. Output: Essentially identical to the input, except that each col1 value appears only once, on the first row of its group.

In this case, I would write an awk program that checks whether the value in col1 has changed. When it detects a change, it prints col1. Otherwise, it doesn't, but prints all other fields.

Here is a possible fragment:
Code:
$1 != previous_col1 { printf $1 }             # value in col1 changed
                    { previous_col1 = $1      # remember current col1 value
                      for (col=2;col<=$NF;col++)  # print remaining columns
                          printf $col " " 
                    }
Disclaimers: I am sure there are more elegant ways to solve the problem. This is just a suggestion and hasn't been tested. I leave the pretty formatting as an exercise for the reader.

EDIT: Another solution is to use the sub() function to replace A1, B1, etc. with a string of blanks. This way, you don't have to worry about re-creating the pretty formatting.
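
One possible reading of the sub() idea, as a rough sketch only. It assumes the columns are padded with spaces (not tabs), that the col1 value starts at the beginning of the line, and it uses a shortened copy of the sample data. The names key, prev and blanks are made up for this example.

```shell
# Shortened sample input (assumed).
input='A1     | 98   | P
A1     | 98   | P
B1     | 98   | P
B1     | 98   | P'

result=$(printf '%s\n' "$input" | awk '
{
    key = $1                      # save col1 before the line is modified
    if (key != "" && key == prev) {
        # Build as many blanks as the key is long, then overwrite the
        # leading key in the line. The rest of the line is left untouched.
        blanks = sprintf("%" length(key) "s", "")
        sub(/^[^ ]+/, blanks)
    } else {
        prev = key                # first row of a new group: remember it
    }
    print
}')
printf '%s\n' "$result"
```

Because sub() works on the line itself here and no field is assigned, awk does not rebuild the record, so the original spacing around the | separators is preserved.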

Last edited by berndbausch; 11-27-2018 at 08:00 PM.
 
Old 11-27-2018, 08:27 PM   #3
maddyfreaks
Member
 
Registered: May 2011
Posts: 85

Original Poster
Rep: Reputation: 0
Sorry for leaving out the details.

I was given a file in which Col1/Field 1 will always contain duplicate data. The other fields may or may not, but I am not worried about the other columns. All I need is for duplicate values in Field 1 to be printed as blank space. The values of Field 1 are ordered, so no value repeats further down the rows.

I hope this is clear.

I tried your code, but I get errors. Can you let me know where I made a mistake?

$ cat /tmp/A1|awk '$1 != previous_col1 { printf $1 } { previous_col1 = $1 for (col=2;col<=$NF;col++) printf $col " " }'
awk: syntax error at source line 1
context is
$1 != previous_col1 { printf $1 } { previous_col1 = $1 >>> for <<< (col=2;col<=$NF;col++) printf $col " " }
awk: illegal statement at source line 1
 
Old 11-27-2018, 08:48 PM   #4
AwesomeMachine
LQ Guru
 
Registered: Jan 2005
Location: USA and Italy
Distribution: Debian testing/sid; OpenSuSE; Fedora; Mint
Posts: 5,513

Rep: Reputation: 1004
I have to say there is excellent documentation on awk. Is this homework?
 
Old 11-27-2018, 09:26 PM   #5
maddyfreaks
Member
 
Registered: May 2011
Posts: 85

Original Poster
Rep: Reputation: 0
No, this is not homework.
I am writing a script and got stuck at the final part.

I tried my best. As I said, I am a newbie, so I am asking for help on how to achieve this.
 
Old 11-27-2018, 10:32 PM   #6
berndbausch
Senior Member
 
Registered: Nov 2013
Location: Tokyo
Distribution: Redhat/Centos, Ubuntu, Raspbian, Fedora, Alpine, Cirros, OpenSuse/SLES
Posts: 3,472

Rep: Reputation: 916
Quote:
Originally Posted by maddyfreaks
awk: syntax error at source line 1
context is
$1 != previous_col1 { printf $1 } { previous_col1 = $1 >>> for <<< (col=2;col<=$NF;col++) printf $col " " }
awk: illegal statement at source line 1
The error message does its best to mark the location of the error. The for statement must either be on a separate line or separated by a semicolon.
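
A small sketch of the semicolon rule, kept on one line the way it was typed at the prompt. The single input row is an assumption, and the loop bound here is written as NF (the number of fields) so that the example terminates.

```shell
# Two statements inside one { } block on the same line need a semicolon
# (or a newline) between them: "previous_col1 = $1; for (...)".
result=$(printf 'A1 | 98 | P\n' |
  awk '{ previous_col1 = $1; for (col = 2; col <= NF; col++) printf "%s%s", $col, (col < NF ? " " : "\n") }')
printf '%s\n' "$result"
```

Without the semicolon after the assignment, awk reads "$1 for" as one broken statement and stops at the for keyword, which is the spot the error message marks.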

I agree that the awk user guide is pretty good, and that there are many tutorials out there that can bring you up to this level of awk programming. It's worthwhile investing a few hours to learn this tool.
 
Old 11-27-2018, 10:33 PM   #7
Turbocapitalist
Senior Member
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 4,173
Blog Entries: 3

Rep: Reputation: 2064
It would help if you were to use [code] [/code] tags when posting scripts. There was an extraneous dollar sign changing how the NF field was being used in the for loop, and a missing output field separator:

Code:
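#!/usr/bin/awk -f

# Print col1 only when it differs from the previous row's col1.
# An explicit "%s" format keeps a % in the data from being read as a format.
$1 != previous_col1 {
        printf "%s", $1
}
{
        previous_col1 = $1      # remember current col1 value
        printf "%s", OFS
        for (col = 2; col <= NF; col++) {       # NF, not $NF
                printf "%s%s", $col, OFS
        }
        printf "\n"
}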
Please look at AWK's manual page and find the many mentions of NF and how it can be used as an indirect reference (or not).
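
To make the NF point concrete, here is a tiny illustration on a made-up three-field line: NF is the number of fields, while $NF is the contents of the last field, and $(NF - 1) is the field before it.

```shell
# NF is a count; putting $ in front of it turns it into a field reference.
result=$(printf 'a b c\n' | awk '{ print NF; print $NF; print $(NF - 1) }')
printf '%s\n' "$result"
```

With the sample data in this thread the last field is P, so a loop condition such as col <= $NF compares the counter with the text P instead of with the field count, and the loop no longer stops where it should.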
 
Old 11-28-2018, 12:22 AM   #8
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,719

Rep: Reputation: 3034
You can play with how to fix up the column alignments, but you can simply do:
Code:
awk '{if($1 == prev)$1 = "";else prev = $1}1' file
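
One way the alignment could be handled, as a sketch rather than a finished solution. It assumes there are no blank lines and that the values in the remaining columns all have the same width, and it uses a shortened copy of the sample data. The name key is made up for this example. A repeated col1 is replaced by blanks of the same length, and every row is rebuilt the same way so the columns line up.

```shell
# Shortened sample input (assumed).
input='A1     | 98   | P
A1     | 98   | P
B1     | 98   | P'

result=$(printf '%s\n' "$input" | awk '
{
    key = $1
    # Assigning to $1 makes awk rebuild the line with single-space
    # separators, so every row goes through the same rebuild.
    $1 = (key == prev) ? sprintf("%" length(key) "s", "") : key
    prev = key
    print
}')
printf '%s\n' "$result"
```

The original padding between columns is collapsed to single spaces in this variant, which is the trade-off for keeping it short.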
 
2 members found this post helpful.
Old 11-28-2018, 12:55 AM   #9
berndbausch
Senior Member
 
Registered: Nov 2013
Location: Tokyo
Distribution: Redhat/Centos, Ubuntu, Raspbian, Fedora, Alpine, Cirros, OpenSuse/SLES
Posts: 3,472

Rep: Reputation: 916
Quote:
Originally Posted by Turbocapitalist
There was an extraneous dollar sign changing how the NF field was being used in the for loop
which I added to make the task a little more interesting. Thanks for spotting it.
 
  

