AWK: how to replace a particular field or substring?

cristalp · 06-12-2013, 04:20 AM

I want to replace only the string in the last field of each line with the string "B".

I did

Code:

awk '{gsub($NF, "B");print}' FILE

However, the first field in my file also contains "A", so it gets changed to "B" as well.

I tried then

Code:

awk '{sub($NF, "B");print}' FILE

It does not replace anything in my file.

How could I do this properly? Thanks for your help.

syg00 · 06-12-2013, 04:23 AM

Try

Code:

$NF = "B"

and see if it helps.
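For reference, a minimal sketch of that assignment inside a complete command. The sample line is hypothetical (not from the poster's file), and whitespace-separated fields are assumed:

```shell
# Hypothetical sample line; both the first and the last field contain "A".
line='A1  x   y A2'

# Assigning to $NF replaces the last field only.
# Modifying a field makes awk rebuild the line, joining fields with OFS
# (a single space by default), so the original spacing is not kept.
result=$(printf '%s\n' "$line" | awk '{ $NF = "B"; print }')
printf '%s\n' "$result"
```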

grail · 06-12-2013, 06:33 AM

You may need to be clearer about what action you require.

1. syg00 is correct if you simply wish to make the last field a B

2. Both of your examples use the contents of the last field as the search pattern and search the entire line; gsub then replaces all occurrences, while sub replaces only the first

3. If you wish to change only part of the last field to B, you need to tell the command what to look for and pass the field to operate on ($NF in your examples) as a third argument, after the replacement value, for example:

Code:

gsub(/old_value/, "B", $NF)
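To illustrate point 3, here is a minimal sketch on a hypothetical sample line, assuming the substring to replace is "A". The sample data and the pattern are assumptions for illustration only:

```shell
# Hypothetical sample line; "A" occurs in the first and the last field.
line='A1  x   y A2'

# The third argument restricts gsub to the last field,
# so the "A" in the first field is left alone.
result=$(printf '%s\n' "$line" | awk '{ gsub(/A/, "B", $NF); print }')
printf '%s\n' "$result"
```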

cristalp · 06-13-2013, 09:43 AM

Thanks grail, I tried your code. Yes, it works. BUT it changed the formatting of the lines that have B in the last field: the number of spaces between fields is different after the change. Is there any way to use gsub and keep the formatting at the same time?

grail · 06-13-2013, 10:29 AM

Unfortunately this is a side effect of assigning a new value to a field. Perhaps if you gave us an example of what you are attempting, we could help with a better solution?

danielbmartin · 06-13-2013, 03:11 PM

With this InFile ...

Code:

Apple1  Apple2   Apple3 Apple4
Pear1     Pear2  Pear3    Pear4
Cherry1 Cherry2 Cherry3  Cherry4

... this sed ...

Code:

sed 's/\(.* \)[[:graph:]]*$/\1B/' "$InFile" >"$OutFile"

... produced this OutFile ...

Code:

Apple1  Apple2   Apple3 B
Pear1     Pear2  Pear3    B
Cherry1 Cherry2 Cherry3  B

Daniel B. Martin

David the H. · 06-14-2013, 01:43 PM

Quote:

Originally Posted by cristalp

Thanks grail, I tried your code. Yes, it works. BUT it changed the formatting of the lines that have B in the last field: the number of spaces between fields is different after the change. Is there any way to use gsub and keep the formatting at the same time?

Yes, this is a side-effect of the way awk works. It only keeps the original spacing if the line remains unaltered (e.g. print $0). If any changes are made to the line, then the full effect of field-splitting takes over, and the output is printed with the value of the OFS (output field separator) variable between each field instead.

So your options are to either set OFS to the same character as FS (or whatever you want), or manually format the output in print/printf.

Code:

$ echo 'foo  bar   baz' | awk '{ print }'
foo  bar   baz

$ echo 'foo  bar   baz' | awk '{ $2="bam" ; print }'
foo bam baz

$ echo 'foo  bar   baz' | awk -v OFS='   ' '{ $2="bam" ; print }'
foo   bam   baz
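The other option, formatting the output manually, might look like the sketch below. The column widths in the printf format are assumptions picked by hand to match this particular sample line; real data would need its own widths.

```shell
# Pad the first two fields to fixed widths (5 and 6 characters) so the
# output lines up the same way as the sample input 'foo  bar   baz'.
result=$(echo 'foo  bar   baz' |
  awk '{ $2 = "bam"; printf "%-5s%-6s%s\n", $1, $2, $3 }')
printf '%s\n' "$result"
```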

syg00 · 06-14-2013, 06:27 PM

I ran into the (re)formatting issue when dealing with columnar data - with page headings. As the data came down from a mainframe and has to be sent back I changed my approach.
Leave the data you get the way the client sent it, and tack more field(s) on the end of lines.

Easy to do, within my control, and the customer agreed.
KISS, but YMMV.

grail · 06-15-2013, 03:23 AM

I would like to make a slight adjustment to David's information:

Quote:

If any changes are made to the line, then the full effect of field-splitting takes over, and the output is printed with the value of the OFS (output field separator) variable between each field instead.

This should read: if any change is made to a field of the original line, since a change to $0 itself does not have this effect:

Code:

$ echo 'foo  bar   baz' | awk '{sub(/bar/,"bam");print}'

# as opposed to

$ echo 'foo  bar   baz' | awk '{sub(/bar/,"bam",$2);print}'

Because the first example operates on the complete line intact, i.e. $0, any formatting is retained, but once you change a field ($2 in the example), OFS is introduced into the output.
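Applied to the original question, the same idea could be sketched as follows: run sub on $0 with a pattern anchored at the end of the line, so that only the last field is replaced and the spacing is kept. This assumes fields are separated by spaces or tabs and that lines have no trailing whitespace; the sample lines are taken from danielbmartin's InFile.

```shell
# sub() without a third argument works on $0, so no field is assigned
# and awk does not rebuild the line with OFS.
# /[^ \t]+$/ matches the last run of non-blank characters on the line.
result=$(printf '%s\n' \
  'Apple1  Apple2   Apple3 Apple4' \
  'Pear1     Pear2  Pear3    Pear4' |
  awk '{ sub(/[^ \t]+$/, "B"); print }')
printf '%s\n' "$result"
```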