LinuxQuestions.org

medirecpr 04-26-2012 05:54 PM

AWK - to find a string and save the third sector after
 
Hello all,

I am trying to write a bash script that can find a specific string. I am using:

awk -v RS="~ST" -v FS="*" '{ print $3}'

I run this against a file that contains several "~ST" markers. However, the first line of my output comes from the very start of the file, which does not contain a "~ST".

file sample:

ISA*00*<----Sectors---->~ST*837*030702343*<------MOre SEctors------->~CLM*7868*XXX***<-------More SEctors---->~SE*40*030702343~(more "~ST<------>~SE----->" sections)

output sample:

" " -> empty
030702343
030702344
030702345

If I print $1 $2 $3:

ISA00
837030702343
837030702344
837030702345

I see that it is grabbing the beginning of the file. Why?

The second part uses another AWK:

awk -v RS="~CLM" -v FS="*" '{ print $2}' SSS_SELECTO_10092.x12

(This one also returns the beginning of the file.)

output sample:

00
7868
7869
7870


My plan is to use the output, without the first record. The second part of the process should then run something like:

sed -i 's/030702343/030707868/g' so that it updates every transaction (each ST<_>SE block).

Why does the awk return that first record?

flamelord 04-26-2012 07:31 PM

awk prints the first record because you didn't tell it not to: everything before the first "~ST" counts as record 1.

If you want to skip the first record you need to do something like

Code:

awk -v RS="~ST" -v FS="*" 'NR > 1 { print $3}'
That effectively says "print the third field, but only if the record number (NR) is greater than 1".
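
To see the effect of the NR guard, here is a minimal sketch run against made-up data shaped like the file sample in the first post (the variable names `sample`, `all` and `skipped`, and the placeholder fields, are assumptions for illustration). It relies on an awk that accepts a multi-character RS, such as gawk or mawk; POSIX leaves that case unspecified.

```shell
# Placeholder data modelled on the thread's sample, not real file contents.
sample='ISA*00*HDR~ST*837*030702343*A~CLM*7868*XXX~SE*40*030702343~ST*837*030702344*B~CLM*7869*YYY~SE*40*030702344~'

# No guard: the text before the first "~ST" is record 1, so its $3 is printed too.
all=$(printf '%s' "$sample" | awk -v RS='~ST' -v FS='*' '{ print $3 }')

# NR > 1: record 1 (the ISA header) is skipped.
skipped=$(printf '%s' "$sample" | awk -v RS='~ST' -v FS='*' 'NR > 1 { print $3 }')

printf 'without guard:\n%s\n\nwith NR > 1:\n%s\n' "$all" "$skipped"
```

The first list starts with the header's third field; the second contains only the values that follow an "~ST".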

grail 04-27-2012 12:06 AM

I am not sure why you are confused?

Easy enough to look at your examples:
Code:

awk -v RS="~ST" -v FS="*" '{ print $3}'
This will make the records look like:
Code:

ISA*00*<----Sectors---->
*837*030702343*<------MOre SEctors------->~CLM*7868*XXX***<-------More SEctors---->~SE*40*030702343~(more "
<------>~SE----->" sections)

So the third fields are:
Code:

<----Sectors---->
030702343
#note there is a line here but it is blank as only a single field

Then for example 2:
Code:

awk -v RS="~CLM" -v FS="*" '{ print $2}' SSS_SELECTO_10092.x12
This will make the records look like:
Code:

ISA*00*<----Sectors---->~ST*837*030702343*<------MOre SEctors------->
*7868*XXX***<-------More SEctors---->~SE*40*030702343~(more "~ST<------>~SE----->" sections)

And so second fields are:
Code:

00
7868

You are the one indicating what a record should look like. So where is your confusion?
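
One way to check how awk has split the input is to print each record's number and field count next to the field of interest. The following is only a sketch on made-up data in the shape of the sample above (`sample` and `records` are hypothetical names), and it assumes an awk that supports a multi-character RS, such as gawk or mawk.

```shell
# Placeholder data modelled on the thread's sample.
sample='ISA*00*HDR~ST*837*030702343*A~CLM*7868*XXX~SE*40*030702343~'

# Show the record number, the field count, and the second field of every record.
records=$(printf '%s' "$sample" |
  awk -v RS='~CLM' -v FS='*' '{ printf "record %d: NF=%d $2=[%s]\n", NR, NF, $2 }')

printf '%s\n' "$records"
```

Record 1 is everything before the first "~CLM", which is why its second field (00) shows up in the output.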

David the H. 04-27-2012 08:01 AM

It seems to me that you are forgetting that RS is set to newline by default, and if you change it to something else, then awk ignores the line breaks and only divides the text based on the set string, as grail demonstrated.

Perhaps you just need to re-include the newline in RS to make it do what you want.

Code:

RS="(~ST|\n)"

And please use [code][/code] tags around your code and data, to preserve formatting and to improve readability. Please do not use quote tags, colors, or other fancy formatting.
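
As a rough sketch of the regex form of RS, the example below feeds awk two made-up lines, so that both "~ST" and the newline end a record. Treating RS as a regular expression is an extension found in gawk and mawk, not something POSIX guarantees. The test `NF > 1 && $1 == ""` is an assumption added here, not part of the suggestion above: it selects the records that came right after an "~ST", because they start with the "*" separator.

```shell
# Two placeholder lines, each with its own header and ST segment.
ids=$(printf '%s\n' \
  'ISA*00*HDR~ST*837*030702343*A~SE*40*030702343~' \
  'ISA*00*HDR~ST*837*030702344*B~SE*40*030702344~' |
  awk -v RS='(~ST|\n)' -v FS='*' 'NF > 1 && $1 == "" { print $3 }')

printf '%s\n' "$ids"
```

Here NR > 1 alone would not be enough, since the header on the second line becomes a record of its own.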

medirecpr 05-01-2012 03:07 PM

Thanks David, will use the tags next time!
 
Thanks David, this helped.


All times are GMT -5.