LinuxQuestions.org


jozelo 04-04-2011 07:07 AM

awk does not print the value of the variable as I expected, please help
I have this simple awk program, but it does not print var_cab when it matches ^E, ^S or ^M after it finds the second ^C.

/^C/ { var_cab = substr($0,1,28); ent = 0 ; print "estoy en cabecera" var_cab;}
/^E/ { if ( ent ==0 ) var_fech = substr($0,4,8) ; ent ++;
print var_cab var_fech $0; }
/^S/ { print var_cab var_fech $0; }
/^M/ { print var_cab var_fech $0; }
END { print var_cab,ent}


The input file has many lines like this; every new block starts with ^CAB.

CABES3000088888880000007EAMB
ENT20090706D060709-888 0028560000000012VALLE CA'ZULIA5501, 5502, 5498 y 5535
SAL201008250000134900000321V1202935-MU

SAL201011170001361000005245DES3008888777
5577
SAL201011240002224200008601V1202935-MU

SAL201011290001605400016054V1202935-MU

MEN20090701010026108100000020
MEN20090801020026096500000040
MEN20090901030026072900000066
MEN20091001040026044800000101
MEN20091101050025935600000135
CABES300030640069001EBSS
ENT20090324CU-L53-240309 0027839900000021ABSA 133225
SAL201006010004794800021040006,6900V1205439-MU

SAL201007010005631000024981006,5100V1205439-MU

SAL201010010000034400000199006,6800V1205439-MU

MEN20100101110024285000000400
MEN20100201120024211600000410
MEN20100301130024182100000417
MEN20100401140024153100000425
CABES300030640069005EBSS
ENT20080823CU-L50-230808 0026850400000030ABSA 129033
MEN20110101300012104000000942
MEN31/02/2011310012070900000951
CABES300030640069006EBSS
ENT20090120CU-L52-200109 0028987000000019AQUASTREM-G 17NR00000000010618
SAL201005010001311400005219006,4300V1205439-MU

SAL201006010000528300002086006,6900V1205439-MU

SAL201007010009972900046519006,5100V1205439-MU

MEN20100101130025965100000407
MEN20100201140025855500000417
MEN20100301150025775800000425
MEN20100401160025703100000434
MEN20100501170024350700000405
MEN20100601180023624400000455
MEN20100701190012744100000515
MEN20100801200012658400000573
MEN20100901210012522800000641
.... many more lines

Kenhelm 04-04-2011 08:30 AM

If the data was created in Windows there could be a carriage return character at the end of each line.
The first ^C line has 28 characters so substr($0,1,28) omits the carriage return.
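The next ^C lines have only 24 characters so substr($0,1,28) includes the carriage return. When var_cab is then printed, that carriage return moves the cursor back to the start of the line, so whatever is printed after it overwrites var_cab on the terminal.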
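
To see this in isolation, here is a minimal sketch. The two header lines are copied from the sample above; the CRLF endings are an assumption about the real file, and the variable name cab is only for illustration.

```shell
# Two header lines with DOS (CRLF) endings: 28 and 24 characters long.
out=$(printf 'CABES3000088888880000007EAMB\r\nCABES300030640069001EBSS\r\n' |
  awk '/^C/ {
    cab = substr($0, 1, 28)              # same call as in the original script
    print length(cab), (index(cab, "\r") ? "contains CR" : "no CR")
  }')
printf '%s\n' "$out"
```

On a Unix awk this should report that the 28-character header comes out clean, while the 24-character one is returned as 25 characters with the carriage return inside it.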

Carriage returns can be removed from the file with GNU sed
sed 's/\r//g'

or with
tr -d '\r'
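
A quick way to check that the carriage return is really gone, sketched with one header line from the sample (the CRLF ending is assumed, and before/after are just illustrative variable names):

```shell
# Length of the record as awk sees it, without and with tr -d '\r' in front.
before=$(printf 'CABES300030640069001EBSS\r\n' | awk '{ print length($0) }')
after=$(printf 'CABES300030640069001EBSS\r\n' | tr -d '\r' | awk '{ print length($0) }')
echo "before: $before, after: $after"
```

In practice the same filter would sit in front of the real script, reading the data file on standard input.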

David the H. 04-04-2011 09:19 AM

For that matter, awk can do it too. Add this line at the top of the script, before the other rules, to change DOS line endings to Unix line endings.
Code:

{ sub(/\r$/,""); }
Or as another option, try adding this to your script:
Code:

BEGIN{ RS="\n|\r\n"; }
This should allow it to handle both DOS- and Unix-style line endings, without actually converting them. Note that a regular expression in RS is an extension (gawk and mawk support it), so it may not work in every awk. At least it works for me in testing. :)

file filename.txt should tell you which kind of endings you have. If it says "with CRLF line terminators", then it's a DOS-format file.
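
Here is a small side-by-side sketch of the two awk approaches on an input with mixed endings. The three-record input and the helper name mixed are made up for the illustration, and the RS variant assumes an awk that accepts a regular expression there, such as gawk or mawk.

```shell
# Hypothetical three-record input with mixed endings: DOS, Unix, DOS.
mixed() { printf 'one\r\ntwo\nthree\r\n'; }

# Approach 1: strip a trailing carriage return from every record.
by_sub=$(mixed | awk '{ sub(/\r$/, ""); print length($0) }' | tr '\n' ' ')

# Approach 2: let RS match either ending (regex RS; assumes gawk or mawk).
by_rs=$(mixed | awk 'BEGIN { RS = "\n|\r\n" } { print length($0) }' | tr '\n' ' ')

echo "sub: $by_sub"
echo "RS:  $by_rs"
```

If both lines show the same record lengths, neither approach is leaving a stray carriage return in $0.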

jozelo 04-04-2011 01:14 PM

Thanks a lot, it works perfectly with ...
 

the BEGIN{ RS="\n|\r\n"; } line, to recognize the end of each record.

jozelo 04-04-2011 01:17 PM

Thanks a lot, it worked perfectly with the BEGIN{ RS="\n|\r\n"; }
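
For anyone finding this later, this is roughly what the script looks like with that line added. It is a sketch, not the exact file: the sample is just a few lines from the first post rewritten with CRLF endings (an assumption about the real data), sample and out are illustrative names, and the RS line assumes gawk or mawk.

```shell
# Build a small CRLF sample from lines quoted in the thread.
sample=$(mktemp)
awk '{ printf "%s\r\n", $0 }' > "$sample" <<'EOF'
CABES3000088888880000007EAMB
ENT20090706D060709-888 0028560000000012VALLE CA'ZULIA5501, 5502, 5498 y 5535
CABES300030640069001EBSS
ENT20090324CU-L53-240309 0027839900000021ABSA 133225
MEN20100101110024285000000400
EOF

# Original script plus the BEGIN rule that accepts both line endings.
out=$(awk '
BEGIN { RS = "\n|\r\n" }
/^C/ { var_cab = substr($0, 1, 28); ent = 0; print "estoy en cabecera" var_cab }
/^E/ { if (ent == 0) var_fech = substr($0, 4, 8); ent++; print var_cab var_fech $0 }
/^S/ { print var_cab var_fech $0 }
/^M/ { print var_cab var_fech $0 }
END  { print var_cab, ent }
' "$sample")
printf '%s\n' "$out"
rm -f "$sample"
```

With the record separator handling the carriage return, var_cab for the second block should show up in front of the date and the MEN line instead of being overwritten.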

