bash/sed/awk fill each line in text file with space to fixed length
Hello all,
I need to reformat a text file so that every line has a fixed length of 1968 bytes, padding with spaces any line that is shorter than 1968 bytes. How can I accomplish this using sed/awk/bash? My sample data is as follows:
Code:
... Code:
... |
Code:
awk 'length <= 1968 { printf "%-1968s\n",$0 }' "file" |
Or:
Code:
awk '{printf "%-1968s\n",$0}' data
Code:
(IFS=$'\n';printf "%-1968s\n" $(<data)) |
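One detail about the commands above: the version with `length <= 1968` skips any longer line entirely, while the version without a pattern prints a longer line at its full length, so the output is not strictly fixed-width in either case. If over-long lines should be cut to the width instead, a precision can be added to the format. This is only a sketch under that assumption (truncation was not asked for in the original question); `pad_lines` is a made-up helper name, and a width of 5 is used so the result is easy to see.

```shell
# Pad short lines with spaces and (assumption) truncate long ones.
# pad_lines is a hypothetical helper; $1 is the target width.
pad_lines() {
    # "%-W.Ws": left-justify in W columns, print at most W characters
    awk -v w="$1" '{ printf("%-" w "." w "s\n", $0) }'
}

# Demo with width 5; spaces are shown as dots for readability.
padded=$(printf 'ab\nabcdefgh\n' | pad_lines 5 | tr ' ' '.')
printf '%s\n' "$padded"
```

For the real data this would be called as `pad_lines 1968 < data`.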
Thanks a lot, ghostdog74 and radoulov, for the solutions. They all work for files that have CR/LF line endings.
But my situation is this: the 'data' file is currently a single-line file (as on magnetic tape) with no CR/LF characters. I know that each record is 123 bytes long, so I can get line-based 'data' like the above if I run:
Code:
fold -w 123 data > DataTemp
Code:
awk 'length <= 1968 { printf "%-1968s",$0 }' DataTemp
Code:
(IFS=$'\n';printf "%-1968s" $(<DataTemp))
Visually, I want to format this kind of data (no CR/LF in either input or output):
Code:
the awk script format from this |
Like this?
GNU Awk: Code:
awk --re-interval '{printf "%-1968s",RT}' RS=".{123}" data |
That's it! Bingo! Thanks a lot, people! :cool:
Can you explain the command (the RT and RS parameters) and the notation? Wow, how did you guys master sed/awk? Are there any good tutorials? Thank you very much. |
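On the RS/RT question: in awk, RS is the record separator. GNU awk treats an RS longer than one character as a regular expression, and it stores the text that actually matched the separator in RT (RT is a GNU awk extension). With RS=".{123}" every run of 123 characters counts as a "separator", so $0 stays empty and RT holds the 123-byte record, which printf then pads. The following is a small sketch of the same idea with 3-byte records padded to 5. The `RT != ""` guard and `LC_ALL=C` are additions of mine: the guard skips a trailing partial record, and the C locale makes `.` match single bytes.

```shell
# Needs GNU awk (gawk): RT is not available in other awks.
# Each match of RS='.{3}' is one 3-byte record, delivered in RT.
out=$(printf 'AAABBBCCC' |
    LC_ALL=C gawk --re-interval -v RS='.{3}' 'RT != "" { printf "%-5s", RT }' |
    tr ' ' '_')        # show the padding as underscores
printf '%s\n' "$out"
```

Recent gawk versions enable interval expressions such as {3} by default, so `--re-interval` is kept here only to match the command above.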
Another scenario (but still related to the topic):
I have a text file (it's actually a CDR, Call Data Record) in the following format, but with no CR/LF characters:
|---VOL 80bytes---|
|---HDR1 80bytes---| |---HDR2 80bytes---| |-----Data_1 123bytes-----| |-----Data_2 123bytes-----| ... |-----Data_n 123bytes-----| |---EOF1 80bytes---| |---EOF2 80bytes---|
|---HDR1 80bytes---| |---HDR2 80bytes---| |-----Data_1 123bytes-----| |-----Data_2 123bytes-----| ... |-----Data_n 123bytes-----| |---EOF1 80bytes---| |---EOF2 80bytes---|
|---HDR1 80bytes---| |---HDR2 80bytes---| |-----Data_1 123bytes-----| |-----Data_2 123bytes-----| ... |-----Data_n 123bytes-----| |---EOF1 80bytes---| |---EOF2 80bytes---|
My task is to split this file into individual files, each running from HDR1 to EOF2, so in this example there will be 3 individual files. I'm trying to use only sed/awk/dd/bash for this task. Can anyone help me? In addition, I also want each record (VOL, HDR, Data, EOF) to become 1968 bytes long (filled with spaces, as in the earlier posts in this thread). Thank you very much in advance. |
If you post a sample from the real input data
and an example (with real data) of the desired output, it would be easier. |
ok radoulov,
here is my sample data; sample_data (2885bytes) Code:
VOL148M005 48BATDT 1HDR1TTFILE00 48M00500010001814300 07338 00000 000000 HDR2F0196801968 00 0000036253218576268899 01000071203222436000414000001 0 KLJDT7OBAT4C7I02200 0 000001225488200342969982 01000071203222837000014000005 0 TAG2C7O2MKPI7I02100 0 000403620157880122845822 01000071203222807000044000006 0 2MKPO7ODTAC7I 02100 0 000001462181140340251050 01000071203222843000009000008 0 SET4C7O6MSAI7I02100 0 000001623519100380704228 01000071203222833000020000009 0 BGIDT7O6MSAI7I02100 0 EOF1TTFILE00 48M00500010001814300 07338 00000 002000 EOF2F0196801968 00 HDR1TTFILE00 48M00500010001814300 07338 00000 000000 HDR2F0196801968 00 0000036253218576268899 01000071203222436000414000001 0 KLJDT7OBAT4C7I02200 0 000001225488200342969982 01000071203222837000014000005 0 TAG2C7O2MKPI7I02100 0 000403620157880122845822 01000071203222807000044000006 0 2MKPO7ODTAC7I 02100 0 000001462181140340251050 01000071203222843000009000008 0 SET4C7O6MSAI7I02100 0 000001623519100380704228 01000071203222833000020000009 0 BGIDT7O6MSAI7I02100 0 EOF1TTFILE00 48M00500010001814300 07338 00000 002000 EOF2F0196801968 00 HDR1TTFILE00 48M00500010001814300 07338 00000 000000 HDR2F0196801968 00 0000036253218576268899 01000071203222436000414000001 0 KLJDT7OBAT4C7I02200 0 000001225488200342969982 01000071203222837000014000005 0 TAG2C7O2MKPI7I02100 0 000403620157880122845822 01000071203222807000044000006 0 2MKPO7ODTAC7I 02100 0 000001462181140340251050 01000071203222843000009000008 0 SET4C7O6MSAI7I02100 0 000001623519100380704228 01000071203222833000020000009 0 BGIDT7O6MSAI7I02100 0 EOF1TTFILE00 48M00500010001814300 07338 00000 002000 EOF2F0196801968 00 |--VOL 80bytes--|--HDR1 80bytes--|--HDR2 80 bytes--|---Data_1 123bytes--|-Data_2 123bytes--|--Data_3 123bytes--|--Data_4 123bytes--|--Data_5 123bytes--|--EOF1 80 bytes--|--EOF2 bytes--|--HDR1 80bytes--|--HDR2 80 bytes--|---Data_1 123bytes--|-Data_2 123bytes--|--Data_3 123bytes--|--Data_4 123bytes--|--Data_5 123bytes--|--EOF1 80 bytes--|--EOF2 bytes--|--HDR1 
80bytes--|--HDR2 80 bytes--|---Data_1 123bytes--|-Data_2 123bytes--|--Data_3 123bytes--|--Data_4 123bytes--|--Data_5 123bytes--|--EOF1 80 bytes--|--EOF2 bytes--| sample_out (5488bytes) Code:
VOL148M005 48BATDT 1 HDR1TTFILE00 48M00500010001814300 07338 00000 000000 HDR2F0196801968 00 0000036253218576268899 01000071203222436000414000001 0 KLJDT7OBAT4C7I02200 0 000001225488200342969982 01000071203222837000014000005 0 TAG2C7O2MKPI7I02100 0 000403620157880122845822 01000071203222807000044000006 0 2MKPO7ODTAC7I 02100 0 000001462181140340251050 01000071203222843000009000008 0 SET4C7O6MSAI7I02100 0 000001623519100380704228 01000071203222833000020000009 0 BGIDT7O6MSAI7I02100 0 EOF1TTFILE00 48M00500010001814300 07338 00000 002000 EOF2F0196801968 00 HDR1TTFILE00 48M00500010001814300 07338 00000 000000 HDR2F0196801968 00 0000036253218576268899 01000071203222436000414000001 0 KLJDT7OBAT4C7I02200 0 000001225488200342969982 01000071203222837000014000005 0 TAG2C7O2MKPI7I02100 0 000403620157880122845822 01000071203222807000044000006 0 2MKPO7ODTAC7I 02100 0 000001462181140340251050 01000071203222843000009000008 0 SET4C7O6MSAI7I02100 0 000001623519100380704228 01000071203222833000020000009 0 BGIDT7O6MSAI7I02100 0 EOF1TTFILE00 48M00500010001814300 07338 00000 002000 EOF2F0196801968 00 HDR1TTFILE00 48M00500010001814300 07338 00000 000000 HDR2F0196801968 00 0000036253218576268899 01000071203222436000414000001 0 KLJDT7OBAT4C7I02200 0 000001225488200342969982 01000071203222837000014000005 0 TAG2C7O2MKPI7I02100 0 000403620157880122845822 01000071203222807000044000006 0 2MKPO7ODTAC7I 02100 0 000001462181140340251050 01000071203222843000009000008 0 SET4C7O6MSAI7I02100 0 000001623519100380704228 01000071203222833000020000009 0 BGIDT7O6MSAI7I02100 0 EOF1TTFILE00 48M00500010001814300 07338 00000 002000 EOF2F0196801968 00 |---VOL 196bytes---|---HDR1 196bytes---|---HDR2 196bytes---|---Data_1 196bytes---|---Data_2 196bytes---|---Data_3 196bytes---|---Data_4 196bytes---|---Data_5 196bytes---|---EOF1 196bytes---|---EOF2 196bytes---|---HDR1 196bytes---|---HDR2 196bytes---|---Data_1 196bytes---|---Data_2 196bytes---|---Data_3 196bytes---|---Data_4 196bytes---|---Data_5 196bytes---|---EOF1 
196bytes---|---EOF2 196bytes---|---HDR1 196bytes---|---HDR2 196bytes---|---Data_1 196bytes---|---Data_2 196bytes---|---Data_3 196bytes---|---Data_4 196bytes---|---Data_5 196bytes---|---EOF1 196bytes---|---EOF2 196bytes---|
Because of the post size restriction in this forum, I fixed the record length in the sample output at 196 bytes instead of the 1968 bytes discussed before. Thanks. |
You may need to adjust the fieldwidths:
Code:
awk --re-interval '/HDR/ { |
Thanks a lot, radoulov. Your solution works for the sample file, but what if the number of data blocks is variable rather than fixed at five per section (from HDR1 to EOF2)? I may have more or fewer than five data blocks in each section. In the real file (about 7 MB in size) I have almost 32000 data blocks in each section.
Thanks. |
I understand.
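If the number of 80-byte records is fixed (i.e. always four: HDR1, HDR2, EOF1 and EOF2) and the number of 123-byte records varies, the number of 123-byte fields for FIELDWIDTHS can be calculated as (length-320)/123. In my example it is (length-240)/123, because the EOF2 record is part of the RS/RT. For the five-record sample section, the length is 4*80 + 5*123 = 935 bytes, and (935-320)/123 = 5. |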
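For a variable number of data records, another option that avoids counting fields is to walk through the data and decide the length of each record from its first four characters. The sketch below is not the FIELDWIDTHS script from the earlier post; it is a separate, plain-awk illustration. It assumes that label records always start with VOL1, HDR1, HDR2, EOF1 or EOF2 (as in the sample data), that data records never start with those tags, and that the whole file fits in memory as one awk record. `split_cdr`, its arguments and the `section.N` file names are made up for the example, and the demo uses shortened records (8 and 10 bytes, padded to 12).

```shell
# Sketch: split a newline-free dump into one file per HDR1..EOF2 section,
# padding every record with spaces to a fixed width.
# split_cdr INPUT PREFIX [LABEL_LEN DATA_LEN WIDTH]   (hypothetical helper)
split_cdr() {
    LC_ALL=C awk -v prefix="$2" -v L="${3:-80}" -v D="${4:-123}" -v W="${5:-1968}" '
    {
        n = length($0)
        for (pos = 1; pos <= n; pos += len) {
            tag = substr($0, pos, 4)
            # Assumption: only label records start with one of these tags.
            len = (tag ~ /^(VOL1|HDR1|HDR2|EOF1|EOF2)$/) ? L : D
            if (tag == "HDR1") out = prefix "." (++count)   # new section starts
            if (out != "") printf("%-" W "s", substr($0, pos, len)) > out
            if (tag == "EOF2") { close(out); out = "" }     # section finished
        }
    }' "$1"
}

# Demo with made-up records: 8-byte labels, 10-byte data, padded to 12.
demo_dir=$(mktemp -d)
printf '%s' 'VOL1xxxx' \
    'HDR1aaaa' 'HDR2bbbb' '0123456789' '9876543210' 'EOF1cccc' 'EOF2dddd' \
    'HDR1eeee' 'HDR2ffff' '1111111111' 'EOF1gggg' 'EOF2hhhh' > "$demo_dir/data"
split_cdr "$demo_dir/data" "$demo_dir/section" 8 10 12
wc -c "$demo_dir"/section.*
```

With the defaults (80, 123, 1968) the call for the real file would be `split_cdr data section`. The VOL record comes before the first HDR1, so it is skipped rather than written to any section file.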