LinuxQuestions.org

stevanity 06-04-2013 09:09 AM

Adding lines to each block in a multicolumn text file
 
Hello!
I have a text file with lots of addresses. I need to add a certain line on top of each address. My file looks like this now:

Code:

g51/b18468 Postgg On 30/05/2013      N51/b39897 Postgg On 30/05/2013      LR51/b23428 Postgg On 30/05/2013
Rgv. XXXXX                          Mr. bBnMbNNbN.M                      Bro. bBRbgbM .S,
KbgbNg bSSgMBLY Og gOg,              LOT 92, KbMPUNg                      gRbgg gOMMUNnTY ggNTRg,
52, gLN. TnMUR,                      BbTU 4, 43950                        4, LORONg SS 23/6g,
POST BOX 20, 43007 KbgbNg            SUNgbn PgLnK                        47400 PgTbLnNg gbYb,
MbLbYSnb                            SgLbNgUR                            SgLbNgOR, W.MbLbYSnb
                                    MbLbYSnb                            MbLbYSnb


ML51/b13179 Postgg On 30/05/2013    W51/b41363 Postgg On 30/05/2013      Lg51/b29053 Postgg On 30/05/2013
Bro. bBRbgbM .V                      Bro. bLbPPbN                        Pbstor. bLgRgg PRbgbSbM
NO:18,PgRSnbRbN                      NO.60, LORONg SgRbn PgRMbn 9,        1-2-2, MggbN nNTbN bPbRTMgNT,
RnSgbg .7                            TbMbN SgRbn PgRMbn,                  gbLbN 1/21g Ogg gbLbN gOMBbK
TbMbN RnSgbg                        34300, BbTbN SgRbn,                  53000 KUbLb LUMPUR
30100 nPOg, PgRbK                    PgRbK                                MbLbYSnb
MbLbYSnb                            MbLbYSnb


LR510b13172 Postgg On 30/05/2013    ML51/b9877 Postgg On 30/05/2013      LR51/b9905 Postgg On 30/05/2013
Sns. bLLbMbg (b) NgSbM              Bro. bLVnN bNbNg .b                  Mrs. bNnTgb VngTOR
NO:44,gbLbN gbMbn                    NO:3,gbLbN TgRbTbn 3                7,gbLbN PbRn
TbMbN TbPbg gbYb                    TbMbN SbRn,TgRbTbn                  PbRn gbRggNS
35000 TbPbg                          44000 KUbLb KUBU BgbRU              nPOg,30100
PgRbK,                              SgLbNgOR                            PgRbK
MbLbYSnb                            MbLbYSnb                            MbLbYSnb


ML51/b13180 Postgg On 30/05/2013    ML51/b13203 Postgg On 30/05/2013    g51/b9942 Postgg On 30/05/2013
Bro. bNTONY                          Bro. bNTONYSWbMY                    Pbstor. bROKnbNbTgbN
NO:399,LORONg KgNbVn 3/2b            LbgbNg gUSUN gURnbM                  NO:M4 g/7
TbMbN KgNbRn                        42700 BbNTnNg                        gbLbN PbNgbN nNgbg 4/1b
09000 KULnM                          SgLbNgOR g.g                        PbNgbN nNgbg
Kggbg                                MbLbYSnb                            55100,KUbLb LUMPUR
MbLbYSnb                                                                  MbLbYSnb

I want it to look like this:
Code:

My New Line                          My New Line                          My New Line
g51/b18468 Postgg On 30/05/2013      N51/b39897 Postgg On 30/05/2013      LR51/b23428 Postgg On 30/05/2013
Rgv. XXXXX                          Mr. bBnMbNNbN.M                      Bro. bBRbgbM .S,
KbgbNg bSSgMBLY Og gOg,              LOT 92, KbMPUNg                      gRbgg gOMMUNnTY ggNTRg,
52, gLN. TnMUR,                      BbTU 4, 43950                        4, LORONg SS 23/6g,
POST BOX 20, 43007 KbgbNg            SUNgbn PgLnK                        47400 PgTbLnNg gbYb,
MbLbYSnb                            SgLbNgUR                            SgLbNgOR, W.MbLbYSnb
                                    MbLbYSnb                            MbLbYSnb

for all addresses, of course, while maintaining the same formatting. I don't know how to achieve this. Can you guys help me out with this?

Your help is much appreciated! :)

Thank you.

Beryllos 06-04-2013 10:43 AM

In which word processor or programming language are you planning to accomplish this? What have you tried so far?

If every address field has the same height and width, including the blank padding below and to the right of each address, you don't even have to do anything sophisticated like searching for patterns. You could just write a word processor script (or macro) to repeatedly count lines and insert the new text. The new text would not be
Code:

My New Line
but rather
Code:

My New Line                          My New Line                          My New Line
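
The counting idea can be sketched from the command line as well. This is only a rough illustration, assuming every block really does occupy the same number of lines (in the sample above that looks like 9: seven lines of text plus two blank ones). The function name add_header_every is made up for this example.

```shell
# add_header_every PERIOD HEADER
# Hypothetical helper: prints HEADER before line 1, PERIOD+1, 2*PERIOD+1, ...
# Assumption: every block, blank separator lines included, is exactly
# PERIOD lines tall.
add_header_every() {
    awk -v n="$1" -v h="$2" '(NR - 1) % n == 0 { print h } { print }'
}

# Tiny demo with 3-line blocks (two text lines plus one blank line).
printf 'a\nb\n\nc\nd\n\n' | add_header_every 3 'HDR'
```

For the posted file the call would be along the lines of add_header_every 9 "My New Line ..." < input.txt, with the header padded out to three columns. If any block has a different height, the count drifts and the headers land in the wrong place.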

Beryllos 06-04-2013 10:47 AM

By the way, I have suggested a way to handle this as a one-time problem. If you plan to maintain this address list and modify it in the future, you really should rearrange it into one column or some other structure or database that will make your future work easier.

eklavya 06-04-2013 11:04 AM

I can give you an idea.
Whenever two or more consecutive blank lines appear, replace the last blank line with your new line.

Then, on that same line, add your new text again after a certain number of characters.
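
A rough sketch of the blank-line idea, with one deliberate change: instead of replacing the last blank line, it keeps the blank lines and prints the header just before the text that follows them, so the spacing between blocks stays the same. The function name add_header_after_gap is invented for this example, and the header string passed in is assumed to be already padded out to three columns.

```shell
# add_header_after_gap HEADER
# Hypothetical helper: prints HEADER before the first text line of the file
# and before the first text line that follows two or more blank lines.
add_header_after_gap() {
    awk -v h="$1" '
        /^[ \t]*$/           { blanks++; print; next }  # blank line: count it and pass it through
        !seen || blanks >= 2 { print h }                # first text line of a block
                             { seen = 1; blanks = 0; print }
    '
}

# Tiny demo: two blocks separated by two blank lines.
printf 'a\nb\n\n\nc\n' | add_header_after_gap 'HDR'
```

Unlike counting lines, this does not care how tall each block is, only that blocks are separated by at least two blank lines.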

David the H. 06-04-2013 03:12 PM

This works with the exact text above. You just have to get the line formatting right.
Code:

awk 'BEGIN{ RS=ORS="\n\n\n"; str="My New Line" ; sp="                          " } { $0=str sp str sp str"\n" $0 ; print }'  input.txt
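The input and output record separators (RS and ORS) have both been set to three contiguous newlines, which is what two blank lines between blocks amounts to. Each record is simply prepended with the desired string and then printed. I used variables for the string and padding spaces just to compact things a bit. Note that a multi-character RS is not something POSIX guarantees; it works in gawk and mawk, but may not in every awk.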

It could probably be done somewhat cleaner with printf instead, but this works well enough.
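
As a sketch of what the printf route might look like, the header line can be built by padding the string to the column width rather than typing the spaces by hand. The function name make_header is invented here, and the width of 37 is taken from the 11-character string plus 26 spaces of padding used above.

```shell
# make_header STRING WIDTH COLS
# Hypothetical helper: repeats STRING across COLS columns, padding every
# column except the last to WIDTH characters.
make_header() {
    awk -v s="$1" -v w="$2" -v c="$3" 'BEGIN {
        fmt = "%-" w "s"                         # left-justify, pad to w characters
        for (i = 1; i < c; i++) line = line sprintf(fmt, s)
        print line s                             # last column is left unpadded
    }'
}

header=$(make_header "My New Line" 37 3)
printf '%s\n' "$header"

# The result could then replace str and sp in the one-liner above, e.g.:
# awk -v hdr="$header" 'BEGIN { RS = ORS = "\n\n\n" } { print hdr "\n" $0 }' input.txt
```

This way only the column width has to be right, not a hand-counted run of spaces.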

Finally, I agree with Beryllos. Lists of database-style entries are easier to manipulate if each record is kept separate.

Beryllos 06-04-2013 04:04 PM

slightly off topic, but fun...

stevanity, it appears that your sample text has been put through a substitution cipher. Before we all have some fun cracking it, I need to ask you whether it is important, for reasons of privacy or security, to keep the names and addresses secret. If so, you should delete them.

unclesamcrazy 06-05-2013 02:55 AM

@David the H.

Excellent !!!

Can you please explain exactly what you did to achieve this using awk?

I understood that in
Quote:

str="My New Line" ; sp="                          " }
the two together come to 37 characters, the exact width of one address column, but I could not understand $0. What does it do?
I also could not understand what RS=ORS="\n\n\n"; and "\n" $0 ; do.

Can you please explain this?

Thanks !!! :)

David the H. 06-06-2013 01:06 PM

Well, a full explanation really requires understanding something of how awk works. Check out the links below.

But in brief, awk divides the input into records, and then further subdivides the records into fields. By default a record is a line, and a field is a word, but this can be changed. RS is the input record separator variable, which I set to match three consecutive newlines. ORS is the output record separator, which needs to be set to the same as the input if you want to keep the same formatting.

$0 refers to the current record as a whole. So the command just re-sets it to be equal to the new line plus itself, then prints it.
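
Here is a toy illustration of those three pieces, using a single-character separator so that it should behave the same in any awk. The sample input and the "new " prefix are made up for the demo.

```shell
# Records are split on ";" (RS) instead of on newlines. Each record ($0) is
# rewritten with a prefix, and print appends "|" (ORS) after every record.
out=$(printf 'one two;three four' |
    awk 'BEGIN { RS = ";"; ORS = "|" } { $0 = "new " $0; print }')

printf '%s\n' "$out"
# prints: new one two|new three four|
```

The command in my earlier post does the same thing, except that the separator is three newlines and the prefix is the new header line followed by "\n".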

Here are a few useful awk references:
http://www.grymoire.com/Unix/Awk.html
http://www.gnu.org/software/gawk/man...ode/index.html
http://www.pement.org/awk/awk1line.txt
http://www.catonmat.net/series/awk-one-liners-explained

karim.ouda 06-12-2014 01:51 AM

This doesn't look like real data, but if "Postgg On" followed by a date in this format appears in the first line of every record, the following may work.
Code:

sed 's/.*Postgg On [0-9][0-9]\/[0-9][0-9].*/My New Line                          My New Line                          My New Line\n&/' input.txt
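
One caveat: "\n" in the replacement part of a sed substitution is not guaranteed to produce a newline outside GNU sed. A possible alternative is the same pattern test in awk, sketched below. The function name add_header_on_match is invented for this example, and it makes the same assumption as the sed command, namely that "Postgg On" plus a date appears only on the first line of each block.

```shell
# add_header_on_match HEADER
# Hypothetical helper: prints HEADER before every line that contains
# "Postgg On" followed by a dd/mm date.
add_header_on_match() {
    awk -v h="$1" '/Postgg On [0-9][0-9]\/[0-9][0-9]/ { print h } { print }'
}

# Tiny demo on a one-column record.
printf 'g51 Postgg On 30/05/2013\nRgv. XXXXX\n' | add_header_on_match 'HDR'
```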


All times are GMT -5.