LinuxQuestions.org
Old 07-08-2005, 01:02 PM   #1
webshark
LQ Newbie
 
Registered: Sep 2004
Distribution: Slackware 10.1
Posts: 23

Rep: Reputation: 15
Need REG EXP Help to remove line breaks


I have a very long file that I need to edit. If I do this by hand it will take days.

Here's an example of the pattern it's in now...

80000 55 10524 8524 7261 6315 4868 3883 3628 3590 3553
(MC 0#> 25710)
80001 55 10116 8195 6980 6070 4679 3732 3487 3452 3415
(MC 0#> 24750)
80002 55 10116 8195 6980 6070 4679 3732 3487 3452 3415
(MC 0#> 24750)

I need a REG EXP or script or something that will move the "(MC 0#> 2#####)" lines to the preceding line so that the end result looks like this...

80000 55 10524 8524 7261 6315 4868 3883 3628 3590 3553 (MC 0#> 25710)
80001 55 10116 8195 6980 6070 4679 3732 3487 3452 3415 (MC 0#> 24750)
80002 55 10116 8195 6980 6070 4679 3732 3487 3452 3415 (MC 0#> 24750)

Any advice or tips would help a lot. I'm using Slackware 10.1 and I have basic experience with the typical editors. I'm hoping that there is a REG EXP that will do this.

Thanks-
 
Old 07-08-2005, 01:36 PM   #2
Chrax
Member
 
Registered: Apr 2004
Distribution: Dapper
Posts: 167

Rep: Reputation: 31
I think this should do it.

Code:
perl -pi -e 's/\n\(/ (/gs' filename
What this does is turn any newline followed by a parenthesis into a space followed by a parenthesis.

-p loops over the file and prints each line after the substitution; -i makes the edit happen in place.
 
Old 07-08-2005, 01:45 PM   #3
webshark
Thanks for the quick response. It didn't work. Here's what happened.....

dwilson@webshark:~$ perl -pi -e 's/\n (/ (/gs' sanrio070805.txt

Unmatched ( in regex; marked by <-- HERE in m/\n ( <-- HERE / at -e line 1.


The file starts off like this and then repeats over and over...


80013 55 10116 8195 6980 6070 4679 3732 3487 3452 3415
(MC 0#> 24710)
80014 55 10116 8195 6980 6070 4679 3732 3487 3452 3415
(MC 0#> 24750)
80015 55 10116 8195 6980 6070 4679 3732 3487 3452 3415
(MC 0#> 24750)


Any Ideas?

Thanks again!!! :-)
 
Old 07-08-2005, 01:55 PM   #4
Chrax
It's because you replaced a backslash with a space for some reason. After the \n, it should say \(, not " (".

So copy this exactly:
Code:
perl -pi -e 's/\n\(/ (/gs' sanrio070805.txt
 
Old 07-08-2005, 02:14 PM   #5
webshark
THANK YOU!!! THANK YOU!!! THANK YOU!!!

Now, if I may, I have one more question...

Would that same line work for this?

90000 55 5292 5292 5292 4762 2717 2070 1667 1667 1667
(MC 0#> 11720)(MC 100#> 14565)(MC 150#> 16910)(MC 200#> 19750)
(MC 250#> 21600)(MC 300#> 25940)(MC 400#> 29300)(MC 500#> 32485)
90001 55 5292 5292 5292 4762 2717 2070 1667 1667 1667
(MC 0#> 11720)(MC 100#> 14565)(MC 150#> 16910)(MC 200#> 19750)
(MC 250#> 21600)(MC 300#> 25940)(MC 400#> 29300)(MC 500#> 32485)
90002 55 5292 5292 5292 4762 2717 2070 1667 1667 1667
(MC 0#> 11720)(MC 100#> 14565)(MC 150#> 16910)(MC 200#> 19750)
(MC 250#> 21600)(MC 300#> 25940)(MC 400#> 29300)(MC 500#> 32485)


....Where the end result would look like this.........

90000 55 5292 5292 5292 4762 2717 2070 1667 1667 1667 (MC 0#> 11720)(MC 100#> 14565)(MC 150#> 16910)(MC 200#> 19750) (MC 250#> 21600)(MC 300#> 25940)(MC 400#> 29300)(MC 500#> 32485)
90001 55 5292 5292 5292 4762 2717 2070 1667 1667 1667 (MC 0#> 11720)(MC 100#> 14565)(MC 150#> 16910)(MC 200#> 19750) (MC 250#> 21600)(MC 300#> 25940)(MC 400#> 29300)(MC 500#> 32485)
90002 55 5292 5292 5292 4762 2717 2070 1667 1667 1667 (MC 0#> 11720)(MC 100#> 14565)(MC 150#> 16910)(MC 200#> 19750) (MC 250#> 21600)(MC 300#> 25940)(MC 400#> 29300)(MC 500#> 32485)

Thank You Again So Much!!!!!!
 
Old 07-08-2005, 02:20 PM   #6
Chrax
Yes. That should be exactly what it does.
 
Old 07-08-2005, 02:24 PM   #7
webshark
I hate to be a pest.........

Upon a closer look at the file I found it only did it down to about the 26th line and that's it.

Any ideas why this would be the case?

Thanks-
 
Old 07-08-2005, 02:28 PM   #8
Chrax
Don't worry about it.

Can you put the file contents here? I need to see if there's something different, or if that's the last line, or what.
 
Old 07-08-2005, 02:51 PM   #9
webshark
It's about 13000 lines. This is what it looks like at the point where it stopped...



80011 55 10116 8195 6980 6070 4679 3732 3487 3452 3415 (MC 0#> 24750)

80012 55 10116 8195 6980 6070 4679 3732 3487 3452 3415 (MC 0#> 24750)

80013 55 10116 8195 6980 6070 4679 3732 3487 3452 3415
(MC 0#> 24750)
80014 55 10116 8195 6980 6070 4679 3732 3487 3452 3415
(MC 0#> 24750)
 
Old 07-08-2005, 03:03 PM   #10
Chrax
That's peculiar. Could you email me a copy of the file, and I'll play around with it?

Email: effigies [at] gmail [dot] com
 
Old 07-08-2005, 03:15 PM   #11
webshark
Thanks......mail is on its way.
 
Old 07-08-2005, 03:28 PM   #12
webshark
Would this make it easier?

I found that the end result is to be a CSV file.

The main issue I'm having is getting the short line to be added to the end of the previous line....


80040,55,10116,8195,6980,6070,4679,3732,3487,3452,3415,24750
80041,55,10116,8195,6980,6070,4679,3732,3487,3452,3415,24750
80042,55,10116,8195,6980,6070,4679,3732,3487,3452,3415,24750
80043,55,10116,8195,6980,6070,4679,3732,3487,3452,3415
24750
80044,55,10116,8195,6980,6070,4679,3732,3487,3452,3415
24750
80045,55,10116,8195,6980,6070,4679,3732,3487,3452,3415
24750
80046,55,10116,8195,6980,6070,4679,3732,3487,3452,3415
24750
80047,55,10116,8195,6980,6070,4679,3732,3487,3452,3415
24750



Thanks Again! :-)
 
Old 07-08-2005, 03:40 PM   #13
Chrax
Ok, I'm having some serious troubles. That line I gave you in the first place isn't working, and I have no idea why.

Now for this problem, I would say

Code:
perl -pi -e 's/\n(\d+)\n/,$1\n/sg' filename
For some reason I can't test this out on my own, so I hope this works.
 
Old 07-08-2005, 03:52 PM   #14
Chrax
Okay, assuming you've got some memory to spare, these definitely work.
For the earlier problem:

Code:
perl -e 'local($/);$file = shift @ARGV; open IN, $file; $contents = <IN>; close IN; $contents =~ s/\n\(/ (/gs; open OUT, ">$file";print OUT $contents;' filename
For the latest:

Code:
perl -e 'local($/);$file = shift @ARGV; open IN, $file; $contents = <IN>; close IN; $contents =~ s/([\d,]+)\n(\d+)\n/$1,$2\n/g; open OUT, ">$file";print OUT $contents;' filename
Now, one thing I noticed, when I got these files, they were in DOS format. My scripts don't account for that. But it's pretty simple to change them to do so. Just put a \r in front of every \n you see in them.
 
Old 07-08-2005, 03:56 PM   #15
webshark
It worked for only approx 44 lines.

Is there a way to grab the lines that did get changed and put them into another file,
then run the line again and repeat until there is nothing left in the old file and everything
is correct in the new file?

I don't know, maybe the use of a pipe "|" command or something.


Ideas? Thoughts? Comments?

Thanks-
 
  

