LinuxQuestions.org
05-25-2010, 01:10 PM   #1
sharky
modify lines that are too long


I have a CDL netlist with 5630 lines. 512 of the lines are over 128 characters. The tool I am using to read in the CDL returns an error for each line over 128 characters.

A line that is too long can be fixed by inserting a line-continuation symbol, in this case a "/", somewhere before the 128th character, followed by a line feed and a "+" at the start of the continuation line.

Example (pretend it's a long line):
before:
this line is too long
after:
this line /
+ is too long

Part of the problem is that I can't break at a fixed column before the 128th character, because that could split a term in two.

bad:
this line i /
+ s too long

If I can replace the last space before the 128th character with " / \n+ " on every line that is over 128 characters, then I'm golden. I'm not sure whether I need to escape the + or not. If so, the substitution is " / \n\+ ". And if I use sed, then I'll have to escape the \.

I'll be digging through awk and sed references, but if someone has an answer then please save me the work. :)
 
05-25-2010, 03:06 PM   #2
sharky
I've started a Perl script but am not making much progress. I open the file and split each line on whitespace into a list, @list. Somewhere in that list I'll append " /" to one element, $list[n], and prepend "+ " to the next, $list[n+1]. Then I need to write the list back to $line.
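
The split-and-rejoin plan described above can be sketched as follows. This is only a rough illustration, written in awk rather than Perl so that it matches the code in the replies. The function name wrap_fields and the optional max argument are made up for this example, and the break point is chosen from the running length of the rebuilt line rather than from a fixed list element.

```shell
# wrap_fields: hypothetical helper name. Reads stdin, writes to stdout.
# Usage: wrap_fields [max]   (max defaults to 128)
wrap_fields() {
  awk -v max="${1:-128}" '
  length($0) <= max { print; next }      # short lines pass through untouched
  {
    out = $1
    for (i = 2; i <= NF; i++) {
      # Would "out", a blank, the next term and a trailing " /" still fit?
      if (length(out) + length($i) + 3 > max) {
        print out " /"                    # close this piece with the continuation symbol
        out = "+ " $i                     # start the continuation line
      } else {
        out = out " " $i
      }
    }
    print out
  }'
}
```

Something like `wrap_fields < netlist.cdl > wrapped.cdl` would be the intended use. Because the line is rebuilt from its fields, runs of blanks or tabs in a long line collapse to single blanks, and a single term longer than the limit is still left too long.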
 
05-25-2010, 03:56 PM   #3
colucix
Yes, the idea is good. Moreover, you don't need to split the line if there is a function that gives the position of the blank spaces. Since I'm not very good with Perl, here is a possible solution in awk:
Code:
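length > 128 {
  string = $0
  p = match(string, / /)    # position of the first blank
  
  while ( p ) {
    pre = p                 # remember the last blank seen so far
    sub(/ /, "_", string)   # mask it so that match() finds the next one
    p = match(string, / /)
    if ( p > 127 ) break    # the next blank is too far to the right: stop
  }
  
  # break at pre: the blank is kept on both sides, and "+" starts the new line
  $0 = ( substr($0, 1, pre) "/\n+" substr($0, pre) )
}
1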
Here the match function gives the index of the first character of the matching substring (in this case the position of the leftmost single blank space). Inside the loop the blank spaces are progressively substituted by another character and the position of the first blank is computed again. When the index is beyond the 127th character, the loop breaks and we retain the position of the previous blank space (stored in the variable 'pre').

After that we can simply rebuild the original line using substrings. This assumes that no line is longer than 256 characters, since each line is split only once. Another assumption is that every line has at least one space within the first 127 characters. If my assumptions are correct, maybe this is what you're looking for.
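
If the 256-character assumption does not hold, the same substring idea can be repeated until the remainder fits. The following is a minimal sketch under that reading, not a tested replacement for the script above. The function name wrap_cdl and the optional max argument are invented for the example, and a line without a usable blank is simply left as it is.

```shell
# wrap_cdl: hypothetical helper name. Reads stdin, writes to stdout.
# Usage: wrap_cdl [max]   (max defaults to 128)
wrap_cdl() {
  awk -v max="${1:-128}" '
  {
    line = $0
    while (length(line) > max) {
      # Scan backwards for the last blank that still leaves room for "/".
      # Stopping at column 3 skips the blank right after a leading "+",
      # so every pass shortens the remainder and the loop terminates.
      cut = 0
      for (i = max - 1; i > 2; i--)
        if (substr(line, i, 1) == " ") { cut = i; break }
      if (cut == 0) break               # no usable blank: leave the rest as is
      print substr(line, 1, cut) "/"    # piece ends with the blank plus "/"
      line = "+" substr(line, cut)      # remainder starts with "+" and the blank
    }
    print line
  }'
}
```

Run with a limit of 12, the sample line from the first post ("this line is too long") comes out as "this line /", "+ is too /" and "+ long". Since only substr() and plain string comparison are involved, special characters in the netlist are never interpreted as part of a regexp.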
 
05-25-2010, 04:07 PM   #4
sharky
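Got this to work in Perl. The elements I chose were arbitrary and worked for my particular case.

Code:
open(CDL, "<", "netlist.cdl") or die "Cannot open netlist.cdl: $!";
while ($line = <CDL>)
{
    if (length($line) > 120)
    {
        @list = split(/ /, $line);
        $list[6] = "$list[6]\n";    # end the first piece after the 7th term
        $list[7] = "+ $list[7]";    # start the continuation with "+"
        $line = "@list";            # rejoin the terms with single blanks
    }
    print $line;
}
close CDL;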
colucix's example looks like a better general solution.

Thanks,
 
05-25-2010, 11:47 PM   #5
grail
I know this is already solved, but colucix inspired me to try to come up with a solution that works irrespective of length (i.e. for lines that may be longer than 256 characters):
Code:
awk 'length > 128{for(i=int(length/125);i>0;i--){match(substr($0,0,i*125),/.* /);sub(substr($0,0,RLENGTH),"&/\n+")}}1' input
 
05-26-2010, 03:39 PM   #6
colucix
Code:
sub(substr($0,0,RLENGTH),"&/\n+")
Sorry grail, nice idea, but it doesn't work in some cases. The reason is clearly explained in the GNU awk user's guide (in the description of the sub function):
Quote:
if the regexp is not a regexp constant, it is converted into a string, and then the value of that string is treated as the regexp to match.
This means that if the substring contains a special character, it will be interpreted according to the regexp rules and the substring will probably not match. For example, a single plus sign can change the meaning of the expression so that the string no longer matches.
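
A small demonstration of the point, using made-up sample text: the string "a+b" is taken by sub() as the regexp a+b, so it does not match itself. The second half shows one possible literal alternative built from index() and substr(). The function name demo_dynamic_regexp is invented for the example.

```shell
# demo_dynamic_regexp: hypothetical helper name; prints two lines.
demo_dynamic_regexp() {
  printf 'a+b c\n' | awk '{
    s = $0; t = $0
    prefix = substr($0, 1, 3)        # "a+b": contains the metacharacter +

    # sub() treats the string as the regexp a+b (one or more "a", then "b"),
    # which does not occur in "a+b c", so nothing is replaced.
    n = sub(prefix, "&/", s)
    print n ": " s

    # Literal alternative: locate the text with index() and rebuild with substr().
    p = index(t, prefix)
    if (p) t = substr(t, 1, p - 1) prefix "/" substr(t, p + length(prefix))
    print t
  }'
}
```

This prints "0: a+b c" (no substitution took place) followed by "a+b/ c" (the literal version did insert the slash).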

Moreover, I have some doubts about decrementing the length of the substring in the match statement. I cannot explain the reasons right now (it's almost late at night here...), but a test demonstrates that there is a shift somewhere:
Code:
$ cat infile
--- --- 10--- --- 20--- --- 30--- --- 40--- --- 50--- --- 60--- --- 70--- --- 80--- --- 90--- ---100--- ---110--- ---120--- ---8-0--- ---140--- ---150--- ---160--- ---170--- ---180--- ---190--- ---200--- ---210--- ---220--- ---230--- ---240--- ---250--- -6---0--- ---270--- ---280--- ---290--- ---300--- ---310--- ---320--- ---330--- ---340--- ---350--- ---360--- ---370--- ---380--- ---390--- ---400--- ---410--- ---420--- ---430--- ---440--- ---450--- ---460--- ---470--- ---480--- ---490--- ---500--- ---510--- ---520
short line
$ awk '
> length > 128 {
>   for (i=int(length/125);i>0;i--){
>     match(substr($0,0,i*125),/.* /)
>     sub(substr($0,0,RLENGTH),"&/\n+")
>   }
> }1' infile | awk '{print length}'
125
122
132  <-- this is the too long piece
122
27
10   <-- this is the length of the short line
It should be related to the fact that the length of $0 changes upon each substitution. It certainly requires further investigation.
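
The length check used in the test above can be narrowed to count only the lines that are still too long, which makes it easy to compare any wrapping attempt before and after. A minimal sketch, with count_long as a made-up name and the limit as an optional argument:

```shell
# count_long: hypothetical helper name. Reads stdin, prints one number.
# Usage: count_long [max]   (max defaults to 128)
count_long() {
  awk -v max="${1:-128}" 'length($0) > max { n++ } END { print n + 0 }'
}
```

Going by the first post, `count_long < netlist.cdl` would be expected to report 512 on the original netlist, and 0 once every long line has been wrapped correctly.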

Another little note (you will hate me... I know): the first character in a string is character number 1, not 0 as it appears in the substr calls.
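
The effect of the start index can be seen with a short made-up string. The result for a start of 0 depends on the awk in use, so no output is claimed for that line here: as far as I know, gawk counts the requested length from the nonexistent position 0 and therefore returns one character fewer, while some other implementations silently treat 0 as 1. The function name demo_substr_start is invented for the example.

```shell
# demo_substr_start: hypothetical helper name; prints two lines.
demo_substr_start() {
  awk 'BEGIN {
    s = "abcdef"
    print "start 1: " substr(s, 1, 3)   # positions 1..3
    print "start 0: " substr(s, 0, 3)   # result depends on the awk in use
  }'
}
```

The first line is always "start 1: abc". Starting at 1 everywhere avoids having to know which behaviour the local awk implements.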

I hope you will not be disappointed by this. Actually, I was intrigued by your solution and just ran some tests to see it at work.
 
05-26-2010, 07:15 PM   #7
grail
Quote:
I hope you will not be disappointed by this.
Can't get better if I don't know my mistakes.
Quote:
Another little note (you will hate me... I know): the first character in a string is character number 1, not 0 as it appears in the substr calls.
I always forget this one; I keep thinking of it like arrays in C.
Quote:
It should be related to the fact that the length of $0 changes upon each substitution.
I used the reverse loop to allow for the fact that the length is changing, and because adding the newline caused issues going forwards (I will look into it further).

Quote:
This means that if the substring contains a special character, it will be interpreted according to the regexp rules and the substring will probably not match. For example, a single plus sign can change the meaning of the expression so that the string no longer matches.
And that bit just sucks. It shows I had not read the manual closely enough. There is valuable info in this one that I will have to try to remember.

Back to the drawing board
 
  

