I have a CDL netlist with 5630 lines. 512 of the lines are over 128 characters. The tool I am using to read in the CDL returns an error for each line over 128 characters.
If a line is too long I can fix it by inserting a line continuation symbol (in this case a "/") somewhere before the 128th character, followed by a line feed and a "+" at the start of the continuation.
Example (pretend it's a long line):
Before:
this line is too long
After:
this line /
+ is too long
Part of the problem is that I can't use a constant point prior to the 128th character because I can't break up a term.
Bad:
this line i /
+ s too long
If I can replace the last space before the 128th character with " / \n+ " on all lines that are over 128 characters then I'm golden. I'm not sure if I need to escape the + or not. If so then the substitution is " / \n\+ ". And if I use sed then I'll escape the \.
I'll be digging through awk and sed references, but if someone has an answer then please save me the work. :)
I've started a perl script but I'm not making much progress. I open the file, split each line by whitespace into an @list, and somewhere in that array I'll append " /" to $list[#] and prepend "+ " to $list[#+1]. Then I need to write that array back to $line.
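The split-into-words idea described above can be sketched roughly as follows. This is only an illustration, written in awk (called from a shell function) rather than perl; the function name wrap_words and its optional width argument are made up for the example. It assumes that runs of blanks may be collapsed to a single space and that no single term is longer than the limit.

```shell
# Hypothetical helper: wrap_words [MAX] < netlist.cdl
wrap_words() {
  awk -v max="${1:-128}" '
  length($0) <= max { print; next }       # short lines pass through untouched
  {
    n = split($0, w, " ")                 # w[1..n] are the blank-separated terms
    out = w[1]
    for (i = 2; i <= n; i++) {
      # +1 for the joining blank, +2 to leave room for a trailing " /"
      if (length(out) + 1 + length(w[i]) + 2 > max) {
        print out " /"                    # close the current physical line
        out = "+ " w[i]                   # start the continuation line
      } else {
        out = out " " w[i]
      }
    }
    print out
  }'
}
```

Because whole terms are appended one at a time, a term is never broken in the middle. The room for " /" is reserved even on the last piece, which makes the wrapping slightly conservative.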
Yes, the idea is good. Moreover, you don't need to split the line if there is a function that gives the position of the blank spaces. Since I'm not clever with perl, here is a possible solution in awk:
Code:
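# Only lines longer than 128 characters need a continuation
length > 128 {
    string = $0
    p = match(string, / /)          # position of the first blank
    while ( p ) {
        pre = p                     # remember the last blank seen so far
        sub(/ /, "_", string)       # mask it so match() finds the next one
        p = match(string, / /)
        if ( p > 127 ) break        # the next blank is too far right: stop
    }
    # Keep the blank, add "/", and start the continuation line with "+"
    $0 = ( substr($0, 1, pre) "/\n+" substr($0, pre) )
}
1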
Here the match function gives the index of the first character of the matching substring (in this case the position of the leftmost single blank space). Inside the loop the blank spaces are progressively substituted by another character and the position of the first blank is computed again. When the index is beyond the 127th character, the loop breaks and we retain the position of the previous blank space (stored in the variable 'pre').
After that we can simply rebuild the original line using substrings. This assumes that no line is longer than 256 characters, since each line is split only once. Another assumption is that every line has at least one space within the first 127 characters. If my assumptions are correct, maybe this is what you're looking for.
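If either assumption turns out not to hold, a variant along these lines might help. It is a sketch, not a tested replacement for the script above: the name wrap_long and the width argument are invented here, it splits a line as many times as needed, and it leaves a piece untouched when it finds no usable blank.

```shell
# Hypothetical helper: wrap_long [MAX] < netlist.cdl
wrap_long() {
  awk -v max="${1:-128}" '
  {
    rest = $0
    while (length(rest) > max) {
      # Walk back from column max-1 to the nearest blank, leaving room for "/"
      p = max - 1
      while (p > 2 && substr(rest, p, 1) != " ") p--
      if (p <= 2) break                   # no usable blank: leave the rest as is
      print substr(rest, 1, p) "/"        # keep the blank, then add "/"
      rest = "+" substr(rest, p)          # continuation begins with "+ "
    }
    print rest
  }'
}
```

The search stops at column 3 so that the blank right after a leading "+" is never chosen as a break point, which would otherwise loop forever on a piece that contains no other blank.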
Sorry grail, nice idea but it doesn't work in some cases. The reason is clearly explained in the GNU awk user's guide (in the description of the sub function):
Quote:
if the regexp is not a regexp constant, it is converted into a string, and then the value of that string is treated as the regexp to match.
This means that if you have some special character in the substring, it will be treated according to the regexp rules and the substring will probably not match. For example, just a little plus sign can change the meaning of the expression, and the string will not match anymore.
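A tiny illustration of the point, with made-up sample text: the same string is looked up once as a dynamic regexp with match() and once as plain text with index(). The function name demo_literal is only there to make the snippet callable.

```shell
# Hypothetical demo: a "+" inside a string used as a regexp
demo_literal() {
  awk '
  BEGIN {
    line = "a+b c"
    head = "a+b"                      # literal text we want to find

    # As a dynamic regexp, "a+b" means "one or more a, then b": no match here.
    print match(line, head)

    # index() compares plain characters, so the "+" is not special.
    p = index(line, head)
    print p

    # Insert the continuation after the literal text using substr() only.
    print substr(line, 1, p + length(head) - 1) " /\n+" substr(line, p + length(head))
  }'
}
```

Working with index() and substr() avoids having to escape anything in the netlist text.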
Moreover, I have some doubts about the decrement of the length of the substring in the match statement. I cannot explain the reasons right now (it's late at night here...), but a test demonstrates that there is a shift somewhere:
Code:
$ cat infile
--- --- 10--- --- 20--- --- 30--- --- 40--- --- 50--- --- 60--- --- 70--- --- 80--- --- 90--- ---100--- ---110--- ---120--- ---8-0--- ---140--- ---150--- ---160--- ---170--- ---180--- ---190--- ---200--- ---210--- ---220--- ---230--- ---240--- ---250--- -6---0--- ---270--- ---280--- ---290--- ---300--- ---310--- ---320--- ---330--- ---340--- ---350--- ---360--- ---370--- ---380--- ---390--- ---400--- ---410--- ---420--- ---430--- ---440--- ---450--- ---460--- ---470--- ---480--- ---490--- ---500--- ---510--- ---520
short line
$ awk '
> length > 128 {
> for (i=int(length/125);i>0;i--){
> match(substr($0,0,i*125),/.* /)
> sub(substr($0,0,RLENGTH),"&/\n+")
> }
> }1' infile | awk '{print length}'
125
122
132 <-- this is the too long piece
122
27
10 <-- this is the length of the short line
It should be related to the fact that the length of $0 changes upon each substitution. It certainly requires further investigation.
Another little note (you will hate me... I know) is that the first character in a string is the character number 1, not 0 as it appears in the substring statements.
I hope you will not be disappointed by this. Actually, I was intrigued by your solution and I just ran some tests to see how it works.
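For reference, a quick check of the 1-based indexing on a made-up string. The name demo_substr is hypothetical. The sketch deliberately avoids a start index of 0, which I would not rely on.

```shell
# Hypothetical demo: awk string positions start at 1
demo_substr() {
  awk 'BEGIN {
    s = "abcdef"
    print substr(s, 1, 3)     # first three characters
    print index(s, "a")       # position of the first character
    print substr(s, 4)        # from the fourth character to the end
  }'
}
```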
Quote:
Another little note (you will hate me... I know) is that the first character in a string is the character number 1, not 0 as it appears in the substring statements.
I always forget this one; I keep thinking of it like arrays in C.
Quote:
it should be related to the fact that the length of $0 changes upon each substitution
I did the reverse loop to allow for the fact that the length is changing, and because adding the newline caused issues going forwards (I will look into it further).
Quote:
This means that if you have some special character in the substring, it will be treated with the regexp rules and doubtfully the substring will match. For example just a little plus sign can change the meaning of the expression and the string will not match anymore.
And that bit just sux. It shows I had not read the manual closely enough. Valuable info in this one for me that I will have to try and remember.