I have a question so easy that I'm embarrassed to write it up as a post on this forum. I want sed to substitute the newline character with the tab character on any non-blank line. Feeling like that was WELL within my admittedly beginner level bash scripting abilities, I typed
The first regexp tells sed to look for blank lines, in which case it is instructed NOT to replace \n with \t. I really wish the problem were more complicated than this, but it's not. It seems like it should be relatively easy and straightforward; however, the above command simply doesn't perform as expected: it literally does not affect the output. Any thoughts?
As an addendum, I really do try to solve these for myself before coming to this forum; and it pains me to come to you guys with such a simple problem, but I can't figure it out.
I want sed to substitute the newline character with the tab character on any non-blank line.
This problem statement is not entirely clear. Perhaps English is not OP's first language, and for that we should make allowances. For better comprehension, I reword the problem as follows:
"I want to substitute a tab character for the newline character on any non-blank line."
If this interpretation is correct, all blank lines in an input file should be left unchanged. Let's remember that a "blank line" could consist of zero or more blank characters.
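To make the "zero or more blank characters" point concrete, here is a minimal sketch comparing two ways of recognising a blank line. The helper names (count_empty, count_blank, sample) are made up for illustration and do not come from the thread; the only assumption is a grep that supports POSIX character classes.

```shell
# Hypothetical helpers, for illustration only.

# Counts lines that contain no characters at all.
count_empty() { grep -c '^$'; }

# Counts lines made of zero or more spaces/tabs and nothing else.
count_blank() { grep -c '^[[:blank:]]*$'; }

# Five lines: text, empty, two spaces, one tab, text with trailing blanks.
sample() { printf 'text\n\n  \n\t\nmore text  \n'; }

sample | count_empty   # counts only the truly empty line
sample | count_blank   # also counts the whitespace-only lines
```

A test such as /^$/ therefore misses lines that look blank but contain spaces or tabs, which matters when deciding which lines to leave unchanged.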
I learn by reading forum posts and solving when I can. I also test proposed solutions offered by others. This is a test file constructed for the purpose:
This is a non-blank line with no trailing blanks.
This is a non-blank line with two trailing blanks.
The following line is blank.
The preceding line was blank.
The following THREE lines are blank.
The preceding THREE lines were blank.
This is the last line in the file.
Using this test file, all previous solutions fail.
$ echo -e "a\nb\nc\n\nd e\n\n\nf"|sed ':a /^$/!N;s/\n/\t/; ta'
a b c d e f
I think the problem is that the /^$/ test is applied to the entire pattern space, not just to the newly appended line. So, take a four-line file containing a, b, an empty line, and c, and feed it to sed:
1) sed reads the first line: a
2) /^$/ will not match, so !N will append next line to the pattern space.
3) The contents of the pattern space is now "a\nb".
4) s/\n/\t/ will replace the \n with \t and ta will return the flow back to the beginning of the expression.
5) /^$/ will not match again, next line (blank) is read and appended (with a newline) to the pattern space.
6) The pattern space now contains "a\tb\n" and the substitution will again succeed.
7) /^$/ will not match (it never will unless the first line is blank), the last line is read and appended to the pattern space, and the newline gets substituted.
8) sed reaches the end of input and the entire pattern space is sent to stdout: a\tb\t\tc\n
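The steps above can be reproduced with a small sketch. It assumes GNU sed, which prints the pattern space when N runs out of input (other implementations may exit without printing). The function names are made up for illustration, and a literal tab is passed in through a shell variable instead of relying on \t in the replacement.

```shell
# Hypothetical helpers, for illustration only; assumes GNU sed.
tab=$(printf '\t')

# The four-line file from the walkthrough: a, b, an empty line, c.
sample() { printf 'a\nb\n\nc\n'; }

# Same logic as the command under discussion, split into -e pieces.
join_all() {
    sed -e ':a' -e '/^$/!N' -e "s/\n/${tab}/" -e 'ta'
}

# Same loop, but 'l' dumps the pattern space right after each N,
# and -n suppresses the final automatic print.
trace_join() {
    sed -n -e ':a' -e '/^$/!N' -e 'l' -e "s/\n/${tab}/" -e 'ta'
}

sample | join_all     # one line: a<TAB>b<TAB><TAB>c
sample | trace_join   # a\nb$  then  a\tb\n$  then  a\tb\t\nc$
```

The trace shows that the pattern space is never empty after the first N, so /^$/ has nothing to match and every newline ends up replaced.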
It reads the whole file into the hold space and then substitutes tabs for newlines. The red part keeps the number of blank lines unchanged; without it, that number is decreased by 1 (which I believe is what OP wanted, because otherwise there is no way to get output without blank lines).
Last edited by firstfire; 12-05-2012 at 01:16 PM.
Reason: Fixed solution.
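For readers who want to experiment with the hold-space idea, here is a minimal sketch. It is not the command from the post above, which is not reproduced here; it is one possible reading in which a newline between two non-blank lines becomes a tab and blank lines are kept as they are. The function name join_runs is made up for illustration.

```shell
# Hypothetical helper, for illustration only.
tab=$(printf '\t')

join_runs() {
    # H     : append every input line to the hold space
    # $!d   : on every line but the last, start the next cycle
    # x     : on the last line, fetch the accumulated text
    # s/^\n// : drop the leading newline that the first H added
    # loop  : replace a newline with a tab only when both the line
    #         before it and the line after it contain a non-blank character
    sed -n -e 'H' -e '$!d' -e 'x' -e 's/^\n//' \
        -e ':a' \
        -e "s/\([^[:space:]][[:blank:]]*\)\n\([[:blank:]]*[^[:space:]]\)/\1${tab}\2/" \
        -e 'ta' -e 'p'
}

# Usage: the input from the earlier echo example.
printf 'a\nb\nc\n\nd e\n\n\nf\n' | join_runs
```

With this reading, a, b and c end up tab-separated on one line, while the blank lines and the lines between them stay where they were. Whether this matches what OP wanted depends on how the problem statement is interpreted.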