LinuxQuestions.org


muzzol 02-08-2012 07:17 AM

sed: remove newline except when it's a blank line
 
Hi,

I have a text like this:

Code:

one
two
three

four
five

six
seven
eight

I want to remove the newlines (\n), except where a line is blank (^$). I've tried several examples without any luck.

David the H. 02-08-2012 09:34 AM

What exactly have you tried so far?

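Anyway, there may be better solutions, but I got it to work this way. You use "N" to append the next line to the one in the buffer, then replace the newline between them with a space, but only if there is text on both sides of it. The whole thing is wrapped in a "t" loop so that it repeats as long as the substitution succeeds. When a blank line is appended the substitution fails, the loop ends, and the joined text is printed, followed by the blank line. (The -r option, for extended regular expressions, is GNU sed syntax.)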

Code:

sed -r ':a; N ;s/(.+)\n(.+)/\1 \2/; ta' file

Here are a few useful sed references:
http://www.grymoire.com/Unix/Sed.html
http://sed.sourceforge.net/grabbag/
http://sed.sourceforge.net/sedfaq.html
http://sed.sourceforge.net/sed1line.txt
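
As a rough illustration, here is the same one-liner spread over several lines and run against a copy of the sample text. It assumes GNU sed (for -r, and for printing the buffer when N runs out of input). The names join_blocks, sample and joined are made up for this sketch.

```shell
# Sketch only: assumes GNU sed. join_blocks is a hypothetical helper name.
join_blocks() {
  sed -r '
    # top of the loop
    :a
    # append the next input line to the buffer, separated by a newline
    N
    # a newline with text on both sides becomes a space
    s/(.+)\n(.+)/\1 \2/
    # if the substitution succeeded, go back to the label
    ta
  ' "$@"
}

sample=$(printf 'one\ntwo\nthree\n\nfour\nfive\n\nsix\nseven\neight')
joined=$(printf '%s\n' "$sample" | join_blocks)
printf '%s\n' "$joined"
# prints:
# one two three
#
# four five
#
# six seven eight
```

Note that a block which directly follows two or more blank lines is not joined by this expression, because the cycle then starts on a blank line.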

muzzol 02-08-2012 10:41 AM

I was trying something like "if the line is blank, do nothing, else remove newlines":

Code:

sed '/^$/! s/\n//g'

colucix 02-08-2012 10:49 AM

Quote:

Originally Posted by muzzol (Post 4597092)
I was trying something like "if the line is blank, do nothing, else remove newlines":

sed strips the trailing newline before it puts a line into the pattern space, so there is no \n for your s command to match, unless you join multiple lines together with the N command, as David the H. demonstrated above.
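
A quick way to see this, sketched with made-up variable names: run the attempted command and compare its output with the input. Since no line ever carries a newline into the s command, nothing changes.

```shell
# Sketch: the attempted command leaves the text untouched, because each line
# reaches the s command without its trailing newline.
sample=$(printf 'one\ntwo\n\nthree')
attempt=$(printf '%s\n' "$sample" | sed '/^$/! s/\n//g')
if [ "$attempt" = "$sample" ]; then
  echo "output is identical to the input"
fi
```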

David the H. 02-11-2012 05:37 AM

As colucix said. sed commands are always applied to the current contents of the pattern buffer. One line at a time is read into this buffer (minus the delimiting newline), the commands in the expression are applied to it, and the modified contents are printed out (unless told not to). Then the contents are cleared and replaced with the next line. Addresses and address ranges select which lines a given command is applied to, but don't otherwise affect the basic processing sequence.
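
The cycle described here can be made visible with the l command, which prints the buffer unambiguously and marks its end with a $. This is only a small sketch; show_buffer and cycles are hypothetical names.

```shell
# Sketch: -n turns off the automatic print, and l shows the buffer contents
# once per cycle, with $ marking where the buffer ends.
show_buffer() { sed -n 'l' "$@"; }

cycles=$(printf 'one\ntwo\n' | show_buffer)
printf '%s\n' "$cycles"
# prints:
# one$
# two$
```

Each input line shows up on its own, and no \n appears before the $.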

In order to operate on multiple lines at once, and on the newline characters between them, you have to tell sed to store more than one line in the pattern buffer at a time. This is accomplished with the N command, and/or use of the hold buffer. "N" tells sed to append the next line of text to the current buffer, separated by a newline. Only after lines have been combined like this is there actually a newline in the buffer to target.
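
The effect of N can be checked the same way, again as a small sketch with a made-up variable name: after N the buffer holds two lines, and l shows the embedded newline as \n.

```shell
# Sketch: N appends the second line to the first, so the buffer now
# contains a newline that a later s command could match.
pair=$(printf 'one\ntwo\n' | sed -n 'N;l')
printf '%s\n' "$pair"
# prints: one\ntwo$
```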

In this specific case you also need some kind of loop or hold buffer action to keep appending lines until a blank line is reached; otherwise sed would only process a single newline, then empty the buffer for the next line.
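
To illustrate the difference, here is a sketch that contrasts a single N with a looped one. The looped expression is a variant written for this example, not the one posted earlier in the thread: it only starts on non-blank lines and uses $!N so that the last line is not followed by another append. The function and variable names are hypothetical.

```shell
# Without a loop: each cycle joins one pair of lines and then starts over.
join_pairs() { sed 'N; s/\n/ /' "$@"; }

# With a loop: keep appending and joining until the appended line is blank.
# Only non-blank lines enter the loop; $!N skips the append on the last line.
join_until_blank() {
  sed -e '/./{' -e ':a' -e '$!N' -e 's/\n\(.\)/ \1/' -e 'ta' -e '}' "$@"
}

pairs=$(printf 'one\ntwo\nthree\nfour\n' | join_pairs)
looped=$(printf 'one\ntwo\nthree\n\nfour\nfive\n' | join_until_blank)

printf '%s\n' "$pairs"
# prints:
# one two
# three four

printf '%s\n' "$looped"
# prints:
# one two three
#
# four five
```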

The hold buffer is a separate bit of swap space that can be used for temporary text storage, and there are various commands for appending to/swapping out text with the pattern buffer. sed expressions can get horribly complex with it, and trying to keep straight exactly what it's doing always makes my head hurt.
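
For comparison, a hold buffer version of the same task might look roughly like this. It is a sketch rather than a tested recipe for every sed; the name join_with_hold is made up, and the idea is simply to collect each block in the hold buffer and flush it when a blank line or the end of input arrives.

```shell
# Sketch: collect non-blank lines in the hold buffer, flush on a blank line
# or at the end of input. join_with_hold is a hypothetical helper name.
join_with_hold() {
  sed -n '
    # non-blank line: append it to the hold buffer and, unless it is the
    # last line, start the next cycle
    /./{
      H
      $!d
    }
    # blank line or end of input: swap the collected block into the buffer
    x
    # if a block was collected, drop its leading newline, join it, print it
    /./{
      s/^\n//
      s/\n/ /g
      p
    }
    # swap back to the current line; if it is blank, print it and copy it
    # over the hold buffer, which empties the hold buffer
    x
    /^$/{
      p
      h
    }
  ' "$@"
}

held=$(printf 'one\ntwo\nthree\n\nfour\nfive\n\nsix\nseven\neight\n' | join_with_hold)
printf '%s\n' "$held"
# prints:
# one two three
#
# four five
#
# six seven eight
```

Compared with the N loop, this needs more commands, which is a fair picture of how quickly hold buffer scripts grow.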

Read the grymoire link I gave earlier for more details on how it works.

grail 02-11-2012 07:16 AM

You could let awk at it too:
Code:

# RT (the text that ended the current record) is specific to GNU awk
awk '{ ORS = NF ? "" : RT "\n" } 1' file
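
If GNU awk is not available, a possible alternative is awk's paragraph mode, sketched below with a made-up function name. Setting RS to the empty string makes each blank-line-separated block one record, and assigning $1 to itself rebuilds the record with single spaces between the fields. As a side effect, runs of blank lines collapse into one and repeated spaces inside a line are squeezed.

```shell
# Sketch: POSIX awk paragraph mode. join_paragraphs is a hypothetical name.
join_paragraphs() {
  awk 'BEGIN { RS = "" }        # blank lines separate records
       NR > 1 { print "" }      # one blank line between output blocks
       { $1 = $1; print }       # rebuild the record, fields joined by spaces
  ' "$@"
}

paragraphs=$(printf 'one\ntwo\nthree\n\nfour\nfive\n\nsix\nseven\neight\n' | join_paragraphs)
printf '%s\n' "$paragraphs"
# prints:
# one two three
#
# four five
#
# six seven eight
```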

muzzol 02-12-2012 02:52 PM

Quote:

Originally Posted by David the H. (Post 4599613)
Read the grymoire link I gave earlier for more details on how it works.

Thanks for your useful explanation.


All times are GMT -5.