LinuxQuestions.org
Old 03-09-2017, 10:23 AM   #1
vincix
Senior Member
 
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240

Next in sed \n vs \n*


cat script_test
Code:
N
s/ *\n/ /
cat operator2
Code:
Consult Section 3.1 in the Owner and Operator
Guide for a description of the tape drives
available on your system.

Look in the Owner and Operator Guide shipped with your system.

Two manuals are provided including the Owner and
Operator Guide and the User Guide.

The Owner and Operator Guide is shipped with your system.
sed -f script_test operator2
Code:
Consult Section 3.1 in the Owner and Operator Guide for a description of the tape drives
available on your system.
Look in the Owner and Operator Guide shipped with your system.
Two manuals are provided including the Owner and Operator Guide and the User Guide.
 The Owner and Operator Guide is shipped with your system.
But if I change script_test to:
Code:
N
s/ *\n*/ /
Then when I run sed -f script_test operator2:
Code:
 Consult Section 3.1 in the Owner and Operator
Guide for a description of the tape drives
 available on your system.

 Look in the Owner and Operator Guide shipped with your system.

 Two manuals are provided including the Owner and
Operator Guide and the User Guide.
 The Owner and Operator Guide is shipped with your system.
This behaviour doesn't make much sense to me. The point was, I guess, to see if I can convert it all to one line (I don't need another solution other than sed, I simply want to understand how sed works in this case).

Last edited by vincix; 03-09-2017 at 10:27 AM.
 
Old 03-09-2017, 11:31 AM   #2
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,794

The regex " *\n*" (zero or more spaces followed by zero or more newlines) can match the null string, and the leftmost place it can do so is the very beginning of the pattern space. The command then replaces that null string with a space. What you would want is " *\n\+" (zero or more spaces followed by one or more newlines; "\+" is a GNU sed extension). But, since you have only two lines in the pattern space (just one "N" command), there can never be more than one embedded newline to replace. If you want to convert the whole file to one line, you first have to append all the lines and then make your substitution global.

(No, I don't feel like working out the details of doing that just now.)
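
A quick way to check the difference between the two patterns is to run both on a two-line sample. This is only a sketch: the sample text is made up for illustration, and it assumes GNU sed, where "\n" in the regex matches the embedded newline.

```shell
# ' *\n*' can match the empty string, so the leftmost match is at position 0:
# a space is inserted in front of the first line and the newline is untouched.
star=$(printf 'Owner and\nOperator Guide\n' | sed 'N;s/ *\n*/ /')

# ' *\n' needs a real newline, so the two lines in the pattern space get joined.
plain=$(printf 'Owner and\nOperator Guide\n' | sed 'N;s/ *\n/ /')

# Brackets make the leading space visible.
printf '[%s]\n' "$star" "$plain"
```

The first result keeps both lines and only gains a leading space; the second is the joined line.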
 
Old 03-10-2017, 04:25 AM   #3
vincix
Senior Member
 
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240

Original Poster
I don't understand the 'two lines in the pattern space' part and how this is determined by the fact that there's only one "N". Can you expand on that?
 
Old 03-10-2017, 05:30 AM   #4
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 23,010

N means: append the next line of input to the pattern space.
So if there was only one N, only one line was appended to the pattern space, and therefore you have (only) two lines in the pattern space.
 
Old 03-10-2017, 05:33 AM   #5
vincix
Senior Member
 
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240

Original Poster
So basically what N does is to make the newline manipulable so that you can change \n however you wish to, right? Normally, \n is ignored by sed, is that it?
 
Old 03-10-2017, 06:24 AM   #6
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 23,010

No, "normally" the \n is not part of the line: sed strips it when it reads the line, so it is not "available" in the pattern space. The command N joins lines (= appends the next line of input to the pattern space) and adds a \n in between.
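
To see this for yourself, try the same substitution with and without N. A minimal sketch, assuming GNU sed; the two-line input and the "+" replacement are just for illustration.

```shell
# Without N the pattern space holds a single line with its newline already
# stripped, so there is no \n for the regex to find and nothing changes.
without_n=$(printf 'one\ntwo\n' | sed 's/\n/+/')

# With N the pattern space is 'one\ntwo', so the embedded \n can be replaced.
with_n=$(printf 'one\ntwo\n' | sed 'N;s/\n/+/')

printf '%s\n' "$without_n" "---" "$with_n"
```

The first command prints the two lines unchanged; the second prints them joined with a "+".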
 
Old 03-10-2017, 07:23 AM   #7
vincix
Senior Member
 
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240

Original Poster
I've been struggling to understand this goddamn mechanism for some time now. I read several definitions and simply couldn't picture it. I might be mildly dyslexic or something, as I like to believe that I've understood more difficult things (Linux-related and so on). I needed a visual cue of some sort. Anyway, your explanation made it much clearer. I'll tinker with it a little and will probably get back with some questions regarding the main subject.
 
Old 03-10-2017, 10:48 AM   #8
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 23,010

ok, fine, we are waiting for the next one....
 
Old 03-11-2017, 04:41 AM   #9
Jjanel
Member
 
Registered: Jun 2016
Distribution: any&all, in VBox; Ol'UnixCLI; NO GUI resources
Posts: 999
Blog Entries: 12

man fmt ? YES! That's IT!!! -w

I spent several DAYS trying [successfully!] to 'grok' N \n !
Here's something: (t and b are branch commands, i.e. goto :a. OMyGosh! [G-rated!])

seq 6 | sed -e :a -e '$!N; s/\n/ /; ta' ### OR: man >fmt< ? YES!
seq 6 | sed ':a;N;$!ba;s/\n/ /g' ### here with a 1044rating wow!
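
For anyone puzzling over those two one-liners, here they are again split into one -e per command with comments. This is a sketch of how I read them, assuming GNU sed and seq; the variable names are only for illustration.

```shell
# Form 1: collect everything first, substitute once at the end.
#   :a        define label a
#   N         append the next input line, separated by \n
#   $!ba      if this is not the last line, branch (goto) back to a
#   s/\n/ /g  on the last line, turn every embedded newline into a space
all_at_end=$(seq 6 | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g')

# Form 2: join as you go.
#   $!N       append the next line unless already on the last one
#   s/\n/ /   replace the single newline that N just added
#   ta        if that substitution succeeded, branch back to a
as_you_go=$(seq 6 | sed -e ':a' -e '$!N' -e 's/\n/ /' -e 'ta')

printf '%s\n' "$all_at_end" "$as_you_go"
```

Both forms should print the six numbers on one line; the first keeps the whole input in the pattern space before substituting, the second substitutes after every appended line.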

Best wishes! 'Take your time!' (I do, savoring these puzzles!)

Last edited by Jjanel; 03-11-2017 at 05:15 AM.
 
Old 03-11-2017, 05:10 AM   #10
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 23,010

Actually, sed handles \n as a special character, so if you want to avoid that it is better to use another tool (for example perl, awk, python, tr, whatever).
 
Old 03-11-2017, 09:54 AM   #11
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,794

Quote:
Originally Posted by Jjanel View Post
I spent several DAYS trying [successfully!] to 'grok' N \n !
You're just going about it the hard way. Joining all the lines in sed is actually pretty easy:
Code:
sed -n 'H;${x;s/\n\+/ /g;p}'
Translation: Append each input line to the hold space. On the last line, exchange the hold space and the pattern space, then replace every occurrence of one or more newline characters with a single space and print the result.
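
One detail worth knowing when trying this: H always puts a newline in front of what it appends, even when the hold space is still empty, so the collected text starts with a newline and the result gets a leading space. The sketch below shows that, plus a possible variant (my assumption, not part of the command above) that copies the first line with h instead. It assumes GNU sed, since "\+" is a GNU extension; the sample input is made up.

```shell
# Command as posted: the hold space ends up as '\na\n\nb\nc', so the first
# run of newlines becomes a leading space.
orig=$(printf 'a\n\nb\nc\n' | sed -n 'H;${x;s/\n\+/ /g;p}')

# Variant: line 1 is copied to the hold space (h), later lines are appended (H),
# so the collected text does not start with a newline.
variant=$(printf 'a\n\nb\nc\n' | sed -n '1h;1!H;${x;s/\n\+/ /g;p}')

# Brackets make the leading space visible.
printf '[%s]\n' "$orig" "$variant"
```

In both cases the blank line in the input collapses into a single space because of the "one or more newlines" pattern.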
 
Old 03-11-2017, 02:28 PM   #12
vincix
Senior Member
 
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240

Original Poster
I'm happy that this thread is the cause of such fruitful endeavour, but I am going to 'take it down a notch' and ask you something much more trivial.
script_test:
Quote:
N
s/\n*/[*]/
operator3:
Code:
first line
second line
third line
fourth line
fifth line
secs
sieben
acht
neun
zehn
sed -f script_test operator3:
Code:
[*]first line
second line
[*]third line
fourth line
[*]fifth line
secs
[*]sieben
acht
[*]neun
zehn
I understand that if I use it with a simple \n (without *), it takes every two lines and joins them with whatever you tell sed to substitute \n with. So in this case it matches the null string at the beginning of every two lines. But why does it do this every two lines exactly?
For instance, what does the second line have to do with the first line, in order for sed to ignore it and not prefix a "[*]"? I don't understand the exact logic (I hope I was articulate enough).
 
Old 03-11-2017, 05:42 PM   #13
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,794

It's not performing the test on every line. It matches the null string at the start of the pattern buffer, which, due to the "N" command, contains more than a single line. The newline character actually isn't special to sed, except insofar as it delimits, and is stripped from, the lines as read, and gets inserted as a separator when lines are appended to the pattern buffer or hold buffer.
 
Old 03-11-2017, 05:48 PM   #14
GazL
LQ Veteran
 
Registered: May 2008
Posts: 7,053

Let's dry-run it.

Say we have the input file:
Code:
one
two
three
four
five
six
And we're running sed -e 'N;s/\n*/(*)/' (note: I had to change the square brackets because they were messing up the bbcode in the forum.)

So, sed starts and implicitly reads the next line of the input file into the pattern buffer.
The buffer will now contain: 'one' (note: no terminating '\n' on the buffer)

Next, it does the 'N' operation, which appends the next line of input onto the end of the buffer. The buffer now contains: 'one\ntwo' (sed inserts a '\n' between the two lines in the buffer, but there's still no terminating '\n' at the end of the buffer).

Next, it does the 's/\n*/(*)/', which instructs it to replace the first occurrence of zero or more '\n' characters with the literal string '(*)'. The first occurrence of zero '\n' characters is at the beginning of the buffer, so the buffer now contains: '(*)one\ntwo'

It now gets to the end of the script and does an implicit print of the buffer. It prints '(*)one\ntwo' to the terminal, which displays as
Code:
(*)one
two
It now clears the pattern buffer, does another implicit read of the next line, loading 'three' into the buffer, and starts again from the 'N' operation, just as before.

Rinse and repeat.
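
If you want sed itself to show you the buffer at each step of the dry run, the 'l' command prints the pattern buffer unambiguously (an embedded newline shows as \n and the end of the buffer as $). A rough sketch, assuming GNU sed; -n turns off the implicit print so only the 'l' output appears, and the four-line input is shortened from the example above.

```shell
# l  -> buffer right after the implicit read
# N  -> append the next line
# l  -> buffer after N
# s  -> the substitution under discussion
# l  -> buffer after the substitution
trace=$(printf 'one\ntwo\nthree\nfour\n' | sed -n 'l;N;l;s/\n*/(*)/;l')

printf '%s\n' "$trace"
```

Each cycle produces three trace lines: the single line as read, the two lines joined by \n, and the buffer with (*) inserted at the front.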

I suspect part of your confusion is stemming from the fact you are confusing the meaning of '*' in a regex, with the meaning of '*' in a shell glob. They are quite different.
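
To make that difference concrete, here is a small side-by-side sketch. The words used are arbitrary; the glob half uses a plain shell 'case' statement.

```shell
# Regex: 'a*' means "zero or more of the letter a". It matches the empty
# string at the very start of 'banana', so X is inserted in front.
regex_out=$(printf 'banana\n' | sed 's/a*/X/')

# Glob: 'a*' means "an a followed by anything", so 'banana' does not match
# the pattern but 'apple' does.
glob_out=""
for word in banana apple; do
  case $word in
    a*) glob_out="$glob_out $word:match" ;;
    *)  glob_out="$glob_out $word:no-match" ;;
  esac
done

printf '%s\n' "$regex_out" "$glob_out"
```

In the regex the star only repeats the item before it (and is happy with zero repetitions); in the glob the star stands for "any characters" on its own.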

Some examples:
Code:
test@ws1:~$ cat /tmp/sed.in 
one
two
three
four
five
six
test@ws1:~$ sed -e 'N;s/\n*/(*)/' /tmp/sed.in 
(*)one
two
(*)three
four
(*)five
six
test@ws1:~$ sed -e 'N;s/\n.*/(*)/' /tmp/sed.in 
one(*)
three(*)
five(*)
test@ws1:~$ sed -e 'N;s/.*\n/(*)/' /tmp/sed.in 
(*)two
(*)four
(*)six
test@ws1:~$
I hope that clears it up for you.

Last edited by GazL; 03-11-2017 at 05:54 PM.
 
Old 03-11-2017, 06:03 PM   #15
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,794

Quote:
Originally Posted by GazL View Post
(note: I had to change the square brackets because they were messing up the bbcode in the forum.)
The magic incantation to prevent that when not using [CODE] ... [/CODE] tags is "[NOPARSE] ... [/NOPARSE]".
 
  

