I'm really at a loss trying to understand sed's advanced functions, such as N, D, P, let alone g, G, H, h, etc. I've read quite a lot, but notions such as pattern space and hold space seem so abstract and weird, I really don't get them. Or maybe I do understand them in theory, but when it comes to using N and, especially, P along with D I simply don't understand what is going on. And I've read about it in three different sources, and all seem to explain it as if it's implicit, but it clearly isn't!
I don't even know where to begin, but I guess I should start with the difference between N and n.
Source lynda: "n prints the content of the pattern space, empties the pattern space, and gets the next line into the pattern space.
N doesn’t print the pattern space and it does not empty it; instead it adds a newline character (\n) to the end of the pattern space, then gets the next line and appends it to the pattern space after the newline. After this has been done, the pattern space, which now contains multiple lines, is available for more editing."
In the second definition, does 'append to the pattern space' mean that N prints the pattern space after it appends the new line?
I don't understand what emptying the pattern space implies exactly either. So n prints the pattern space and then it empties it, but N doesn't empty it. But in both situations lines (be they appended with a newline or not) do show in the final output. So I don't understand what the actual difference is.
What references have you read on this and what experiments have you done with the various options of sed? You cited a great deal of sed command options in your opening statement. I do realize it is a large topic, with a variety of answers, which I can see by just searching for information on "sed N".
Therefore a suggestion is to try each command and evaluate what you find, along with the official sed documentation and then inquire where there is confusion. Much like we see in this question, the OP tried a few things and surmised what they think they were seeing. The top answer cited sed documentation and then went on with a lot of detail to explain the use of N and how it would behave.
I therefore would attack learning these extra details in a similar manner. Try some experiments, re-review the documentation, build up some level of fundamental understanding, and then ask more specific questions.
I also have no idea what "source lynda" represents, which is part of one of your descriptions.
I've already read that link, too. I'm missing something basic regarding the way sed moves text around, but never mind. I'll probably have to deal with it myself, in time.
Writing a sed script requires a different mindset, and can be quite bewildering if you've never been exposed to the concept of a Turing machine or perhaps taken a look at the Ook! programming language. Once you get the hang of it, it's not hard to understand and is quite powerful (it's Turing-complete), though that does not make it easy to use for anything non-trivial.
The sed info document ("info sed" in a terminal) is actually well written and quite complete, though navigating in an info document can be quite a challenge itself at first. Quote from the sed info page: sed programs:: => Execution Cycle::
`sed' maintains two data buffers: the active _pattern_ space, and the auxiliary _hold_ space. Both are initially empty.
`sed' operates by performing the following cycle on each line of input: first, `sed' reads one line from the input stream, removes any trailing newline, and places it in the pattern space. Then commands are executed; each command can have an address associated to it: addresses are a kind of condition code, and a command is only executed if the condition is verified before the command is to be executed.
When the end of the script is reached, unless the `-n' option is in use, the contents of pattern space are printed out to the output stream, adding back the trailing newline if it was removed. Then the next cycle starts for the next input line.
Unless special commands (like `D') are used, the pattern space is deleted between two cycles. The hold space, on the other hand, keeps its data between cycles (see commands `h', `H', `x', `g', `G' to move data between both buffers).
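To make the quoted cycle a bit more concrete, here is a small sketch, assuming a POSIX shell with printf and a reasonably standard sed (the variable names and sample input are made up for illustration). It shows the automatic print at the end of each cycle, what -n changes, and the hold space surviving from one cycle to the next.

```shell
# Autoprint: every cycle ends by printing the pattern space,
# so an explicit p makes each line appear twice.
doubled=$(printf 'a\nb\n' | sed 'p')

# With -n the automatic print is off; only the explicit p prints.
once=$(printf 'a\nb\n' | sed -n 'p')

# The hold space keeps its data between cycles:
#   1!G  on every line except the first, append hold space to pattern space
#   h    copy the pattern space into the hold space
#   $p   on the last line, print what has accumulated
reversed=$(printf '1\n2\n3\n' | sed -n '1!G;h;$p')

printf 'sed p:\n%s\n\nsed -n p:\n%s\n\nreversed:\n%s\n' \
    "$doubled" "$once" "$reversed"
```

The last command is the well-known "reverse the lines" one-liner; it only works because the hold space is not cleared between cycles the way the pattern space is.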
The "N" command does not, itself, print the pattern space, but if nothing else happens to the pattern buffer before the end of the script is reached, then the whole buffer, now containing two (or more) lines, is going to be printed anyway. (Note that all automatic printing can be disabled by the "-n" option.)
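One way to see the n versus N difference is to run a command after each of them and check which lines it touches. This is only a sketch with made-up input and variable names; it uses an even number of lines so that N always finds a next line to append.

```shell
# Four input lines, so N always has a next line available.
input=$(printf '1\n2\n3\n4')

# n: print the pattern space, then REPLACE it with the next line.
# The s command that follows only ever sees lines 2 and 4.
out_n=$(printf '%s\n' "$input" | sed 'n;s/^/X/')

# N: keep the pattern space and APPEND a newline plus the next line.
# ^ matches only at the start of the two-line buffer, so the X
# lands in front of lines 1 and 3.
out_N=$(printf '%s\n' "$input" | sed 'N;s/^/X/')

printf 'with n:\n%s\n\nwith N:\n%s\n' "$out_n" "$out_N"
```

All four lines show up in both outputs, which is why the two commands look alike at first glance. The difference is in what the following commands get to work on: one line at a time after n, both lines together after N.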
The pattern space usually contains just a single line, but if you have used the "N" command (or other means) to put more than one line in the pattern space, then you can use the "D" or "P" command to delete or print the portion of that buffer up to the first embedded newline. After "D", if anything is left in the pattern space, sed restarts the cycle with what remains instead of reading a new line of input.
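A hedged sketch of P and D in action follows, with made-up input and variable names. It uses `$!N` rather than a bare N, because sed implementations differ in what they do when N runs on the last line (GNU sed prints the pattern space before exiting, most others exit without printing).

```shell
# p prints the whole two-line buffer; P prints only up to the first newline.
out_p=$(printf 'one\ntwo\n' | sed -n 'N;p')
out_P=$(printf 'one\ntwo\n' | sed -n 'N;P')

# $!N;P;D is a sliding two-line window:
#   $!N  append the next line (skipped on the last line)
#   P    print the first line of the window
#   D    delete that first line and restart the cycle with the rest
# On its own it passes the input through unchanged.
out_window=$(printf 'a\na\nb\n' | sed '$!N;P;D')

# Same window, but P is skipped when both lines in it are identical,
# which drops consecutive duplicate lines.
out_uniq=$(printf 'a\na\nb\n' | sed '$!N;/^\(.*\)\n\1$/!P;D')

printf 'p:\n%s\n\nP:\n%s\n\nwindow:\n%s\n\nuniq-like:\n%s\n' \
    "$out_p" "$out_P" "$out_window" "$out_uniq"
```

The window version is the useful pattern to remember: because D restarts the cycle with the leftover line still in the pattern space, every pair of neighbouring lines gets looked at together exactly once.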
$ echo "cat dog cat dog cat dog" | sed 's/cat/dog/'
dog dog cat dog cat dog
$ echo "cat dog cat dog cat dog" | sed 's/cat/dog/g'
dog dog dog dog dog dog
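Without the g flag, 's///' replaces only the first match on each line. With the g flag, it replaces ALL of them.
That's a different context: there g is a flag of the s command. Used as a command on its own, g replaces the contents of the pattern space with the contents of the hold space, allowing some juggling of data. In the example below, h at the end of the script saves the pattern space into the hold space on every cycle. When a later line matches the regexp, g fetches the previously saved line and p prints it (1!p skips the print when the match is on the very first line, where the hold space is still empty).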
Code:
# print the line immediately before a regexp, but not the line
# containing the regexp
sed -n '/regexp/{g;1!p;};h'
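To try that one-liner out, here is a sketch with made-up sample input, where "foo" stands in for the regexp:

```shell
# Lines 2 and 4 match /foo/, so the lines just before them are printed.
before=$(printf 'alpha\nfoo\nbeta\nfoo\n' | sed -n '/foo/{g;1!p;};h')

printf '%s\n' "$before"
```

Neither "foo" line appears in the output: by the time p runs, g has already overwritten the matching line with the one saved in the hold space.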
You're doing the first step that I do, which is to pick one ... and research it on the web.
But the second step is to try out and experiment with all the possibilities for just that one thing, via web searches, starting from the simplest first, until you arrive at one tiny snippet of code containing the *MOST* minimal way to demonstrate what you don't understand, and then post that in CODE tags. Maybe add key excerpts (quotes) from the man pages or docs, or links. Try it: LQ'ers will help you get the knack of this process!
Also, step "zero" is to believe! IF you believe you can't, you won't! Research the philosophy: thoughts manifest!
It's great to see someone studying; just try to keep the LQ gurus happy, though.
If you just say "your darn space shuttle is way too hard for me to pilot", you will just get emotional responses in kind. If instead you post a fair-to-good step two as described above, you will get good-to-great answers in return, and your learning goal will be achieved.