$ echo "my documents here that are made (bla).doc" | awk -F ' +[(]bla' '{print $1}'
my documents here that are made
$ echo "my documents here that are made (bla).doc" | sed 's/ *(bla.*//'
my documents here that are made
Why not use a field separator, print everything before it, and store it in a variable that you can manipulate later if needed?
Code:
var=$(echo "my documents here that are made (bla).doc" | awk -F"(" '{print $1}'); echo $var
I can't see sed as "ugly". It's just a tool that applies regular expressions to lines of text. Quite simple and efficient, for the most part. I admit that sometimes the expressions it uses can get a bit complex, but that's a different thing, and awk can also be just as cryptic, depending on the job.
sed also has an advantage over awk in that awk's default field splitting doesn't preserve multiple whitespace characters.
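To illustrate the whitespace point, here is a minimal sketch. The sample string with doubled spaces and the variable names are made up for the example. Assigning to any field makes awk rebuild the record with single-space separators, while sed leaves the unmatched part of the line alone.

```shell
# Hypothetical input with runs of spaces between the words.
line='my  documents   here (bla).doc'

# awk: modifying a field rebuilds $0, joining the fields with OFS (one space).
awk_out=$(printf '%s\n' "$line" | awk '{ $NF = ""; print }')

# sed: only the matched text is changed; the remaining spacing is kept.
sed_out=$(printf '%s\n' "$line" | sed 's/ *[^ ]*$//')

# Brackets make leading and trailing spaces visible.
printf '[%s]\n' "$awk_out" "$sed_out"
```

The awk result has its inner spacing collapsed and keeps a trailing space; the sed result keeps the original spacing.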
The awk solution you gave does avoid that, since it uses a non-whitespace delimiter, but it now depends on there being a parenthesis in the line, which is not necessarily a given according to the OP's description. Indeed, he specifically stated that he wanted to remove the last space-delimited field.
Speaking of which, both your solution and grail's end up leaving that extra space tacked onto the end of the output. This is probably unwanted behavior.
Finally, if we're going to store the value in a variable anyway, just use the parameter substitution I gave earlier. It's even cleaner and much more efficient than either of the external tools.
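The parameter substitution referred to is not shown in this part of the thread. A minimal sketch, assuming it is the POSIX `${var% *}` suffix-removal form (the variable names are hypothetical):

```shell
# Hypothetical variable holding the file name from the examples above.
name='my documents here that are made (bla).doc'

# ${name% *} removes the shortest suffix matching " *", that is, the last
# space and everything after it. No external process is started.
trimmed=${name% *}

printf '[%s]\n' "$trimmed"
```

This leaves no trailing space, since the space is part of the removed suffix.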
I am fairly new to awk... could you explain why that works please?
As David has mentioned, the default FS in awk is white space, and all fields left after splitting on white space are referenced by number. Like FS, NF is another awk variable, equal to the number of fields created. By placing the $ sign in front of NF we reference the last field in the list, and we then set it to null.
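A small sketch of what NF and $NF hold for the sample line. The variable names are made up, and the assignment `$NF = ""` is assumed to be the form being explained here.

```shell
line='my documents here that are made (bla).doc'

# NF is the number of fields; $NF is the content of the last field.
count=$(printf '%s\n' "$line" | awk '{ print NF }')
last=$(printf '%s\n' "$line" | awk '{ print $NF }')

# Blank the last field, then print the rebuilt record.
rest=$(printf '%s\n' "$line" | awk '{ $NF = ""; print }')

# Brackets make the trailing space in the last result visible.
printf 'fields=%s last=[%s] rest=[%s]\n' "$count" "$last" "$rest"
```

The separator in front of the emptied field is still printed, which is where the trailing space comes from.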
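As an addition to what David has already mentioned, the awk output, once assigned to a variable and echoed unquoted as above, would also lose the pesky space at the end. (The assignment itself keeps the space; it is the shell's word splitting on the unquoted $var that drops it.)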
If the data needs to be delivered without assignment, we could change it like so:
Code:
echo "my documents here that are made (bla).doc" | awk '$NF="\010"'
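One caveat with the backspace approach: the space and the backspace character are both still in the output; they only cancel out visually on a terminal. A minimal sketch of an alternative, assuming the goal is to actually delete the last field together with the space in front of it. Using sub() is a swapped-in technique, not something proposed in the thread.

```shell
line='my documents here that are made (bla).doc'

# Original trick: the last field becomes a backspace (octal 010).
bs_out=$(printf '%s\n' "$line" | awk '$NF="\010"')

# Alternative: delete the last run of spaces plus the final field from $0.
clean=$(printf '%s\n' "$line" | awk '{ sub(/ +[^ ]*$/, ""); print }')

printf '[%s]\n' "$clean"
```

Piping the first form through `od -c` shows the leftover space and `\b` bytes.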
awk with NF is on the right track, but we need to add -F " " to define the space as the delimiter.
Not sure what you are trying to get at here? The awk solutions do work, all to differing levels I will agree. Also, why would you need to reset the delimiter when it already defaults to white space?
I have a question for David. Adding double quotes around the output with awk would be trivial: '{ print "\"" $1 "\"" }'. But how would you add double quotes to the sed example you provided?
Code:
echo "my documents here that are made (bla).doc" | sed 's/ [^ ]*$//'
echo "my documents here that are made (bla).doc" | sed -r 's/(.*) [^ ]*$/"\1"/'
I would add that setting the delimiter in awk to exactly one space will not always yield the same results, as the default FS is unique in that it gobbles up all white space and also strips any from the start of the record. If you are using leading spaces to indicate that several fields are missing at the start, that is not what you would want. (Note that -F " " is itself treated as the default; a literal single space needs something like -F '[ ]'.) I do not see the issue occurring in the current case / example.
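A minimal sketch of the difference, using a made-up record with two leading spaces. It relies on POSIX awk behavior: a field separator of a single space means "runs of white space, leading blanks ignored", while the bracket expression `[ ]` means exactly one space.

```shell
# Hypothetical record with two leading spaces.
line='  alpha beta'

# Default FS: leading blanks are skipped, so the first field is "alpha".
default_first=$(printf '%s\n' "$line" | awk '{ print $1 }')

# -F ' ' is the same as the default.
space_first=$(printf '%s\n' "$line" | awk -F ' ' '{ print $1 }')

# A bracket expression makes the separator exactly one space, so each
# leading space produces an empty field.
literal_count=$(printf '%s\n' "$line" | awk -F '[ ]' '{ print NF }')
literal_first=$(printf '%s\n' "$line" | awk -F '[ ]' '{ print $1 }')

printf 'default=[%s] space=[%s] literal: NF=%s first=[%s]\n' \
    "$default_first" "$space_first" "$literal_count" "$literal_first"
```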
grail has answered the last one for me. Modifying the output now means extracting a substring and adding to it, rather than just stripping off the unwanted part and printing the rest. So we use a set of capturing parentheses and a \n backreference (here \1) to extract the desired part and print it with quote marks attached*.
The regex can also be made slightly simpler now, however. Due to the greedy behavior of "*", it's not necessary to use a negating character class or an anchor.
Code:
echo "my documents here that are made (bla).doc" | sed -r 's/(.*) .*/"\1"/'
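A quick check that the two expressions agree on the sample line. This is only a sketch: the variable names are made up, and `-E` is used in place of `-r` because both GNU and BSD sed accept it.

```shell
line='my documents here that are made (bla).doc'

# Anchored form: the last space followed by a run of non-spaces up to
# the end of the line.
anchored=$(printf '%s\n' "$line" | sed -E 's/(.*) [^ ]*$/"\1"/')

# Greedy form: the first .* grabs as much as it can, so the literal
# space in the pattern can only be the last space on the line.
greedy=$(printf '%s\n' "$line" | sed -E 's/(.*) .*/"\1"/')

printf '%s\n%s\n' "$anchored" "$greedy"
```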
This is all standard regex stuff. You really should study up on it. You'll be glad you did. Learning how to use regular expressions effectively is, IMO, the single most useful thing I studied when learning scripting.
*You do need to be aware of how the shell processes quotes here too. The single quotes around the entire expression escape the double quotes inside them, so that the shell passes them on literally to sed.
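To make the quoting footnote concrete, here is a sketch of the same sed program written with single and with double quotes around it. The variable names are hypothetical, and `-E` stands in for `-r`.

```shell
line='my documents here that are made (bla).doc'

# Single quotes: the inner double quotes and the backslash reach sed untouched.
single=$(printf '%s\n' "$line" | sed -E 's/(.*) .*/"\1"/')

# Double quotes: the inner double quotes and the backslash must be escaped
# so that the shell hands sed the same program text.
double=$(printf '%s\n' "$line" | sed -E "s/(.*) .*/\"\\1\"/")

printf '%s\n%s\n' "$single" "$double"
```

Both forms give the same result; the single-quoted one is simply easier to read.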
Sorry for hijacking the thread lol, but I haven't seen the OP ask any questions. So, while I have 2 great members already here, I figured I'd ask a question. I have been reading tutorials on sed and this tool is crazy awesome! There is so much to take in and memorize! I have a good understanding of awk already, but sed seems so much more powerful! I could do more with less!
So, the question is: what would you recommend as a beginner sed project to reinforce the rules and tricks of the language/tool?
I have a good understanding of awk ... What would you recommend as a beginner Sed project to reinforce the rules and tricks of the language/tool?
Suggestion: take any code you wrote which contains non-trivial awks and write a derivative version in which some of those awks are replaced with functionally equivalent seds. Then, make careful timings to determine which version runs faster. Compare the code to decide which version is more readable. Post your results on this forum.