LinuxQuestions.org
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Old 07-28-2012, 12:54 AM   #1
patrick295767
Member
 
Registered: Feb 2006
Distribution: FreeBSD, Linux, Slackware, LFS, Gparted
Posts: 664

Rep: Reputation: 138
awk to remove the end considering the last field?


Hi,

I would like to use awk to remove everything after the last matching field:

Code:
echo "my documents here that are made (bla).doc" | awk ...
Here, the field separator for the example would be the space between "made" and "(bla)".

Desired output:
Code:
"my documents here that are made"
Any ideas would be very welcome !

Thanks

Last edited by patrick295767; 07-28-2012 at 12:55 AM.
 
Old 07-28-2012, 01:21 AM   #2
firstfire
Member
 
Registered: Mar 2006
Location: Ekaterinburg, Russia
Distribution: Debian, Ubuntu
Posts: 709

Rep: Reputation: 428
Hi.

Code:
$ echo "my documents here that are made (bla).doc" | awk -F ' +[(]bla' '{print $1}'
my documents here that are made
$ echo "my documents here that are made (bla).doc" | sed 's/ *(bla.*//'
my documents here that are made
sed looks better suited for this problem.
 
Old 07-28-2012, 02:51 AM   #3
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
Not sure if the data at the end needs to be checked, but if not:
Code:
echo "my documents here that are made (bla).doc" | awk '$NF="\0"'
 
Old 07-28-2012, 10:15 AM   #4
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
grail gave you awk. Just set $NF to null and print.

firstfire gave you sed as well, although I'd make the regex a bit more generic: just strip everything from the last space to the end.

Code:
echo "my documents here that are made (bla).doc" | sed 's/ [^ ]*$//'
Or if the string is, or can be, stored in a shell variable first, then a simple parameter substitution can also be used.

Code:
$ text="my documents here that are made (bla).doc"
$ echo "${text% *}"
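A small sketch of how the parameter substitution above behaves, in case it helps. The variable names `short`, `long` and `kept` are only for illustration. `%` strips the shortest matching suffix and `%%` the longest, and a string with no space in it is left untouched:

```shell
text="my documents here that are made (bla).doc"

# ${text% *} removes the shortest suffix matching " *": only the last word goes.
short="${text% *}"

# ${text%% *} removes the longest such suffix: everything after the first word goes.
long="${text%% *}"

# With no space in the string the pattern cannot match, so nothing is removed.
nospace="file.doc"
kept="${nospace% *}"

printf '[%s]\n' "$short" "$long" "$kept"
```

The brackets in the printf format are just there to make any stray leading or trailing space visible.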
 
Old 07-28-2012, 11:16 PM   #5
dru8274
Member
 
Registered: Oct 2011
Location: New Zealand
Distribution: Debian
Posts: 105

Rep: Reputation: 37
Quote:
Originally Posted by grail
Code:
echo "my documents here that are made (bla).doc" | awk '$NF="\0"'
I am fairly new to awk... could you explain why that works please? TIA.
 
Old 07-28-2012, 11:48 PM   #6
amboxer21
Member
 
Registered: Mar 2012
Location: New Jersey
Distribution: Gentoo
Posts: 291

Rep: Reputation: Disabled
I do not like sed. It's way too ugly IMO.

Why not use a field separator, print everything before it, and store it in a variable that you can manipulate later if needed?

Code:
var=$(echo "my documents here that are made (bla).doc" | awk -F"(" '{print $1}'); echo $var
EDIT:
If you want quotes, then:
Code:
echo "my documents here that are made (bla).doc" | awk -F"(" '{print "\""$1"\""}'

Last edited by amboxer21; 07-29-2012 at 08:39 PM.
 
Old 07-29-2012, 04:27 AM   #7
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Quote:
Originally Posted by amboxer21
I do not like sed. It's way too ugly IMO.

Why not use a field separator, print everything before it, and store it in a variable that you can manipulate later if needed?

Code:
var=$(echo "my documents here that are made (bla).doc" | awk -F"(" '{print $1}'); echo $var
I can't see sed as "ugly". It's just a tool that applies regular expressions to lines of text. Quite simple and efficient, for the most part. I admit that sometimes the expressions it uses can get a bit complex, but that's a different thing, and awk can also be just as cryptic, depending on the job.

sed also has an advantage over awk here: as soon as a field is modified, awk rebuilds the line with single spaces between the fields, so runs of multiple whitespace characters are not preserved.

The awk solution you gave does avoid that, since it uses a non-whitespace delimiter, but it now depends on there being a parenthesis in the line, which is not necessarily a given according to the OP's description. Indeed, he specifically stated that he wanted to remove the last space-delimited field.

Speaking of which, both your and grail's solutions end up leaving that extra space tacked onto the end of the output. This is probably unwanted behavior.
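To make the two points above concrete, here is a rough comparison. The sample line with three spaces after "my" is made up for the illustration, and the awk command is the explicit `{ $NF = ""; print }` form rather than grail's exact one-liner:

```shell
line='my   documents here (bla).doc'   # three spaces after "my"

# Assigning to any field makes awk rebuild $0 with OFS (a single space by default),
# and the emptied last field still leaves its separator behind.
awk_out=$(printf '%s\n' "$line" | awk '{ $NF = ""; print }')

# sed only edits the part that matched, so the other spacing is left alone.
sed_out=$(printf '%s\n' "$line" | sed 's/ [^ ]*$//')

printf '[%s]\n' "$awk_out" "$sed_out"
```

The awk result has its inner spacing squeezed and a trailing space inside the brackets, while the sed result keeps the three spaces and ends cleanly.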

Finally, if we're going to store the value in a variable anyway, just use the parameter substitution I gave earlier. It's even cleaner and much more efficient than either of the external tools.
 
Old 07-29-2012, 10:50 AM   #8
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
Quote:
Originally Posted by dru8274
I am fairly new to awk... could you explain why that works please?
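As David has mentioned, the default FS in awk is white space, and all fields left after splitting on white space are referenced by number. Like FS, NF is another awk variable, equal to the number of fields created. By placing the $ sign in front of NF we reference the last field in the list and set it to null. Because the assignment stands alone as the pattern and its result is a non-empty string (at least in gawk, where "\0" is a one-character string), awk treats it as true and performs the default action, which is to print the rebuilt line.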
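For anyone who finds the pattern-only form hard to read, the same idea can be spelled out with an explicit action. This is only a sketch: the `sub()` variant is an alternative that is not from the thread, added here on the assumption that the goal is simply "drop the last whitespace-delimited word".

```shell
input="my documents here that are made (bla).doc"

# Explicit form: empty the last field, then print the rebuilt line.
# The separator before the emptied field remains, so a trailing space is left.
explicit=$(printf '%s\n' "$input" | awk '{ $NF = ""; print }')

# Alternative: delete the last word together with the blanks in front of it.
# No field is assigned, so $0 is not rebuilt and other spacing is untouched.
trimmed=$(printf '%s\n' "$input" | awk '{ sub(/[ \t]+[^ \t]*$/, ""); print }')

printf '[%s]\n' "$explicit" "$trimmed"
```

The first line of output shows the trailing space inside the brackets; the second does not have it.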

As an addition to what David has already mentioned, the awk output, once assigned to a variable and echoed unquoted (as in echo $var), also loses the pesky space at the end, because the shell's word splitting drops it.

If the data needs to be delivered without assignment, we could change it like so:
Code:
echo "my documents here that are made (bla).doc" | awk '$NF="\010"'
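One caveat worth checking before relying on the backspace version: "\010" only hides the space on a terminal. A quick byte count (a sketch, with the variable name `hack` chosen for the illustration) suggests the space and the backspace are both still in the data, which matters if the output is piped or saved:

```shell
input="my documents here that are made (bla).doc"

hack=$(printf '%s\n' "$input" | awk '$NF="\010"')

# 31 visible characters, plus the leftover space, plus the backspace byte.
printf '%s' "$hack" | wc -c

# od shows the final bytes, including the space and \b.
printf '%s' "$hack" | od -c | tail -n 3
```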
 
Old 07-29-2012, 02:50 PM   #9
patrick295767
Member
 
Registered: Feb 2006
Distribution: FreeBSD, Linux, Slackware, LFS, Gparted
Posts: 664

Original Poster
Rep: Reputation: 138
Lots of attempts; it looks like it takes a fair amount of experience to do this with awk.

If I recall correctly, it is quite possible with awk.

| awk with NF is on the right track, and we need to add -F " " to define the space as the delimiter.
 
Old 07-29-2012, 08:30 PM   #10
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
Quote:
| awk with NF is on the right track, and we need to add -F " " to define the space as the delimiter.
Not sure what you are trying to get at here? The awk solutions do work, all to differing levels I will agree. Also, why would you need to reset the delimiter when it is already defaulting to white space?
 
Old 07-29-2012, 08:43 PM   #11
amboxer21
Member
 
Registered: Mar 2012
Location: New Jersey
Distribution: Gentoo
Posts: 291

Rep: Reputation: Disabled
Grail is right

this:
Code:
awk -F" " '{ }'
is the same as:
Code:
awk '{ }'
The default delimiter for awk is white space.

I have a question for David. Adding double quotes around the output with awk would be trivial: '{ print "\""$1"\""}'. But how would you add double quotes to your sed example?

Code:
echo "my documents here that are made (bla).doc" | sed 's/ [^ ]*$//'

Last edited by amboxer21; 07-29-2012 at 08:45 PM.
 
Old 07-30-2012, 02:30 AM   #12
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
If I may:
Code:
echo "my documents here that are made (bla).doc" | sed -r 's/(.*) [^ ]*$/"\1"/'
I would add that setting the delimiter in awk to exactly one space (which takes a regex such as -F'[ ]', since -F" " is simply the default) will not always yield the same results as the default FS. The default is unique in that it gobbles up all runs of white space and also strips any from the start of each record, which is not what you want if leading spaces are meant to indicate that several fields are missing at the start. I do not see the issue occurring in the current example.
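A quick way to see the difference is to count fields on a line with leading and doubled spaces. The sample line and variable names below are invented for the illustration; the bracket expression is used because a plain " " is the default value of FS and therefore behaves like no -F at all:

```shell
line='  a  b'   # two leading spaces, two spaces between the letters

# Default FS: leading blanks are ignored and runs of blanks count as one separator.
default_nf=$(printf '%s\n' "$line" | awk '{ print NF }')

# -F" " is the same thing: a single space is the default value of FS.
space_nf=$(printf '%s\n' "$line" | awk -F" " '{ print NF }')

# A bracket expression means "exactly one space", so empty fields appear.
single_nf=$(printf '%s\n' "$line" | awk -F'[ ]' '{ print NF }')

echo "$default_nf $space_nf $single_nf"   # prints: 2 2 5
```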
 
Old 07-30-2012, 12:10 PM   #13
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
grail has answered the last one for me. Modifying the output now means extracting a substring and adding to it, rather than just stripping off the unwanted part and printing the rest. So we use a set of capturing parentheses and a \1 backreference to extract it and print out the desired part with quote marks attached*.

The regex can also be made slightly simpler now, however. Due to the greedy behavior of "*", it's not necessary to use a negating character class or an anchor.
Code:
echo "my documents here that are made (bla).doc" | sed -r 's/(.*) .*/"\1"/'
This is all standard regex stuff. You really should study up on it. You'll be glad you did. Learning how to use regular expressions effectively is, IMO, the single most useful thing I studied when learning scripting.
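If you want to convince yourself that the two expressions agree, a side-by-side run like the following should do. It is a sketch that swaps in `-E` for `-r`, on the assumption that your sed accepts `-E` (GNU and BSD sed both do); with GNU sed, `-r` behaves the same way.

```shell
input="my documents here that are made (bla).doc"

# Anchored version with a negated character class.
anchored=$(printf '%s\n' "$input" | sed -E 's/(.*) [^ ]*$/"\1"/')

# Shorter version: the first .* is greedy, so it already runs up to the last space.
greedy=$(printf '%s\n' "$input" | sed -E 's/(.*) .*/"\1"/')

printf '%s\n' "$anchored" "$greedy"
```

Both lines of output should be the quoted string "my documents here that are made".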

Here are a few regular expressions tutorials:
http://mywiki.wooledge.org/RegularExpression
http://www.grymoire.com/Unix/Regular.html
http://www.regular-expressions.info/


*You do need to be aware of how the shell processes quotes here too. The single quotes around the entire expression escape the double quotes inside them, so that the shell passes them on literally to sed.

http://mywiki.wooledge.org/Arguments
http://mywiki.wooledge.org/WordSplitting
http://mywiki.wooledge.org/Quotes
 
Old 07-31-2012, 07:31 PM   #14
amboxer21
Member
 
Registered: Mar 2012
Location: New Jersey
Distribution: Gentoo
Posts: 291

Rep: Reputation: Disabled
Sorry for hijacking the thread lol, but I haven't seen the OP ask any more questions. So, while I have 2 great members already here, I figured I'd ask a question. I have been reading tutorials on sed and this tool is crazy awesome! There is so much to take in and memorize! I have a good understanding of awk already but sed seems so much more powerful! I could do more with less!

So, the question is: what would you recommend as a beginner sed project to reinforce the rules and tricks of the language/tool?
 
Old 07-31-2012, 08:20 PM   #15
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Quote:
Originally Posted by amboxer21
I have a good understanding of awk ... What would you recommend as a beginner Sed project to reinforce the rules and tricks of the language/tool?
Suggestion: take any code you wrote which contains non-trivial awks and write a derivative version in which some of those awks are replaced with functionally equivalent seds. Then, make careful timings to determine which version runs faster. Compare the code to decide which version is more readable. Post your results on this forum.

Daniel B. Martin
 
  


All times are GMT -5. The time now is 11:41 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration