Guess what, it's wrong. This rule doesn't work if the four-digit number is at the end of a line. My bad, but it illustrates the importance of testing sed scripts before using them.
Here I added a second rule. You can have more than one rule on the same line by separating them with a semicolon. For more complicated sed instructions, create a separate file.
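As a rough sketch of the "separate file" idea, the two rules can go into a script file, one per line, and be passed to sed with -f. The temporary file and the helper name comma_before_year are made up for illustration; the rules themselves are the ones from this thread.

```shell
# Hypothetical temporary file holding the two rules, one per line.
rules=$(mktemp)
trap 'rm -f "$rules"' EXIT

cat > "$rules" <<'EOF'
# Rule 1: a space, four digits, then a non-digit character.
s/ \([[:digit:]]\{4\}[^[:digit:]]\)/,\1/g
# Rule 2: a space, then four digits at the end of the line.
s/ \([[:digit:]]\{4\}\)$/,\1/
EOF

# Hypothetical helper: apply the rule file to standard input.
comma_before_year() {
    sed -f "$rules"
}

printf 'wrote this book 2007\n' | comma_before_year
```

With the rules in a file, each one can carry its own comment, which gets hard to do once several rules are squeezed onto one command line.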
You didn't make clear whether the number needs to be exactly 4 digits. You only gave one example that resembled a year. When using sed, awk, or any regular expression, it is important to be as precise as you need to be. Otherwise you will either miss some replacements, like my first attempt, or get a false positive match, which could cause a replacement you don't want.
If you want to replace a space before any 4 digit number,
Thank you. I thought of that later. However, you're right. My true intent is to place a comma before a four digit string of numbers.
Quote:
Originally Posted by jschiwal
Code:
$ sed 's/ \([[:digit:]]\{4\}[^[:digit:]]\)/,\1/g;s/ \([[:digit:]]\{4\}\)$/,\1/' sample
I hadn't seen POSIX character classes until last night. It looks like the way to go. I was able to identify four-digit years with [0000-9999]; however, the replacement part didn't work out well. I can specify
Code:
sed 's/ .^[0000-9999]/ ,
but how do I keep the same numbers when I replace?
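A small check of what [0000-9999] really matches may help here. Inside brackets the expression is a set of characters, so it matches a single digit, the same as [0-9]. The helper names one_char and four_chars are made up for this sketch.

```shell
# Inside [...], 0000-9999 is just the characters 0 and 9 plus the range 0-9,
# so the whole bracket expression matches exactly one digit.
one_char() {
    sed 's/[0000-9999]/X/'
}

# Four digits need a repetition count on the bracket expression.
four_chars() {
    sed 's/[0-9]\{4\}/X/'
}

echo "year 2007" | one_char     # prints: year X007
echo "year 2007" | four_chars   # prints: year X
```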
Quote:
Originally Posted by jschiwal
You didn't make clear whether the number needs to be exactly 4 digits. You only gave one example that resembled a year. When using sed, awk, or any regular expression, it is important to be as precise as you need to be. Otherwise you will either miss some replacements, like my first attempt, or get a false positive match, which could cause a replacement you don't want.
Thank you again. This was helpful. I'll try to deconstruct it.
On a side note:
What if I wanted the comma before any size string of numbers? Is there a way without :digit: or setting variables?
How do I specify a line break? (I want to replace a line break and three tabs with the last entry on the line that did not consist of tabs.)
Some Guy- wrote this book 2007
{tabx3} wrote this book 2006
{tabx3} wrote another book 2005
then becomes
Some Guy, book a, 2007
Some Guy, book b, 2006
Some Guy, book c, 2005
Thanks again. I've been reading through O'Reilly's "Learning Sed and Awk", man pages, and some online articles, but the answers aren't always obvious (to me). Is there another resource that would be worth looking at?
sed 's/ .^[0000-9999]/ ,
but how do I keep the same numbers when I replace?
You keep the number by using groups. Enclose whatever you want to keep in '\(' and '\)' (plain '(' and ')' if you use extended regular expressions), then call it back using '\1'. Multiple groups get successive numbers: \1, \2, \3, etc. For example:
Code:
$ echo "foobar"|sed 's/.*\(oo\).*/m\1/'
moo
The 'oo' is group #1; the whole match is replaced with 'm' + group 1, hence 'moo'.
This matches *space* then 0-9 (4 times). Only the 4 numbers are put into group one. The whole match (space + number) is replaced with ', '+group 1
Hope that makes sense..
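The description above (space, then 0-9 four times, with only the digits in group one) might be written roughly like this. It is a reconstruction from that description, not necessarily the exact command that was posted, and the function name is made up for illustration.

```shell
# Assumed reconstruction: a space, then exactly four digits captured as group 1.
# The whole match (space + digits) is replaced with ", " followed by group 1.
comma_before_four_digits() {
    sed 's/ \([0-9]\{4\}\)/, \1/'
}

echo "wrote this book 2007" | comma_before_four_digits   # prints: wrote this book, 2007
```

Note that this pattern does not check what follows the four digits, so it would also fire on the first four digits of a longer number.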
Quote:
Originally Posted by donnied
What if I wanted the comma before any size string of numbers? Is there a way without :digit: or setting variables?
How do I specify a line break? (I want to replace a line break and three tabs with the last entry on the line that did not consist of tabs.)
1. [[:digit:]] == [0-9]. Not using either of those two will just be difficult. [0-9]+ matches one or more repetitions of [0-9] in extended regular expressions (awk, or sed -E). In sed's default basic syntax, write [0-9]\{1,\} or [0-9][0-9]* instead.
2. ^ matches the start of a line and $ matches the end of a line. '^some guy' matches lines that start with 'some guy'.
I'm not too good with awk/sed, but I'll try to give you an awk solution for your example in a bit... if I figure it out.
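For point 1, a minimal sketch of the "any size string of numbers" case using plain [0-9] with a \{1,\} count, so no [[:digit:]] and no variables are needed. The function name is invented for the example.

```shell
# A space followed by one or more digits; the digits are kept as group 1.
# \{1,\} is the basic-regex way to say "one or more".
comma_before_number() {
    sed 's/ \([0-9]\{1,\}\)/, \1/g'
}

echo "vol 3 printed 2007" | comma_before_number   # prints: vol, 3 printed, 2007
```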
Quote:
Originally Posted by donnied
Thanks again. I've been reading through O'Reilly's "Learning Sed and Awk", man pages, and some online articles, but the answers aren't always obvious (to me). Is there another resource that would be worth looking at?
Those are the topics at the irc.freenode.net #awk and #sed channels, which are another great source of info if you're ever stuck trying to figure something out.
Edit:
woohoo! I did it.
Code:
$ cat sample
Some Guy- wrote this book 2007
wrote this book 2006
wrote another book 2005
other guy- wrote this book 2008
book b 2009
$ awk -F'- ' 'BEGIN {OFS="- "}
{if (!/^\t\t\t/) name=$1;
else {sub("^\t\t\t", "", $0);$2=$0;$1=name}}
{gsub(" [0-9][0-9][0-9][0-9]", ",&",$2);print $0}' "sample"
Some Guy- wrote this book, 2007
Some Guy- wrote this book, 2006
Some Guy- wrote another book, 2005
other guy- wrote this book, 2008
other guy- book b, 2009
Last edited by angrybanana; 09-30-2007 at 08:33 PM.
I've been reading through O'Reilly's "Learning Sed and Awk", man pages, and some online articles, but the answers aren't always obvious (to me). Is there another resource that would be worth looking at?
Are you looking for the answers in that book, or are you learning how to get to the answers?
You keep the number by using groups. Enclose whatever you want to keep in '\(' and '\)' (plain '(' and ')' if you use extended regular expressions), then call it back using '\1'. Multiple groups get successive numbers: \1, \2, \3, etc. For example:
I'm not too good with awk/sed, but I'll try to give you an awk solution for your example in a bit... if I figure it out.
Wow! Thank you for the information. You explained really well. That was an amazing amount of work you did. I appreciate it and it has helped my understanding.
I'll have to look at the example I posted to see why the 2007 isn't handled. I think that the "[^[:digit:]]" gobbles up the space before the next number. Adding the first command again solves the problem.
Code:
sed 's/ \([0-9]\{4\}[^0-9]\)/,\1/g;s/ \([0-9]\{4\}[^0-9]\)/,\1/g;s/ \([[:digit:]]\{4\}\)$/,\1/' sample
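Repeating the first command covers two numbers in a row, but a longer run would need it repeated again. One possible alternative, sketched here as an assumption rather than anything posted in this thread, is to let sed loop with a label and the t command until the rule stops matching. The function name and the label are made up.

```shell
# Loop: replace one " NNNN<non-digit>" at a time, then retry from the label
# as long as a substitution was made. The final rule handles a four-digit
# number at the end of the line.
comma_before_years() {
    sed -e ':again' \
        -e 's/ \([0-9]\{4\}\)\([^0-9]\)/,\1\2/' \
        -e 't again' \
        -e 's/ \([0-9]\{4\}\)$/,\1/'
}

echo "books 2007 2006 2005" | comma_before_years   # prints: books,2007,2006,2005
```

Each pass removes one space in front of a four-digit number, so the loop always ends, and numbers with more than four digits are left alone.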