sed help plz

rjkfsm · 05-12-2009, 10:57 AM

I am looking to use sed in a script. I have searched through the docs and cannot find what I am looking for.

If I have an email address like frustrated_sed_user@smtp1.mail.linux.com, how do I get the topmost domain out (i.e. linux.com)?

RK

MensaWater · 05-12-2009, 11:01 AM

awk is a better tool for extracting fields:

Code:

echo frustrated_sed_user@smtp1.mail.linux.com |awk -F. '{print $3"."$4}'

The "-F." is telling awk to use dot as the field separator instead of white space. The dot in quotes in the print section is adding the dot back to the output as awk stripped it out when it broke the fields up. Fields 3 and 4 are being printed before and after the dot respectively.

rjkfsm · 05-12-2009, 11:36 AM

Well, I can tell I'm warmer...

How do I do a reverse search?

If I do:

Code:

echo frustrated_sed_user@smtp1.mail.linux.com |awk -F. '{print $3"."$4}'

I get linux.com just like I want, but if I have:

Code:

echo frustrated_sed_user@smtp1.linux.com |awk -F. '{print $3"."$4}'

I get com.

Further clarification: I'm looking for a more general email address filter, not something that only works with that one string.

Oh and thank you very much for your reply. I think awk may be what I need, but geez that's a lot of documentation.

RK

MensaWater · 05-12-2009, 11:42 AM

Code:

echo frustrated_sed_user@smtp1.mail.linux.com |awk -F. '{print $(NF-1)"."$NF}'

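NF is the number of fields, so $NF is the last field and $(NF-1) is the field before it. It should work for any address that has at least one subdomain in front of the domain you want. For a bare user@linux.com there is only one dot, so the part before the @ ends up in $(NF-1) and you would need to split on @ as well.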
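
A rough sketch of that as a reusable filter, in case it helps. The function name top_domain is made up for illustration, and splitting on both dot and @ with -F'[.@]' is a variation on the command above, assumed here so that a bare user@linux.com or a local part containing dots still comes out as linux.com. The NF >= 2 check is also an addition; it just skips lines with fewer than two fields.

```shell
# top_domain is a hypothetical helper name, not something from awk itself.
# Reads one address per line on stdin and prints the last two fields,
# where fields are separated by either "." or "@".
top_domain() {
    awk -F'[.@]' 'NF >= 2 { print $(NF-1) "." $NF }'
}

printf '%s\n' \
    frustrated_sed_user@smtp1.mail.linux.com \
    frustrated_sed_user@smtp1.linux.com \
    frustrated_sed_user@linux.com | top_domain
```

All three lines should come out as linux.com.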

pixellany · 05-12-2009, 02:12 PM

Really good tutorials here---SED, AWK, and more:
http://www.grymoire.com/Unix/

Beats reading man pages.....

syg00 · 05-12-2009, 04:55 PM

Being not well versed in awk, I only use it when data is (extremely) well structured - as in the cases above.
Must admit I have a leaning toward regex in that it can be used to extract data from anywhere in a record - building a (fool-proof) regex for this could get challenging though.
Perl might be a better option than sed.
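
Since the original question asked for sed, here is a minimal sketch of what a regex for it might look like. It is not fool-proof: it assumes one address per line with nothing after it, and the function name sed_top_domain is hypothetical. It uses plain basic regular expressions, so no -r or -E switch is needed.

```shell
# sed_top_domain is a hypothetical helper name.
# .*[.@]        greedy: everything up to the last "." or "@" that still
#               leaves "name.name" at the end of the line
# \( ... \)     captures "name.name", where neither name contains "." or "@"
# Lines that do not match are printed unchanged.
sed_top_domain() {
    sed 's/.*[.@]\([^.@]*\.[^.@]*\)$/\1/'
}

printf '%s\n' \
    frustrated_sed_user@smtp1.mail.linux.com \
    frustrated_sed_user@linux.com | sed_top_domain
```

Both lines should print linux.com.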

Kenhelm · 05-12-2009, 06:56 PM

Another method

Code:

echo '
frustrated_sed_user@smtp1.mail.linux.com
frustrated_sed_user@smtp1.linux.com
frustrated_sed_user@linux.com' | grep -o '[^.@]*\.[^.]*$'

linux.com
linux.com
linux.com
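
For anyone reading along, here is the same grep with the pattern pulled apart in comments. The wrapper name grep_top_domain is only for illustration. Note that -o is available in GNU and BSD grep but is not part of POSIX, so this assumes one of those.

```shell
# grep_top_domain is a hypothetical wrapper name.
# -o        print only the part of the line that matched
# [^.@]*    a run of characters that are neither "." nor "@"   (linux)
# \.        a literal dot
# [^.]*$    a run of non-dot characters up to end of line      (com)
grep_top_domain() {
    grep -o '[^.@]*\.[^.]*$'
}

printf '%s\n' \
    frustrated_sed_user@smtp1.mail.linux.com \
    frustrated_sed_user@linux.com | grep_top_domain
```

Because @ is excluded from the first run, the match cannot reach back into the user part, which is why the bare user@linux.com case works here.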

H_TeXMeX_H · 05-13-2009, 05:36 AM

I wouldn't use awk here, but you can. Here's how I would do it to make it more useful:

Code:

bash-3.1$ echo frustrated_sed_user@smtp1.mail.linux.com | rev | cut -d . -f 1
moc
bash-3.1$ echo frustrated_sed_user@smtp1.mail.linux.com | rev | cut -d . -f 1 | rev
com
bash-3.1$ echo frustrated_sed_user@smtp1.mail.linux.com | rev | cut -d . -f 2 | rev
linux
bash-3.1$ echo frustrated_sed_user@smtp1.mail.linux.com | rev | cut -d . -f 1-2 | rev
linux.com
bash-3.1$ echo frustrated_sed_user@smtp1.linux.com | rev | cut -d . -f 1-2 | rev
linux.com

So basically using 'rev' is a good idea here. It reverses lines character by character.
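
The same thing as a small function, as a sketch only. The name rev_top_domain is made up, and the leading cut -d @ -f 2 is an addition not shown in the commands above: it drops the user part first, on the assumption that each line holds a single address with exactly one @, so that user@linux.com does not come back whole.

```shell
# rev_top_domain is a hypothetical helper name.
# 1. cut -d @ -f 2     keep only what follows the "@" (assumes one "@" per line)
# 2. rev               reverse the line so the last fields become the first
# 3. cut -d . -f 1-2   keep the first two dot-separated fields
# 4. rev               reverse back into normal order
rev_top_domain() {
    cut -d @ -f 2 | rev | cut -d . -f 1-2 | rev
}

printf '%s\n' \
    frustrated_sed_user@smtp1.mail.linux.com \
    frustrated_sed_user@linux.com | rev_top_domain
```

Both lines should print linux.com.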

ghostdog74 · 05-13-2009, 06:35 AM

Quote:

Originally Posted by H_TeXMeX_H

I wouldn't use awk here, but you can. Here's how I would do it to make it more useful:

But here you make extra calls to rev and cut.

ghostdog74 · 05-13-2009, 06:39 AM

Quote:

Originally Posted by syg00

I have a leaning toward regex in that it can be used to extract data from anywhere in a record - building a (fool-proof) regex for this could get challenging though.

Well, I don't know why you think you can't extract data from anywhere in a record without regex.

H_TeXMeX_H · 05-13-2009, 07:12 AM

Quote:

Originally Posted by ghostdog74

but here, you make extra calls to rev, cut.

The only extra call is to 'rev'. You can use awk instead of cut:

Code:

bash-3.1$ echo frustrated_sed_user@smtp1.mail.linux.com | rev | awk -F. '{print $1"."$2}'| rev
linux.com

Or you could use just awk:

Code:

bash-3.1$ echo frustrated_sed_user@smtp1.mail.linux.com | awk -F. '{print $(NF-1)"."$NF}'
linux.com

as MensaWater said earlier.

NF is the number of fields, so $NF is the last field, and $(NF-1) is the next to last field.

Whichever way you want to do it, there are so many ways.

ghostdog74 · 05-13-2009, 07:19 AM

Quote:

Originally Posted by H_TeXMeX_H

The only extra call is to 'rev'. You can use awk instead of cut:

Code:

bash-3.1$ echo frustrated_sed_user@smtp1.mail.linux.com | rev | awk -F. '{ print $1"."$2}'| rev
linux.com

Still the same thing: you have to call rev twice and awk once. See post #4 by MensaWater. That's the common way to get awk fields from the back.

Code:

# time echo frustrated_sed_user@smtp1.mail.linux.com |awk -F. '{print $(NF-1)"."$NF}'
linux.com

real    0m0.004s
user    0m0.004s
sys     0m0.000s

# time echo frustrated_sed_user@smtp1.mail.linux.com | rev | awk -F. '{ print $1"."$2}'| rev
linux.com

real    0m0.007s
user    0m0.004s
sys     0m0.004s

H_TeXMeX_H · 05-13-2009, 07:23 AM

Oh yeah, I guess I missed that post, and posted the same thing just a second ago. Oh well, as if 0.003 sec is that important.

ghostdog74 · 05-13-2009, 07:28 AM

Quote:

Originally Posted by H_TeXMeX_H

Whichever way you want to do it, there are so many ways.

Yes, there are many ways, but don't choose the less obvious ones.

MensaWater · 05-13-2009, 09:10 AM

My second post did it without having to "rev" anything. I just changed the field references to be relative to the number of fields rather than the explicit 3 and 4.