basic sed help needed

davimint · 02-10-2008, 02:40 PM

Hi folks,
I hate to admit it, but I am just terrible with man and info pages so most
of the time I end up doing stupid stuff so I understand better what I
am doing.

I am wanting to learn more about sed and there is probably a good
tutorial out there for me but I just haven't found it yet. Maybe
because sed is such a great tool with a lot of abilities.

So, here's my stupid stuff and maybe someone can help me out.

I touched a file (something silly)

Code:

 www.any.is_a.jpt

Now, I want to get rid of any "." (simple enough)
or I want to get rid of all "." and "_" (simple enough)

Code:

bash-3.1$ ls
www.any.is_a.jpt
bash-3.1$ echo www.any.is_a.jpt | sed 's/[.]//'
wwwany.is_a.jpt
bash-3.1$ echo www.any.is_a.jpt | sed -e 's/[.]//g' -e 's/[_]//' 
wwwanyisajpt
bash-3.1$

But what I really wanted to do is remove just the first and second
"." (and the "_") from the file name, so it looks like this: "wwwanyisa.jpt"

Thanks in advance.

cmnorton · 02-10-2008, 03:50 PM

I would not be surprised if this problem could be solved using sed. However, if it were my choice, I would look for another tool. sed is very line oriented.

However, it might be possible to construct a regular expression that has everything wildcarded except each "." character. Then, using the substitute command, you might be able to replace each "." with nothing.

Sometimes, it's okay to use something else. awk might let you make every "." into its own field, depending again on how you format the regular expression.

If you search this link

http://www.grymoire.com/Unix/Sed.html#uh-1

for "to keep part of the pattern", your solution might be close.
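As a rough sketch of that "keep part of the pattern" idea (my own example, not taken from the linked tutorial): the substitution below matches one "." plus everything up to the last ".", puts back only the captured part, and loops until a single "." is left. The function name strip_inner_dots is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: delete every "." except the last one.
strip_inner_dots() {
  # :a defines a label. The s command matches one "." followed by a group
  # that greedily runs up to the final ".", and puts back only the group (\1),
  # so exactly one "." disappears per pass. ta jumps back to :a after a
  # successful substitution, so the loop stops once only one "." remains.
  printf '%s\n' "$1" | sed -e ':a' -e 's/\.\(.*\.\)/\1/' -e 'ta'
}

strip_inner_dots www.any.is_a.jpt   # prints wwwanyis_a.jpt
strip_inner_dots a.b.c.com          # prints abc.com
```

The "_" is left alone here; adding -e 's/_//g' at the end would remove it too.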

sed and awk are great. They are the right tools to use for the appropriate job, but don't be afraid to use something else, if you find it more appropriate. Perl was invented after sed and awk for a reason.

I use sed to add columns to a delimited data file that is going to be loaded into a database table; to convert "," delimiters to "|" delimiters; and to remove the last character in a line. sed is perfect in these instances, because it is clean, terse, and makes sense for my use. However, when I need C, awk, or Perl, I use those too.

davimint · 02-10-2008, 04:51 PM

Thanks

I understand what you're saying, I just have a hard time understanding
what is best for each situation. Just doing silly things like this
really helps me understand and I wanted to get a good understanding
of the older "basic" stuff like bash, sed, & awk, before I tried to
use the more extensive tools.

Anyway, this little exercise did help me and I found that the
following will work although I do agree it's kind of crazy because
I'm actually stripping out the "." and then the next "." to accomplish
what maybe "Perl" would do without evaluating the code twice.

Here is how I accomplished this with sed.

Code:

bash-3.1$ ls
www.any.is_a.jpt
bash-3.1$ echo www.any.is_a.jpt | sed -e 's/[.]//1' -e 's/[.]//1'
wwwanyis_a.jpt
bash-3.1$ echo www.any.is_a.jpt | sed -e 's/[.]//1' -e 's/[.]//1' -e 's/[_]//g'
wwwanyisa.jpt
bash-3.1$

Again, thanks

makyo · 02-10-2008, 08:54 PM

Hi.

Here is one solution that has sed at the core, but relies on other commands as well:

Code:

#!/bin/bash -

# @(#) s1       Demonstrate elimination of characters except last.

echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1) rev sed

SOURCE="www.any.is_a.jpt a.b.c.com d.....e.net"

for NAME in $SOURCE
do
  echo
  echo " Converting \"$NAME\" to:"

  echo "$NAME" |
  rev |
  sed "s|[.]||2g" |
  rev
done

exit 0

Producing:

Code:

% ./s1
(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash 2.05b.0
rev (local) - no version provided.
GNU sed version 4.1.2

 Converting "www.any.is_a.jpt" to:
wwwanyis_a.jpt

 Converting "a.b.c.com" to:
abc.com

 Converting "d.....e.net" to:
de.net

See man or info pages for details ... cheers, makyo
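
The "2g" flag combination is what does the work in the script above: in GNU sed, a number followed by g means "replace the Nth match and every match after it", so on the reversed string the first "." (originally the last one) survives. Here is a stripped-down sketch, assuming GNU sed and a rev command are available; POSIX leaves the number-plus-g combination unspecified and does not include rev. The function name keep_last_dot is only illustrative.

```shell
#!/bin/sh
# Hypothetical helper: reverse the name, delete the 2nd and later "." with
# GNU sed's "2g" flags, then reverse it back so only the last "." is kept.
keep_last_dot() {
  printf '%s\n' "$1" | rev | sed 's/[.]//2g' | rev
}

keep_last_dot www.any.is_a.jpt   # prints wwwanyis_a.jpt
keep_last_dot d.....e.net        # prints de.net
```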

ghostdog74 · 02-10-2008, 10:00 PM

Code:

# a=www.any.is_a.jpt
# printf "%s.%s\n" "$(echo "${a%.*}" | sed 's/[._]//g')" "${a##*.}"  # bash's own substitution would also work, without sed
wwwanyisa.jpt

in awk, one way

Code:

# echo "$a" | awk 'BEGIN{FS="[._]"} {for(i=1;i<NF;i++) printf "%s", $i; print "." $NF}'
wwwanyisa.jpt

cmnorton · 02-11-2008, 08:23 AM

Thanks for posting this. I completely forgot you can pass the output of sed back into sed.
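
Strictly speaking, nothing is fed back into sed there: each -e expression is applied in order to the same line, so the second 's/[.]//1' sees whatever the first one left behind. Here is a small sketch comparing that with an actual pipeline of separate sed processes (the variable names are only for illustration); both should give the same result.

```shell
#!/bin/sh
name=www.any.is_a.jpt

# One sed process: the expressions run in sequence on the same pattern space.
chained=$(printf '%s\n' "$name" | sed -e 's/[.]//' -e 's/[.]//' -e 's/_//g')

# Three sed processes: each one reads the previous one's output.
piped=$(printf '%s\n' "$name" | sed 's/[.]//' | sed 's/[.]//' | sed 's/_//g')

echo "$chained"   # prints wwwanyisa.jpt
echo "$piped"     # prints wwwanyisa.jpt
```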

radoulov · 02-11-2008, 09:06 AM

Or:

Code:

$ s=www.any.is_a.jpt _s="${s%.*}" s_="${s##*.}"
$ echo "${_s//[_.]}.${s_}"
wwwanyisa.jpt
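
One thing to watch with the expansions above: when the name contains no "." at all, both ${s%.*} and ${s##*.} return the whole string, so the name would come out doubled. Here is a hedged sketch of a wrapper that checks for that case first, assuming bash; the function name squash_name is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper built on the parameter expansions shown above.
squash_name() {
  local s=$1
  case $s in
    *.*) ;;                                   # at least one ".": handled below
    *)   printf '%s\n' "${s//_/}"; return ;;  # no ".": just drop underscores
  esac
  local stem=${s%.*} ext=${s##*.}             # text before / after the last "."
  printf '%s.%s\n' "${stem//[_.]/}" "$ext"
}

squash_name www.any.is_a.jpt   # prints wwwanyisa.jpt
squash_name no_dots_here       # prints nodotshere
```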

With zsh:

Code:

% s=www.any.is_a.jpt
% print ${${s%.*}//[_.]}.$s:e
wwwanyisa.jpt