I need a quote for this please

shahgols · 03-31-2009, 01:11 PM

Hi everyone,

I have several hundred .html files that have a mistake that I would like to fix. Can someone please help me with the code to do the change in all the files at once? I would hate to spend hours doing the change manually. Thank you.

So, as an example, how would you change the following in all the files in the same directory?

Before:
<a href="http://www.googl.com">Google</a>

After:
<a href="http://www.google.com">Google</a>

Thanks a million.

shahgols · 03-31-2009, 01:17 PM

I just realized that I put "quote" in the subject; I meant "code". I really don't want to change hundreds of files manually; I can barely type a correct subject line.


Nylex · 03-31-2009, 01:23 PM

This may work (you are advised to test it yourself before running it on real files, though):

Code:

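for file in `ls *`
do
  sed -i 's/googl\.com/google.com/g' "$file"
done

It should replace all instances of googl.com with google.com in all the files in the output from ls *. Matching googl.com rather than just googl leaves links that are already correct alone, and the g flag catches more than one occurrence on a line.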
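One way to follow the "test it first" advice is to try the substitution on throwaway copies and keep backups. This is only a sketch: the temporary directory and the two sample file names are made up for the demonstration, and it assumes a sed that accepts a backup suffix attached to -i.

```shell
# Sketch only: the sample files below are hypothetical stand-ins for the real ones.
workdir=$(mktemp -d)
printf '%s\n' '<a href="http://www.googl.com">Google</a>' > "$workdir/page1.html"
printf '%s\n' '<a href="http://www.google.com">Google</a>' > "$workdir/page2.html"

for file in "$workdir"/*.html
do
  # -i.bak edits in place but keeps the untouched original as <name>.bak
  sed -i.bak 's/googl\.com/google.com/g' "$file"
done

# Compare each backup with the edited file; diff exits non-zero when they
# differ, so "|| true" keeps the loop going.
for file in "$workdir"/*.html
do
  diff "$file.bak" "$file" || true
done
```

If the diff shows only the changes you expected, the same loop can be pointed at the real directory, and the .bak files can be deleted once you are happy with the result.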

custangro · 03-31-2009, 02:18 PM

Quote:

Originally Posted by Nylex

This may work (you are advised to test it yourself before running it on real files, though):

Code:

for file in `ls *`
do
  sed -i 's/googl\.com/google.com/g' "$file"
done

It should replace all instances of googl.com with google.com in all the files in the output from ls *.

If it's only html files you can...

Code:

for file in $(ls -1 *.html)
do
  sed -i 's/googl\.com/google.com/g' "$file"
done
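A variation that may be worth knowing: the shell can expand *.html on its own, so ls is not strictly needed, and looping over the glob directly keeps file names with spaces in one piece. The sketch below uses a made-up temporary directory and made-up file names to show the idea.

```shell
# Sketch only: "my page.html" and "notes.txt" are hypothetical sample files.
workdir=$(mktemp -d)
printf '%s\n' '<a href="http://www.googl.com">Google</a>' > "$workdir/my page.html"
printf '%s\n' 'googl.com is left alone in non-html files' > "$workdir/notes.txt"

for file in "$workdir"/*.html
do
  # If no .html file exists the pattern stays literal, so skip it
  [ -e "$file" ] || continue
  sed -i.bak 's/googl\.com/google.com/g' "$file"
done
```

Here only "my page.html" is edited, and the space in its name causes no trouble because "$file" is quoted.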

-C

shahgols · 03-31-2009, 02:36 PM

Thanks a lot to both of you. I'll test these out when I get home tonight.

shahgols · 03-31-2009, 05:00 PM

I have one more request...been searching for a solution to this on the net, but couldn't find a solution that matched my problem exactly.

The problem is that I have a few hundred files in the same directory. I would like to change a part of their names. So I would like to change:

My-Super-File-Name.html

To:

My-Great-File-Name.html

Can you please tell me how this is done? Thanks in advance.

Telemachos · 03-31-2009, 05:29 PM

Quote:

Originally Posted by shahgols

I have one more request...been searching for a solution to this on the net, but couldn't find a solution that matched my problem exactly.

The problem is that I have a few hundred files in the same directory. I would like to change a part of their names. So I would like to change:

My-Super-File-Name.html

To:

My-Great-File-Name.html

Can you please tell me how this is done? Thanks in advance.

If you are using Debian or a Debian derivative, the rename command takes Perl regular expressions and can do this pretty easily:

Code:

rename 's/Super/Great/' *

shahgols · 03-31-2009, 05:47 PM

Wonderful, thank you. I am on Vector Linux and it seems like rename works differently here:

rename Super Great *.html

Thanks again.
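Since the rename command differs between distributions, a plain loop with mv is one way to get the same result without depending on either version. This is a sketch on made-up files in a temporary directory; the file names other than My-Super-File-Name.html are invented for the example.

```shell
# Sketch only: works on throwaway files in a temporary directory.
workdir=$(mktemp -d)
touch "$workdir/My-Super-File-Name.html" \
      "$workdir/Another-Super-Page.html" \
      "$workdir/index.html"

for file in "$workdir"/*Super*.html
do
  [ -e "$file" ] || continue
  name=${file##*/}                                      # strip the directory part
  new=$(printf '%s\n' "$name" | sed 's/Super/Great/')   # build the new name
  mv -- "$file" "$workdir/$new"
done
```

Files without "Super" in the name, such as index.html here, are never matched by the glob and stay as they are.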

shahgols · 03-31-2009, 05:57 PM

ARRRGHHHH, the original code doesn't work. I guess the example that I gave was not exactly right. Here's the exact code that I want to change inside my file:

<h1><a href="#">The one and only</a></h1>

To:

<h1>The one and only</h1>

I guess the slashes are throwing sed off, because I get an error. How do I do this? Thanks again.

Robhogg · 03-31-2009, 06:16 PM

Quote:

Originally Posted by shahgols

I guess the slashes are throwing sed off, because I get an error. How do I do this? Thanks again.

If you use forward-slashes as the delimiters in sed, and the slashes appear in the pattern as well, you need to escape them with a backslash, but you can use a large range of other delimiters (just choose one you don't need to use in the pattern) - e.g.:

Code:

sed -i 's/\(<h1>\)\(<a[^>]*>\)\([^<]*\)\(<\/a>\)\(<\/h1>\)/\1\3\5/' $file

Or:

Code:

sed -i 's!\(<h1>\)\(<a[^>]*>\)\([^<]*\)\(</a>\)\(</h1>\)!\1\3\5!' $file

Here, exclamation marks are used as the delimiters, so the backslashes in <\/a> and <\/h1> are not needed. The \(...\)'s define sub-patterns, which are then replayed by \1 (first sub-pattern), \3 (third sub-pattern), etc.
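To see both forms in action without touching any file, the heading from the question can be piped through sed. This is just a quick check; the variable names are made up for the example.

```shell
line='<h1><a href="#">The one and only</a></h1>'

# Slash delimiters: the slashes inside </a> and </h1> must be escaped
with_slashes=$(printf '%s\n' "$line" |
  sed 's/\(<h1>\)\(<a[^>]*>\)\([^<]*\)\(<\/a>\)\(<\/h1>\)/\1\3\5/')

# Exclamation-mark delimiters: no escaping of the slashes needed
with_bangs=$(printf '%s\n' "$line" |
  sed 's!\(<h1>\)\(<a[^>]*>\)\([^<]*\)\(</a>\)\(</h1>\)!\1\3\5!')

printf '%s\n' "$with_slashes" "$with_bangs"
# prints <h1>The one and only</h1> twice
```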

shahgols · 03-31-2009, 07:03 PM

Thank you Rob, but to be honest, to me, this is like Chinese mixed in with some English. What should I be studying to learn this stuff? Bash programming?

Anyhow, I ran the following command and got an error.

Code:

root:# for file in 'ls *.html'
> do
> sed -i 's!\(<h1>\)\(<a[^>]*>\)\([^<]*\)\(</a>\)\(</h1>\)!\1\3\5!' $file
> done
sed: can't read ls: No such file or directory

Any idea what I need to do to fix this?

Robhogg · 03-31-2009, 07:47 PM

I know what you mean - I often refer to these patterns as an explosion in a punctuation factory. They are called regular expressions, and consist of literal characters and metacharacters. The pattern means:

Code:

!             Delimiter beginning search pattern
\(<h1>\)      Group 1: literal <h1>
\(<a[^>]*>\)  Group 2: <a then 0 or more characters that are not >'s, then >
\([^<]*\)     Group 3: Any number of characters that are not <'s
\(</a>\)      Group 4: literal </a>
\(</h1>\)     Group 5: literal </h1>
!             End of search pattern, beginning of replace pattern
\1            Replay group 1
\3            Replay group 3
\5            Replay group 5
!             End of replace pattern

Some stuff on sed and regular expressions from the Linux documentation project.

The reason you got the error message is that you used single quotes - '...' - rather than backticks - `...` - around the ls command in the first line, which is very easy to do. Backticks are on the key to the left of the 1 on a US/UK keyboard, but I would tend to use $(ls *.html): the $(...) does the same thing as the backticks, but is a lot easier to read.
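The difference is easy to see by counting what each loop iterates over. In this sketch the temporary directory and the two empty .html files are made up for the demonstration.

```shell
# Sketch only: two hypothetical files in a throwaway directory.
workdir=$(mktemp -d)
touch "$workdir/a.html" "$workdir/b.html"
cd "$workdir" || exit 1

# Single quotes: the loop runs once, over the literal text "ls *.html"
single=0
for file in 'ls *.html'; do single=$((single + 1)); last_single=$file; done

# Command substitution: ls actually runs and the loop sees each file name
subst=0
for file in $(ls *.html); do subst=$((subst + 1)); done

printf 'single quotes: %s item, the literal text "%s"\n' "$single" "$last_single"
printf 'command substitution: %s items\n' "$subst"
```

With the single-quoted version, the unquoted $file is then split into the words ls and *.html, so sed is handed ls as a file name, which matches the "can't read ls" message.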

Edited to add: slight warning about the documentation on regular expressions - there are different forms. For instance, in the link above, it tells you parentheses -- ( ) -- enclose a group. However, in the sed version, escaped parentheses -- \( \) -- are used.
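As a side-by-side illustration of the two forms: many sed versions (GNU and BSD sed, as far as I know) accept -E to switch to extended regular expressions, where plain parentheses form the groups. The sketch assumes your sed supports -E; the variable names are made up.

```shell
line='<h1><a href="#">The one and only</a></h1>'

# Basic regular expressions (sed's default): groups are \( ... \)
basic=$(printf '%s\n' "$line" |
  sed 's!\(<h1>\)\(<a[^>]*>\)\([^<]*\)\(</a>\)\(</h1>\)!\1\3\5!')

# Extended regular expressions (-E): groups are plain ( ... )
extended=$(printf '%s\n' "$line" |
  sed -E 's!(<h1>)(<a[^>]*>)([^<]*)(</a>)(</h1>)!\1\3\5!')

printf '%s\n' "$basic" "$extended"
```

Both commands should give the same result; only the way the groups are written differs.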

shahgols · 03-31-2009, 08:10 PM

Wow, thank you so much Rob for helping me learn this stuff.

I'll try the command again shortly. Right now I got to get some dinner. Peace and thanks again.

custangro · 03-31-2009, 11:18 PM

Quote:

Originally Posted by shahgols

Thank you Rob, but to be honest, to me, this is like Chinese mixed in with some English. What should I be studying to learn this stuff? Bash programming?

Anyhow, I ran the following command and got an error.

Code:

root:# for file in 'ls *.html'
> do
> sed -i 's!\(<h1>\)\(<a[^>]*>\)\([^<]*\)\(</a>\)\(</h1>\)!\1\3\5!' $file
> done
sed: can't read ls: No such file or directory

Any idea what I need to do to fix this?

I see the problem...you put ' instead of `

This is why I always use $(command) instead of `command`: it's easier to read (besides...using ` is the older form and generally discouraged in new scripts...)

Try it this way...

Code:

root:# for file in $(ls *.html)
> do
> sed -i 's!\(<h1>\)\(<a[^>]*>\)\([^<]*\)\(</a>\)\(</h1>\)!\1\3\5!' $file
> done

-C

shahgols · 04-01-2009, 12:01 PM

Thank you so much everyone, this worked and saved me HOURS of time! So great, thank you!