LinuxQuestions.org


shahgols 03-31-2009 01:11 PM

I need a quote for this please
 
Hi everyone,

I have several hundred .html files that have a mistake that I would like to fix. Can someone please help me with the code to do the change in all the files at once? I would hate to spend hours doing the change manually. Thank you.

So, as an example, how would you change the following in all the files in the same directory?

Before:
<a href="http://www.googl.com">Google</a>

After:
<a href="http://www.google.com">Google</a>

Thanks a million.

shahgols 03-31-2009 01:17 PM

I just realized that I put "quote" in the subject; I meant "code". I really don't want to change hundreds of files manually; I can barely type a correct subject line. :o)

Nylex 03-31-2009 01:23 PM

This may work (you are advised to test it yourself before running it on real files, though):

Code:

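for file in `ls *`
do
  sed -i 's/googl\.com/google.com/g' "$file"
done

It should replace every instance of googl.com with google.com in all the files in the output from ls *. Matching googl.com rather than just googl keeps links that are already correct from turning into "googlee", and the g flag covers lines with more than one occurrence.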
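
One cautious way to do that testing, as a rough sketch: try the command on a scratch copy first and let sed keep a backup of every file it edits. The scratch directory and the file name page.html are made up for the demo, and the pattern is written as googl\.com with the g flag so that a link that is already correct is left alone and every occurrence on a line is fixed.

```shell
#!/bin/sh
# Scratch directory with one sample file (names are hypothetical).
demo=$(mktemp -d)
printf '%s\n' '<a href="http://www.googl.com">Google</a>' > "$demo/page.html"

# -i.bak edits the file in place and keeps the original as page.html.bak.
sed -i.bak 's/googl\.com/google.com/g' "$demo/page.html"

# Show what changed before trusting the command on the real files.
# diff exits non-zero when the files differ, hence the "|| true".
diff "$demo/page.html.bak" "$demo/page.html" || true

fixed=$(cat "$demo/page.html")
echo "$fixed"
```

If the diff looks right, the same sed command can be pointed at the real files; the .bak copies can be deleted once everything checks out.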

custangro 03-31-2009 02:18 PM

Quote:

Originally Posted by Nylex (Post 3493981)
This may work (you are advised to test it yourself before running it on real files, though):

Code:

for file in `ls *`
do
  sed -i 's/googl\.com/google.com/g' "$file"
done

It should replace every instance of googl.com with google.com in all the files in the output from ls *.

If it's only the .html files, you can restrict the loop to them:

Code:

for file in $(ls -1 *.html)
do
  sed -i 's/googl\.com/google.com/g' "$file"
done

-C
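
A variation that skips ls altogether, offered as a sketch rather than the definitive way: the shell can expand *.html by itself, and that also copes with file names containing spaces, which the output of ls split into words does not. The scratch directory and the two file names are invented for the demo, and sed -i without a suffix assumes GNU sed.

```shell
#!/bin/sh
# Scratch directory with two sample files (names are hypothetical).
demo=$(mktemp -d)
printf '%s\n' '<a href="http://www.googl.com">Google</a>' > "$demo/my page.html"
printf '%s\n' '<a href="http://www.google.com">Google</a>' > "$demo/ok.html"

# Let the shell expand the glob; quoting "$file" keeps names with spaces intact.
for file in "$demo"/*.html
do
  [ -e "$file" ] || continue   # no match: the glob is left as a literal string
  sed -i 's/googl\.com/google.com/g' "$file"
done

# Count lines that still contain the misspelled host name.
bad_left=$(cat "$demo"/*.html | grep -c 'googl\.com' || true)
echo "lines still wrong: $bad_left"
```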

shahgols 03-31-2009 02:36 PM

Thanks a lot to both of you. I'll test these out when I get home tonight.

shahgols 03-31-2009 05:00 PM

I have one more request...been searching for a solution to this on the net, but couldn't find a solution that matched my problem exactly.

The problem is that I have a few hundred files in the same directory. I would like to change a part of their names. So I would like to change:

My-Super-File-Name.html

To:

My-Great-File-Name.html

Can you please tell me how this is done? Thanks in advance.

Telemachos 03-31-2009 05:29 PM

Quote:

Originally Posted by shahgols (Post 3494192)
I have one more request...been searching for a solution to this on the net, but couldn't find a solution that matched my problem exactly.

The problem is that I have a few hundred files in the same directory. I would like to change a part of their names. So I would like to change:

My-Super-File-Name.html

To:

My-Great-File-Name.html

Can you please tell me how this is done? Thanks in advance.

If you are using Debian or a Debian derivative, the rename command takes Perl regular expressions and can do this pretty easily:
Code:

rename 's/Super/Great/' *
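
On systems where rename is missing or takes different arguments, a plain loop with mv is one possible fallback. This is only a sketch: the scratch directory and the extra Other.html file are invented for the demo, and sed is used here just to compute the new name.

```shell
#!/bin/sh
# Scratch directory with sample files (names other than the Super one are hypothetical).
demo=$(mktemp -d)
touch "$demo/My-Super-File-Name.html" "$demo/Other.html"

for old in "$demo"/*Super*.html
do
  [ -e "$old" ] || continue                            # nothing matched the glob
  name=${old##*/}                                      # file name without the directory part
  new=$(printf '%s\n' "$name" | sed 's/Super/Great/')  # first "Super" becomes "Great"
  mv -- "$old" "$demo/$new"
done

ls "$demo"
```

Putting echo in front of mv for a first run prints the planned renames without touching any files.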

shahgols 03-31-2009 05:47 PM

Wonderful, thank you. I am on Vector Linux, and it seems that rename works differently here:

rename Super Great *.html

Thanks again.

shahgols 03-31-2009 05:57 PM

ARRRGHHHH, the original code doesn't work. I guess the example that I gave was not exactly right. Here's the exact code that I want to change inside my file:

<h1><a href="#">The one and only</a></h1>

To:

<h1>The one and only</h1>

I guess the slashes are throwing sed off, because I get an error. How do I do this? Thanks again.

Robhogg 03-31-2009 06:16 PM

Quote:

Originally Posted by shahgols (Post 3494241)
I guess the slashes are throwing sed off, because I get an error. How do I do this? Thanks again.

If you use forward slashes as the delimiters in sed and slashes also appear in the pattern, you need to escape them with a backslash. Alternatively, you can use almost any other character as the delimiter (just choose one that doesn't occur in the pattern), e.g.:

Code:

sed -i 's/\(<h1>\)\(<a[^>]*>\)\([^<]*\)\(<\/a>\)\(<\/h1>\)/\1\3\5/' "$file"

Or:

Code:

sed -i 's!\(<h1>\)\(<a[^>]*>\)\([^<]*\)\(</a>\)\(</h1>\)!\1\3\5!' "$file"

Here, exclamation marks are used as the delimiters, so the slashes in </a> and </h1> do not need to be escaped. Each \(...\) defines a sub-pattern (group), which the replacement then replays with \1 (first sub-pattern), \3 (third sub-pattern), etc.
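
To see the two delimiter styles side by side without touching any files, the sample line can be piped through sed. This is a trimmed-down sketch rather than the exact commands above: only the link text is captured in a group, and the <h1> tags are written out literally in the replacement.

```shell
#!/bin/sh
line='<h1><a href="#">The one and only</a></h1>'

# Slash delimiters: every / inside the pattern and replacement must be escaped.
with_slashes=$(printf '%s\n' "$line" | sed 's/<h1><a[^>]*>\([^<]*\)<\/a><\/h1>/<h1>\1<\/h1>/')

# ! delimiters: the slashes in </a> and </h1> can stay as they are.
with_bangs=$(printf '%s\n' "$line" | sed 's!<h1><a[^>]*>\([^<]*\)</a></h1>!<h1>\1</h1>!')

echo "$with_slashes"
echo "$with_bangs"
```

Both commands print <h1>The one and only</h1>; the second is simply easier on the eyes.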

shahgols 03-31-2009 07:03 PM

Thank you Rob, but to be honest, to me, this is like Chinese mixed in with some English. What should I be studying to learn this stuff? Bash programming?

Anyhow, I ran the following command and got an error.

Code:

root:# for file in 'ls *.html'
> do
> sed -i 's!\(<h1>\)\(<a[^>]*>\)\([^<]*\)\(</a>\)\(</h1>\)!\1\3\5!' $file
> done
sed: can't read ls: No such file or directory

Any idea what I need to do to fix this?

Robhogg 03-31-2009 07:47 PM

I know what you mean - I often refer to these patterns as an explosion in a punctuation factory. They are called regular expressions, and consist of literal characters and metacharacters. The pattern means:
Code:

!             Delimiter beginning search pattern
\(<h1>\)      Group 1: literal <h1>
\(<a[^>]*>\)  Group 2: <a then 0 or more characters that are not >'s, then >
\([^<]*\)     Group 3: Any number of characters that are not <'s
\(</a>\)      Group 4: literal </a>
\(</h1>\)     Group 5: literal </h1>
!             End of search pattern, beginning of replace pattern
\1            Replay group 1
\3            Replay group 3
\5            Replay group 5
!             End of replace pattern

Some stuff on sed and regular expressions from the Linux documentation project.

The reason you got the error message is that you used single quotes ('...') rather than backticks (`...`) around the ls command in the first line, which is very easy to do. Backticks are on the key to the left of the 1 on a US/UK keyboard, but I would tend to use $(ls *.html) instead: $(...) does the same thing as the backticks, but is a lot easier to read.
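
A small sketch of the difference, using an invented scratch directory with two empty files: with single quotes the loop sees one literal string, while backticks and $(...) both run ls and loop over its output.

```shell
#!/bin/sh
# Scratch directory with two sample files (names are hypothetical).
demo=$(mktemp -d)
touch "$demo/a.html" "$demo/b.html"
cd "$demo" || exit 1

# Single quotes: the loop runs once, with the literal text "ls *.html".
count_single=0
for file in 'ls *.html'; do count_single=$((count_single + 1)); done

# Backticks: ls is executed and the loop runs once per file name.
count_ticks=0
for file in `ls *.html`; do count_ticks=$((count_ticks + 1)); done

# $(...): same behaviour as backticks, just easier to read.
count_subst=0
for file in $(ls *.html); do count_subst=$((count_subst + 1)); done

echo "single quotes: $count_single, backticks: $count_ticks, \$(...): $count_subst"
```

In the failing command, sed received that single literal string unquoted, so the shell split it into ls and *.html, and sed complained that it could not read a file called ls.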

Edited to add a slight warning about the documentation on regular expressions: there are different forms. For instance, the link above tells you that parentheses, ( ), enclose a group. In sed's default syntax, however, escaped parentheses, \( \), are used.
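
The two forms can be compared directly, as a sketch: this assumes a sed that accepts -E for extended regular expressions (current GNU and BSD versions do; older GNU sed documented it as -r). The shortened pattern here only strips the <a> wrapper and is not the full command from earlier in the thread.

```shell
#!/bin/sh
line='<h1><a href="#">The one and only</a></h1>'

# Basic regular expressions (sed's default): groups are written \( \).
bre=$(printf '%s\n' "$line" | sed 's!<a[^>]*>\([^<]*\)</a>!\1!')

# Extended regular expressions (-E): groups are written with plain ( ).
ere=$(printf '%s\n' "$line" | sed -E 's!<a[^>]*>([^<]*)</a>!\1!')

echo "$bre"
echo "$ere"
```

Both variables end up holding the same text; only the way the group is written differs.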

shahgols 03-31-2009 08:10 PM

Wow, thank you so much Rob for helping me learn this stuff.

I'll try the command again shortly. Right now I got to get some dinner. Peace and thanks again.

custangro 03-31-2009 11:18 PM

Quote:

Originally Posted by shahgols (Post 3494294)
Thank you Rob, but to be honest, to me, this is like Chinese mixed in with some English. What should I be studying to learn this stuff? Bash programming?

Anyhow, I ran the following command and got an error.

Code:

root:# for file in 'ls *.html'
> do
> sed -i 's!\(<h1>\)\(<a[^>]*>\)\([^<]*\)\(</a>\)\(</h1>\)!\1\3\5!' $file
> done
sed: can't read ls: No such file or directory

Any idea what I need to do to fix this?

I see the problem...you put ' instead of `

This is why I always use $(command) instead of `command`: it's easier to read (besides, backticks are generally regarded as the older, legacy form).

Try it this way...
Code:

root:# for file in $(ls *.html)
> do
> sed -i 's!\(<h1>\)\(<a[^>]*>\)\([^<]*\)\(</a>\)\(</h1>\)!\1\3\5!' $file
> done

-C

shahgols 04-01-2009 12:01 PM

Thank you so much everyone, this worked and saved me HOURS of time! So great, thank you!

shahgols 04-01-2009 12:05 PM

Quote:

Originally Posted by Robhogg (Post 3494329)
I know what you mean - I often refer to these patterns as an explosion in a punctuation factory. They are called regular expressions, and consist of literal characters and metacharacters. The pattern means:
Code:

!             Delimiter beginning search pattern
\(<h1>\)      Group 1: literal <h1>
\(<a[^>]*>\)  Group 2: <a then 0 or more characters that are not >'s, then >
\([^<]*\)     Group 3: Any number of characters that are not <'s
\(</a>\)      Group 4: literal </a>
\(</h1>\)     Group 5: literal </h1>
!             End of search pattern, beginning of replace pattern
\1            Replay group 1
\3            Replay group 3
\5            Replay group 5
!             End of replace pattern

Some stuff on sed and regular expressions from the Linux documentation project.

The reason you got the error message is that you used single quotes ('...') rather than backticks (`...`) around the ls command in the first line, which is very easy to do. Backticks are on the key to the left of the 1 on a US/UK keyboard, but I would tend to use $(ls *.html) instead: $(...) does the same thing as the backticks, but is a lot easier to read.

Edited to add a slight warning about the documentation on regular expressions: there are different forms. For instance, the link above tells you that parentheses, ( ), enclose a group. In sed's default syntax, however, escaped parentheses, \( \), are used.

Rob, I have to thank you again for the explanation above, it all makes sense now. Thank you.

chrism01 04-02-2009 07:07 AM

Just a small FYI: each language or tool that processes regexes uses its own internal regex engine, and these engines differ from one another to a greater or lesser degree.
See, for example, http://regex.info/
In other words, regex 'incantations' may or may not(!) be transferable... YHBW (you have been warned)...
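
A quick illustration with grep, as a sketch under the assumption of a POSIX-style grep: the same pattern text means different things in the basic and the extended syntax.

```shell
#!/bin/sh
text='aaa'

# Basic syntax (grep's default): + is an ordinary character, so 'a+' needs a literal plus sign.
basic=$(printf '%s\n' "$text" | grep -c 'a+' || true)

# Extended syntax (grep -E): + means "one or more of the preceding item".
extended=$(printf '%s\n' "$text" | grep -Ec 'a+' || true)

echo "basic: $basic matching line(s), extended: $extended matching line(s)"
```

The "|| true" is only there because grep exits with a non-zero status when nothing matches.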

Robhogg 04-02-2009 02:32 PM

Quote:

Originally Posted by shahgols (Post 3494349)
Wow, thank you so much Rob for helping me learn this stuff.

No problem - "glad to be of service!", as a Sirius Cybernetics Corporation door might say :)


All times are GMT -5.