LinuxQuestions.org
Old 09-21-2007, 12:56 PM   #1
Nimoy
Member
 
Registered: Jun 2003
Location: Currently Denmark
Distribution: Ubuntu 11.10
Posts: 334

Rep: Reputation: 30
SED - minor changes work - Larger doesn't (working and non working code included)


Hi there, I am changing strings to other strings in all files in a directory.

Here is an example of what works:

CHANGE 1: - The visible copyright notice.

for file in *.html
do
cp "$file" "$file.bak" &&
sed 's/Copyright 1999-2005 - All rights reserved/Copyright 1999-2007 - All rights reserved/g' "$file.bak" >"$file"
done


CHANGE 2: - The internal copyright notice

(In change 2 I chose # as the delimiter to keep sed from being confused.)

for file in *.html
do
cp "$file" "$file.bak" &&
sed 's#<meta name="copyright" content="Copyright 1999-2005 by Fire Flower Cybernetics. All rights reserved.">#<meta name="copyright" content="Copyright 1999-2007, Fire Flower Cybernetics. All rights reserved.">#g' "$file.bak" >"$file"
done

CHANGE 3 - AND THIS IS WHERE THINGS GO WRONG: - The Google ads insertion.

I keep ending up with blank files or files where the ad doesn't show.

If I insert the same text manually it works. Any ideas?

for file in *.html
do
cp $file $file.bak &&
sed 's#</form>Make a difference - Make a donation!<br>

</td>

</tr>

</tbody>

</table>

<br>
#</form>Make a difference - Make a donation!<br>

</td>

</tr>

</tbody>

</table>

<br>
<script type="text/javascript"><!--
google_ad_client = "pub-5045815486985038";
google_ad_width = 728;
google_ad_height = 90;
google_ad_format = "728x90_as";
google_ad_type = "text";
//2007-08-14: globabilityaug2007setup
google_ad_channel = "5631073777";
google_color_border = "000000";
google_color_bg = "FFFFFF";
google_color_link = "0000FF";
google_color_text = "000000";
google_color_url = "008000";
//-->
</script>
<script type="text/javascript"
src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
</script><br><br>#"g' $file.bak >$file
done

Thanks in advance!
 
Old 09-21-2007, 01:47 PM   #2
jozyba
Member
 
Registered: Sep 2007
Distribution: Debian Etch, Lenny, Lenny/Sid
Posts: 31

Rep: Reputation: 15
At the end of your third example you've got a stray double-quote after the # delimiter:
Code:
</script><br><br>#"g' $file.bak >$file
..................^
 
Old 09-21-2007, 02:09 PM   #3
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,147

Rep: Reputation: 330
And, just to be pedantic, what's the point of the cp when sed will overwrite the file? A simple mv would be somewhat more efficient.
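
For what it's worth, a minimal sketch of a third option, assuming a sed that accepts -i with an attached suffix (GNU sed and BSD sed both do): sed can write the backup itself, so neither cp nor mv is needed. The temporary directory and the sample file are made up for the demonstration.

```shell
# Hypothetical scratch directory with one sample page in it.
workdir=$(mktemp -d)
printf 'Copyright 1999-2005 - All rights reserved\n' > "$workdir/index.html"

for file in "$workdir"/*.html; do
    # -i.bak edits the file in place and keeps the original as index.html.bak
    sed -i.bak 's/Copyright 1999-2005/Copyright 1999-2007/g' "$file"
done

cat "$workdir/index.html"       # edited copy
cat "$workdir/index.html.bak"   # untouched original
```

The backup still exists afterwards, so the safety net of the cp approach is kept.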
 
Old 09-21-2007, 04:18 PM   #4
Nimoy
Member
 
Registered: Jun 2003
Location: Currently Denmark
Distribution: Ubuntu 11.10
Posts: 334

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by PTrenholme View Post
And, just to be pedantic, what's the point of the cp when sed will overwrite the file? A simple mv would be somewhat more efficient.
As I have understood it, the cp will copy a backup file to .bak so you don't lose the original when sed is overwriting the original html...

Anyhow - Still trouble in paradise:

Won't work: Getting an error message:

sed: -e expression #1, char 46: unterminated `s' command

I'm going slightly mad
 
Old 09-21-2007, 04:27 PM   #5
Nimoy
Member
 
Registered: Jun 2003
Location: Currently Denmark
Distribution: Ubuntu 11.10
Posts: 334

Original Poster
Rep: Reputation: 30
File size

And the file sizes are zero.
 
Old 09-21-2007, 04:29 PM   #6
Nimoy
Member
 
Registered: Jun 2003
Location: Currently Denmark
Distribution: Ubuntu 11.10
Posts: 334

Original Poster
Rep: Reputation: 30
I also tried

I threw everything into textfile.sh and ran it via sh textfile.sh. The only change is that the character number in the "unterminated" message is one higher.
 
Old 09-21-2007, 05:47 PM   #7
jozyba
Member
 
Registered: Sep 2007
Distribution: Debian Etch, Lenny, Lenny/Sid
Posts: 31

Rep: Reputation: 15
I think you're just using the wrong tool for the job. sed is good for processing files one line at a time; anything more is pushing it beyond what it's designed for. The error message "sed: -e expression #1, char 46: unterminated `s' command" was complaining about the newline character at the end of the first line of your sed script.

I think you need to load the whole file contents into a variable and then search for and replace the substring. Here's a version using bash. It would probably be much faster written in Python, Perl or Ruby:

Code:
#! /bin/bash

substring='</form>Make a difference - Make a donation!<br>

</td>

</tr>

</tbody>

</table>

<br>
'

replacement='</form>Make a difference - Make a donation!<br>

</td>

</tr>

</tbody>

</table>

<br>
<script type="text/javascript"><!--
google_ad_client = "pub-5045815486985038";
google_ad_width = 728;
google_ad_height = 90;
google_ad_format = "728x90_as";
google_ad_type = "text";
//2007-08-14: globabilityaug2007setup
google_ad_channel = "5631073777";
google_color_border = "000000";
google_color_bg = "FFFFFF";
google_color_link = "0000FF";
google_color_text = "000000";
google_color_url = "008000";
//-->
</script>
<script type="text/javascript"
src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
</script><br><br>'

for file in *.html; do
    cp "$file" "$file.bak"         # alternatively use 'mv'
    file_contents="$(<"$file.bak")"
    echo "${file_contents//$substring/$replacement}" >$file
done
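
To illustrate the line-at-a-time point, here is a small sketch (the two-line input is invented for the demo). sed matches each input line on its own, so a pattern that spans a line break only has something to match once the lines have been joined in the pattern space, for example with N. Likewise, an unescaped line break typed inside an s command ends the command, which is what produces the "unterminated `s' command" message.

```shell
# Each input line is matched separately, so the two-line pattern never fires.
per_line=$(printf 'one\ntwo\n' | sed 's/one\ntwo/X/')

# N appends the next input line to the pattern space, so \n can match.
joined=$(printf 'one\ntwo\n' | sed 'N;s/one\ntwo/X/')

printf 'per-line result:\n%s\n' "$per_line"
printf 'joined result:\n%s\n' "$joined"
```

This works for a fixed two-line match, but it gets awkward quickly for a block as long as the one in this thread, which is why slurping the whole file is the easier route.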

Last edited by jozyba; 09-21-2007 at 07:45 PM.
 
Old 09-22-2007, 09:41 AM   #8
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,147

Rep: Reputation: 330
Quote:
Originally Posted by Nimoy View Post
As I have understood it, the cp will copy a backup file to .bak so you don't lose the original when sed is overwriting the original html...[snip]
Yes, cp will create a copy of the file, leaving the original file. On the other hand mv will rename the original file without the overhead of creating the copy. Since the original file is to be overwritten, by renaming the original you eliminate that overhead.

As to your problem, look into awk.
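
A rough sketch of what an awk approach could look like, not a finished program: it reads the whole file into one string and uses index() for a plain (non-regex) search, so nothing in the HTML needs escaping. The function name replace_block, the OLD/NEW environment variables and the sample data are all made up for illustration.

```shell
# Hypothetical helper: replace_block OLD NEW FILE prints FILE with every
# literal occurrence of OLD replaced by NEW (both may span several lines).
replace_block() {
    OLD=$1 NEW=$2 awk '
        { text = text $0 "\n" }            # slurp the whole file
        END {
            old = ENVIRON["OLD"]; repl = ENVIRON["NEW"]
            out = ""
            # index() is a plain string search, so no regex escaping is needed
            while (length(old) > 0 && (pos = index(text, old)) > 0) {
                out = out substr(text, 1, pos - 1) repl
                text = substr(text, pos + length(old))
            }
            printf "%s", out text
        }' "$3"
}

# Invented sample data standing in for one of the real pages.
demo=$(mktemp)
printf '</table>\n\n<br>\n<p>footer</p>\n' > "$demo"
old_block=$(printf '</table>\n\n<br>')
new_block=$(printf '</table>\n\n<br>\n<script>ad()</script>')

result=$(replace_block "$old_block" "$new_block" "$demo")
printf '%s\n' "$result"
```

The strings are passed through the environment rather than with -v because -v interprets backslash escapes and does not reliably accept embedded newlines.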

[edit]
I was looking at your site's source code, thinking I'd see if I could cobble together a simple awk program for you, and noticed that your html does not pass the W3C standards for html 4. (In fact, in the snippets you showed us, we see <br> instead of the expected standard construct: <br />.)

I also noticed that the "home page" included the "google" code, but that it did not seem to be working.
[/edit]

Last edited by PTrenholme; 09-22-2007 at 10:45 AM.
 
Old 09-22-2007, 11:02 AM   #9
Nimoy
Member
 
Registered: Jun 2003
Location: Currently Denmark
Distribution: Ubuntu 11.10
Posts: 334

Original Poster
Rep: Reputation: 30
Will be testing this in a mo - Keeping you posted

Thanks for the example!
 
Old 09-22-2007, 11:03 AM   #10
Nimoy
Member
 
Registered: Jun 2003
Location: Currently Denmark
Distribution: Ubuntu 11.10
Posts: 334

Original Poster
Rep: Reputation: 30
True

Quote:
Originally Posted by PTrenholme View Post
Yes, cp will create a copy of the file, leaving the original file. On the other hand mv will rename the original file without the overhead of creating the copy. Since the original file is to be overwritten, by renaming the original you eliminate that overhead.

As to your problem, look into awk.

[edit]
I was looking at your site's source code, thinking I'd see if I could cobble together a simple awk program for you, and noticed that your html does not pass the W3C standards for html 4. (In fact, in the snippets you showed us, we see <br> instead of the expected standard construct: <br />.)

I also noticed that the "home page" included the "google" code, but that it did not seem to be working.
[/edit]
Regarding the W3C Validation - It passed the validator tool last time I tinkered... Something I'll be looking into, thanks for the heads up.

Regarding the google code... odd, had people using both IE and FF test the bits that I had manually inserted.

As for the overhead - yup true, but the notion of having a backup is nice, right now the need for speed is not essential.
 
Old 09-22-2007, 12:45 PM   #11
Nimoy
Member
 
Registered: Jun 2003
Location: Currently Denmark
Distribution: Ubuntu 11.10
Posts: 334

Original Poster
Rep: Reputation: 30
50: Syntax error: Bad substitution

Is the response I get when running the script...

Line 50 is the following:

echo "${file_contents//$substring/$replacement}" >$file

Any ideas as to what might be wrong ?
 
Old 09-22-2007, 02:19 PM   #12
jozyba
Member
 
Registered: Sep 2007
Distribution: Debian Etch, Lenny, Lenny/Sid
Posts: 31

Rep: Reputation: 15
It works perfectly for me using bash 3.1.17. Either you've made some changes to the script above which have introduced a syntax error, or you're not using a bash shell. I see that you're using Ubuntu - you haven't got bash symlinked to something else have you? Try:
Code:
ls -l /bin/sh
ls -l /bin/bash
bash --version
 
Old 09-22-2007, 02:36 PM   #13
Nimoy
Member
 
Registered: Jun 2003
Location: Currently Denmark
Distribution: Ubuntu 11.10
Posts: 334

Original Poster
Rep: Reputation: 30
Results

Quote:
Originally Posted by jozyba View Post
It works perfectly for me using bash 3.1.17. Either you've made some changes to the script above which have introduced a syntax error, or you're not using a bash shell. I see that you're using Ubuntu - you haven't got bash symlinked to something else have you? Try:
Code:
ls -l /bin/sh
ls -l /bin/bash
bash --version
ls -l /bin/sh

gave me

lrwxrwxrwx 1 root root 4 2007-08-02 14:27 /bin/sh -> dash

and

ls -l /bin/bash
-rwxr-xr-x 1 root root 700560 2007-04-11 01:32 /bin/bash

and

bash --version

GNU bash, version 3.2.13(1)-release (i486-pc-linux-gnu)
Copyright (C) 2005 Free Software Foundation, Inc.

Regarding the script I did a copy and paste job from LQ. Saved the file in gedit and sh'ed the script.
 
Old 09-22-2007, 02:37 PM   #14
Nimoy
Member
 
Registered: Jun 2003
Location: Currently Denmark
Distribution: Ubuntu 11.10
Posts: 334

Original Poster
Rep: Reputation: 30
looks like a symlink

anything I can do ?
 
Old 09-22-2007, 02:52 PM   #15
jozyba
Member
 
Registered: Sep 2007
Distribution: Debian Etch, Lenny, Lenny/Sid
Posts: 31

Rep: Reputation: 15
On Ubuntu 'sh' is symlinked to 'dash', so when you call the script with 'sh myscript.sh' it ignores the '#!/bin/bash' at the head of the script and uses dash instead. dash does not support the ${variable//pattern/replacement} syntax in line 50 of your script, hence "Bad substitution".

The solution is either to run it with 'bash myscript.sh' or just to use 'chmod u+x myscript.sh' to make it executable, then call it with './myscript.sh'.
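
A quick way to see the difference for yourself, as a sketch only (the throwaway script and its contents are invented for the test): the same file is run once with bash and once with sh.

```shell
# Write a tiny script that uses the bash-only ${var//pattern/replacement} form.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/bash
text="a-b-c"
echo "${text//-/+}"
EOF

# Naming the interpreter explicitly bypasses the #! line altogether.
bash_out=$(bash "$script")
echo "bash: $bash_out"

# Where sh is dash this fails with "Bad substitution"; where sh is bash it works.
sh "$script" || echo "sh could not run the bash-only substitution"
```

Once the script is made executable and called as ./myscript.sh, the kernel reads the #! line and picks bash for you.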

Last edited by jozyba; 09-22-2007 at 03:09 PM.
 
  

