LinuxQuestions.org
Old 04-01-2005, 11:14 AM   #1
rebel
LQ Newbie
 
Registered: Aug 2003
Distribution: CentOS
Posts: 24

Rep: Reputation: 15
How to "Search & Replace" in html files using Perl?


Hi,

I am a Perl newbie, so please be gentle with me.


I wish to insert <base href=http://www.ab.com> into an HTML file (say index.html) using a Perl script or Unix command. How do I do that?

Any advice much appreciated.
 
Old 04-01-2005, 02:04 PM   #2
smannell
Member
 
Registered: Feb 2005
Location: Kansas City
Distribution: Kubuntu 8.04
Posts: 72

Rep: Reputation: 15
In Perl, you can open an existing file for reading, or appending; but to the best of my knowledge not for modifying. You'll have to make a new file, write what you want at the beginning, and then add the stuff from the old file. Finally you can rename the files. The nice part of doing it this way is you still have a copy of the original if you break things.

The basic steps are as follows (you'll have to modify the examples for your exact case):

Open a new file for writing
open(OUT, '>', $outfile) || die "Cannot create file $outfile: $!";
where $outfile is a variable containing the filename, complete with path

Open the existing file you wish to modify
open(FILE, '<', $file) || die "Cannot open file $file: $!";
where $file is a variable containing the name of the file you wish to read from

Write the line you want to the new file
print OUT "some text goes here\n";

Loop through the lines of the old file writing them to the new one
while ($line = <FILE>) {
    print OUT $line;    # $line already ends with a newline, so don't add another
}

Close the file handles
close(FILE) || die "Can't close file $file: $!\n";
close(OUT) || die "Can't close file $outfile: $!\n";

Rename the files
rename($file, "$file.bak") || die "Cannot rename $file: $!";
rename($outfile, $file) || die "Cannot rename $outfile: $!";

This may not be exactly what you need, and I haven't written a Perl script in a while, so there may be errors, but it will get you started. Also, there are examples of this on the web if you search for filehandles in Perl. Finally, a good Perl book is essential if you plan on writing scripts.
 
Old 04-01-2005, 02:07 PM   #3
smannell
I should read more carefully. If you want to use UNIX commands, "cat" will work for this. The man page is fairly straightforward. This will make for a much simpler script than using Perl.
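
A minimal sketch of the cat approach, assuming the goal is simply to put the tag on the first line of the file, as in the "write it at the beginning" steps from post #2. The sample file and the name index.html.new are made up for illustration. A <base> tag normally belongs inside <head>, so treat this as a demonstration of the mechanics only:

```shell
# Work in a scratch directory with a small sample file (illustrative only).
cd "$(mktemp -d)"
printf '%s\n' '<html>' '<head>' '</head>' '</html>' > index.html

# Write the new first line, then the original content, to a temporary file.
{
    printf '%s\n' '<base href=http://www.ab.com>'
    cat index.html
} > index.html.new

# Keep the original as a backup, then move the new file into place.
mv index.html index.html.bak
mv index.html.new index.html
```

Writing to a separate file first matters: redirecting straight into index.html would truncate it before cat gets to read it.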
 
Old 04-02-2005, 08:18 AM   #4
rebel
Quote:
Originally posted by smannell
In Perl, you can open an existing file for reading, or appending; but to the best of my knowledge not for modifying.
Thanks very much for your response smannell. Very much appreciated.

I toyed with appending and then substitution in a Perl script but to no avail.


With appending, it always adds to the last line in the file. <base href=http://www.abc.com> in the last line is pretty useless.

Your comment that appending can't modify a file finally made me realize that this is not the way to go.



With substitution, I don't know why but it just didn't work!

For example, I want to replace <head> with <head><base href=http://www.abc.com>
(which in effect adds the base href tag after the head tag)

s/<head>/<head><base href=http:\/\/www.abc.com>/g;

Nothing was changed at all.
 
Old 04-02-2005, 08:25 AM   #5
rebel
Quote:
Originally posted by smannell
I should read more carefully. If you want to use UNIX commands, "cat" will work for this. The man page is fairly straightforward. This will make for a much simpler script than using Perl.
For some reason, the "cat and sed" Unix command didn't work.


> cat index.html | sed -e 's/<head>/<head><base href=http:\/\/www.miningnews.net>/'
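
One possible reason, assuming the command was expected to change index.html itself: sed writes its result to standard output and leaves the input file untouched. A sketch that captures the output in a temporary file and then renames it (the sample file and the name index.html.new are made up for illustration):

```shell
# Work in a scratch directory with a small sample file (illustrative only).
cd "$(mktemp -d)"
printf '%s\n' '<html>' '<head>' '<title>Test</title>' '</head>' '</html>' > index.html

# Using | as the s-command delimiter avoids escaping the slashes in the URL.
# sed reads index.html directly, so the extra cat is not needed.
# The rename only happens if sed succeeded.
sed -e 's|<head>|<head><base href=http://www.miningnews.net>|' index.html > index.html.new &&
    mv index.html.new index.html
```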

Last edited by rebel; 04-02-2005 at 08:27 AM.
 
Old 04-02-2005, 08:26 AM   #6
rebel
What finally did work is this Unix command:


> perl -pi -e 's/<head>/<head><base href=http:\/\/www.abc.com>/' index.html



Thanks again, smannell!

Last edited by rebel; 04-02-2005 at 08:34 AM.
 
Old 04-06-2005, 05:17 PM   #7
ivanatora
Member
 
Registered: Sep 2003
Location: Bulgaria
Distribution: Ubuntu 9.10, FreeBSD 7.2
Posts: 459

Rep: Reputation: 31
I see the thread is finished, but here is the easiest (imo) way
Code:
open (IN, $file) or die "$!";        # open the file for reading
@in = <IN>;                          # read all lines into an array
close(IN);                           # we don't need it anymore, so close it
foreach $line (@in) {                # for each line in the array
    # do the substitution; s/// accepts delimiters other than '/',
    # so with | you don't need to escape the /'s in the URL ;)
    $line =~ s|<head>|<head><base href=http://www.abv.com/>|;
}
open (OUT, "> $file") or die "$!";   # open the same file for writing (this truncates it)
print OUT @in;                       # dump the modified content into it
close(OUT);                          # and close it
It's not difficult, and you don't have to mix bash commands, Perl functions, and other stuff. Hope I helped.

Last edited by ivanatora; 04-06-2005 at 05:18 PM.
 
Old 04-07-2005, 08:11 PM   #8
rebel
Hi ivanatora,

Thanks for your response.

Do we leave IN and OUT in the Perl script, or replace them with the filenames? How do we run the Perl script if we use IN and OUT?
 
Old 04-09-2005, 12:58 PM   #9
ivanatora
IN and OUT are just filehandles. What you need to do is set $file before the other lines (I forgot to do that). For example:
$file = "/home/ivanatora/bleh/index.html";
 
  


All times are GMT -5. The time now is 09:53 PM.
