LinuxQuestions.org
Old 11-10-2009, 05:57 AM   #1
acider
LQ Newbie
 
Registered: Nov 2009
Posts: 3

Rep: Reputation: 0
how to replace lots of text


Hi everyone, I am new to Linux and unable to solve a problem. I have an HTML file and need to replace its text regularly using a cron job.

For example, the HTML file has

<font face=arial>Just a test.</font>
<font face=arial>This is a test.</font>
<font face=arial>This is just a test.</font>
<font face=arial>This is also a test.</font>
<font face=arial>This is only a test.</font>

I need it to be replaced with the following at regular intervals:

<font face=arial>just a test.</font>
<font face=arial>this is a test.</font>
<font face=arial><b>this is just a test.</b></font>
<font face=arial><b>this is also a test.</b></font>
<font face=arial>this is only a test.</font>

What would be the easiest way to have a shell script replace everything from <font face=arial>J*only a test.</font> to a new chunk of text?

I was looking at sed examples, but it seems to be good at changing only a few words. I am also concerned about the file having special characters, e.g. <>?$&, and I am not sure whether sed can handle replacing words and characters together.

Pls help...
 
Old 11-10-2009, 07:03 AM   #2
CmdoColin
Member
 
Registered: Jul 2009
Posts: 31

Rep: Reputation: 17
sed is a stream editor; you could possibly use that to modify the file. You can give it multiple expressions on one command line, for example:

A file called "stuff.txt" with:

I have two CD's for music
I have one CD for games
I have three CD's for a new distro

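sed -e 's/CD/DVD/g' -e 's/distro/monkey/g' stuff.txt

Will replace the word CD with DVD and distro with monkey. sed prints the result to standard output and leaves "stuff.txt" itself untouched (GNU sed needs the -i option to edit the file in place), so the output is: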

I have two DVD's for music
I have one DVD for games
I have three DVD's for a new monkey

You can also act only on lines containing something:

sed -e '/three/s/new/naughty/g' stuff.txt

I have two CD's for music
I have one CD for games
I have three CD's for a naughty distro

Or you can put the lot together for:

sed -e '/three/s/new/naughty/g' -e 's/CD/DVD/g' -e 's/distro/monkey/g' stuff.txt

I have two DVD's for music
I have one DVD for games
I have three DVD's for a naughty monkey

Depending on what you want to achieve, could this help? sed is pretty powerful, and there's a lot more to it. It is non-interactive. awk is possibly something else to use; it's a little more complex, though. Definitely two things to look at.
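On the worry about special characters in post #1, here is a minimal sketch, assuming a small sample file (the name page.html and its contents are made up for illustration). It shows that sed accepts a delimiter other than "/" for the s command, so the "/" in </font> needs no escaping, and that a literal "&" in the replacement is written as "\&".

```shell
#!/bin/sh
# Hypothetical sample file; the name page.html is an assumption for illustration.
workdir=$(mktemp -d)
cat > "$workdir/page.html" <<'EOF'
<font face=arial>This is just a test.</font>
<font face=arial>This is only a test.</font>
EOF

# '|' is used as the s-command delimiter, so '/' is an ordinary character.
# '<' and '>' are ordinary characters too; a literal '.' is written '\.'.
# In the replacement part, a bare '&' means "the matched text",
# so a literal ampersand has to be written as '\&'.
result=$(sed \
    -e 's|<font face=arial>This is just a test\.</font>|<font face=arial><b>this is just a test.</b></font>|' \
    -e 's|only a test|one \& only test|' \
    "$workdir/page.html")

printf '%s\n' "$result"
rm -rf "$workdir"
```

The first expression swaps a whole line of markup, the second shows the ampersand escape. Neither touches the file on disk; the result only goes to standard output.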

Last edited by CmdoColin; 11-10-2009 at 07:19 AM. Reason: sorry mong moment - just did a basic sed example
 
Old 11-10-2009, 04:30 PM   #3
themanwhowas
Member
 
Registered: Nov 2005
Distribution: CentOS 5, Fedora 23
Posts: 216

Rep: Reputation: 29
Or you can use Perl's file-handling abilities to take line x, do whatever you need to it, then recreate the whole file using the modified line(s). sed is probably the way to go, but it's a bit complicated.
 
Old 11-10-2009, 08:05 PM   #4
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 7,518

Rep: Reputation: 2390
There are several "power tools" which can be applied here.
  1. sed is the "stream editor" which accepts a single stream of input, applies some filter or edit to it, and passes the result as its stream of output.
  2. awk is a tool that is designed for processing more complicated files. Here you have a series of conditions that awk is to test for, along with rules that are to be applied to those records which meet each condition.
  3. Perl is a full-fledged programming language that is especially suited to text manipulation. (And, well, just about anything and everything else that you could possibly think of...)
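To make the comparison of the three tools concrete, here is a rough sketch that performs one small edit with each of them. It reuses the stuff.txt sample from post #2; the task itself (change CD to DVD only on lines containing "three") is just an illustration, not something the original poster asked for.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cat > "$workdir/stuff.txt" <<'EOF'
I have two CD's for music
I have one CD for games
I have three CD's for a new distro
EOF

# sed: an address (/three/) followed by an edit command.
sed_out=$(sed '/three/s/CD/DVD/' "$workdir/stuff.txt")

# awk: a condition (/three/) with a rule, then a rule that prints every record.
awk_out=$(awk '/three/ { sub(/CD/, "DVD") } { print }' "$workdir/stuff.txt")

# Perl: -p loops over the lines and prints each one after the code has run.
perl_out=$(perl -pe 's/CD/DVD/ if /three/' "$workdir/stuff.txt")

printf 'sed:\n%s\n\nawk:\n%s\n\nperl:\n%s\n' "$sed_out" "$awk_out" "$perl_out"
rm -rf "$workdir"
```

All three should print the same three lines, with only the last one changed. For an edit this small the choice is mostly a matter of taste; the differences show up once the edit spans several lines.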
 
Old 11-10-2009, 09:18 PM   #5
acider
LQ Newbie
 
Registered: Nov 2009
Posts: 3

Original Poster
Rep: Reputation: 0
Thanks for the replies.

I guess the sed examples won't work. It becomes too difficult to edit because there is a lot of text.

Ideally the perl or shell script should do the following:

1) check example.html file and look for this part

<html>*</h1>

The asterisk means everything that appears from <html> to </h1>. It does not matter what text is inside it, because this whole part will be changed to something else. Getting it to recognise the correct part (from <html> to </h1>) is what matters.

2) replace the above part with something else, for example

<html>
<head>
<title>this is new title</title>
<script></script>
</head>
<body>
<h1>this is new text and is only a test</h1>

3) Text information in example.html is updated with the new text information from <html> to </h1>

Getting sed to look for lines won't work because the lines in example.html keep changing. As long as it could replace everything from <html> to </h1>, job is done.

What is the best way to tackle the above?

Last edited by acider; 11-10-2009 at 09:22 PM.
 
Old 11-11-2009, 12:38 AM   #6
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,260

Rep: Reputation: 2328
I'd definitely go with Perl for that.
http://perldoc.perl.org/
http://www.perlmonks.org/?node=Tutorials
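One possible Perl approach to the request in post #5, as a minimal sketch rather than a finished script. It assumes the replacement text is kept in a separate file (new_head.txt is a made-up name), that the page is small enough to read into memory in one go, and that only the first <html> ... </h1> stretch should be replaced. The sample example.html contents are invented for the demonstration.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Invented sample page standing in for the real example.html.
cat > "$workdir/example.html" <<'EOF'
<html>
<head>
<title>old title</title>
</head>
<body>
<h1>old heading</h1>
<p>The rest of the page stays as it is.</p>
</body>
</html>
EOF

# Hypothetical file holding the new text from <html> to </h1>.
cat > "$workdir/new_head.txt" <<'EOF'
<html>
<head>
<title>this is new title</title>
<script></script>
</head>
<body>
<h1>this is new text and is only a test</h1>
EOF

# -0777 reads the whole input as one string, -p prints it after the edit.
# The /s flag lets "." match newlines; ".*?" stops at the first </h1>.
# $new is interpolated as plain text, so characters such as & or $ in the
# replacement file are not treated specially.
NEW_HEAD_FILE="$workdir/new_head.txt" perl -0777 -pe '
    BEGIN {
        open my $fh, "<", $ENV{NEW_HEAD_FILE} or die "cannot open replacement: $!";
        local $/;              # slurp the replacement file in one read
        $new = <$fh>;
        $new =~ s/\n+\z//;     # drop trailing newlines; the page keeps its own
    }
    s{<html>.*?</h1>}{$new}s;
' "$workdir/example.html" > "$workdir/example.html.tmp" \
    && mv "$workdir/example.html.tmp" "$workdir/example.html"

result=$(cat "$workdir/example.html")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Writing to a temporary file and renaming it means the page is only swapped once the edit has succeeded. Saved as a script, this could be run from cron with an entry such as "0 * * * * /path/to/script.sh" (the path is a placeholder), which would run it at the top of every hour.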
 
Old 11-11-2009, 12:42 AM   #7
acider
LQ Newbie
 
Registered: Nov 2009
Posts: 3

Original Poster
Rep: Reputation: 0
Wow, that's too much for me to understand.

All I need is an example of a working script.
 
Old 11-11-2009, 12:51 AM   #8
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Code:
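awk -F">" '{
  # With ">" as the field separator, $1 is the opening tag (minus its ">")
  # and $2 is the text plus "</font" (minus its ">").
  # Lower-case the first character of $2 and glue the pieces back together.
  s=$1">" tolower(substr($2,1,1)) substr($2,2)">"
  print s
}' file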
output
Code:
$ ./shell.sh
<font face=arial>just a test.</font>
<font face=arial>this is a test.</font>
<font face=arial>this is just a test.</font>
<font face=arial>this is also a test.</font>
<font face=arial>this is only a test.</font>
You did not say how the <b> tags come about, so I will leave it at that.
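awk can also handle the block replacement asked for in post #5. The following is a sketch under a few assumptions: the replacement text lives in a separate file (new_head.txt is a made-up name), <html> and </h1> each sit on a line of their own, and anything else on those two lines can be discarded. The sample page is invented for the demonstration.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Invented sample page standing in for the real example.html.
cat > "$workdir/example.html" <<'EOF'
<html>
<head>
<title>old title</title>
</head>
<body>
<h1>old heading</h1>
<p>The rest of the page stays as it is.</p>
</body>
</html>
EOF

# Hypothetical file holding the new text from <html> to </h1>.
cat > "$workdir/new_head.txt" <<'EOF'
<html>
<head>
<title>this is new title</title>
<script></script>
</head>
<body>
<h1>this is new text and is only a test</h1>
EOF

result=$(awk -v newfile="$workdir/new_head.txt" '
    # At the first line containing <html>, print the replacement file
    # and start skipping the old lines.
    /<html>/ && !done {
        while ((getline line < newfile) > 0) print line
        close(newfile)
        skipping = 1
    }
    # While skipping, drop every line up to and including the one with </h1>.
    skipping {
        if ($0 ~ /<\/h1>/) { skipping = 0; done = 1 }
        next
    }
    # Everything after the replaced block passes through unchanged.
    { print }
' "$workdir/example.html")

printf '%s\n' "$result"
rm -rf "$workdir"
```

Because this works line by line, it does not depend on line numbers, only on where the two marker lines appear. To update the file itself, the output would have to be redirected to a temporary file and then moved over example.html.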
 
All times are GMT -5.