LinuxQuestions.org
Old 02-25-2008, 11:46 AM   #31
jettachamp26
Member
 
Registered: Feb 2008
Location: Florida
Distribution: ubuntu
Posts: 30

Original Poster
Rep: Reputation: 15

That font check did the trick. I'm going to leave out the ^ in case there happens to be a font tag stuck somewhere in the middle of the file.


As for the redundant anchor tags, I might just have to deal with it and go back later and clean it up manually. It doesn't hurt my code; it just doesn't need to be there and is sloppy.

Thanks again!
 
Old 02-25-2008, 12:59 PM   #32
jettachamp26
Member
 
Registered: Feb 2008
Location: Florida
Distribution: ubuntu
Posts: 30

Original Poster
Rep: Reputation: 15
Found another little gremlin in my code. When I remove the <SPAN></SPAN> tags, any other tag or text sitting between them on the same line gets removed as well.

Example code:
Code:
<P STYLE="font-weight: medium"><SPAN STYLE="font-style: normal"><SPAN STYLE="font-weight: medium">After</SPAN></SPAN><I><SPAN STYLE="font-weight: medium">
The Lost Colony, </SPAN></I>
My replacement commands:

Code:
sed -i 's/<I>/<em>/g' "$f"
sed -i 's/<\/I>/<\/em>/g' "$f"
sed -i 's/<SPAN STYLE=".*">//g' "$f"
sed -i 's/<\/SPAN>//g' "$f"

**EDIT**

It's hard to see, but what I mean is that in the first code set, the word After gets removed when the span tags are removed. The <I> tag gets removed as well.

**EDIT**
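
For anyone following along, here is a small sketch that reproduces the symptom. The sample line is an assumption: it is a shortened copy of the example above, and the variable names are made up for illustration. It pipes the text through sed instead of using -i, so no file is touched.

```shell
# Hypothetical one-line sample, shortened from the example in this post.
line='<SPAN STYLE="font-style: normal"><SPAN STYLE="font-weight: medium">After</SPAN></SPAN><I><SPAN STYLE="font-weight: medium">'

# .* is greedy: it runs to the LAST "> on the line, not the first one,
# so everything from the first <SPAN to the end of the line is one match.
greedy=$(printf '%s\n' "$line" | sed 's/<SPAN STYLE=".*">//g')

# The brackets make an empty result visible.
printf 'result: [%s]\n' "$greedy"
```

On this sample the single match swallows the word After and the <I> tag along with the span tags, which is the behaviour described above.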

Last edited by jettachamp26; 02-25-2008 at 01:13 PM.
 
Old 02-25-2008, 01:38 PM   #33
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2374
Hi,

This part: <SPAN STYLE=".*"> is not unique enough. Looking at the example line, I doubt this could be done with a 'simple' regexp; there's not enough uniqueness to create one that isn't greedy.
 
Old 02-25-2008, 01:47 PM   #34
jettachamp26
Member
 
Registered: Feb 2008
Location: Florida
Distribution: ubuntu
Posts: 30

Original Poster
Rep: Reputation: 15
Looks like the <P tag in the example is greedy too. Man, this stinks.
 
Old 02-25-2008, 02:53 PM   #35
jettachamp26
Member
 
Registered: Feb 2008
Location: Florida
Distribution: ubuntu
Posts: 30

Original Poster
Rep: Reputation: 15
Figured out a solution to the problem, for both SPAN and P, to stop them from being so stinking greedy.

sed -i 's/<P[^>]*>/<P>/g' "$f"
sed -i 's/<SPAN[^>]*>//g' "$f"

By using [^>]* instead of .*, the pattern matches any run of characters except >, so the match stops at the end of its own tag instead of somewhere later down the line.
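
A quick way to check this without touching a file is to run the commands on a sample string. This is only a sketch: the sample is the earlier example joined onto one line (an assumption, since the original is split across two lines), the variable names are made up, and the five substitutions are combined into one sed call with -e purely for convenience.

```shell
# Hypothetical sample: the earlier example, joined onto a single line.
line='<P STYLE="font-weight: medium"><SPAN STYLE="font-style: normal"><SPAN STYLE="font-weight: medium">After</SPAN></SPAN><I><SPAN STYLE="font-weight: medium">The Lost Colony, </SPAN></I>'

# Same substitutions as in the thread, applied in order by one sed process.
# [^>]* cannot cross a >, so each match ends at its own tag's closing bracket.
fixed=$(printf '%s\n' "$line" | sed \
  -e 's/<I>/<em>/g' \
  -e 's/<\/I>/<\/em>/g' \
  -e 's/<P[^>]*>/<P>/g' \
  -e 's/<SPAN[^>]*>//g' \
  -e 's/<\/SPAN>//g')

printf '%s\n' "$fixed"
# prints: <P>After<em>The Lost Colony, </em>
```

Two caveats: <P[^>]*> would also match other tags whose names start with P, such as <PRE>, and because sed works one line at a time, a tag whose attributes wrap onto the next line will not be matched.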
 
Old 02-25-2008, 07:00 PM   #36
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.6, Centos 5.10
Posts: 16,324

Rep: Reputation: 2041
Perl XML recommendations: http://www.perlfoundation.org/perl5/...ed_xml_modules
See also : http://www.perlfoundation.org/perl5/...d_cpan_modules
and the motherlode is here: search.cpan.org
For highly informed opinions, see the gurus at www.perlmonks.org
 
  

