LinuxQuestions.org
Old 01-08-2011, 06:42 PM   #1
TheCrow33
Member
 
Registered: Aug 2009
Posts: 81

Rep: Reputation: 8
Regex help


I'm currently working on a website for my employer, and trying to use a tool called Rereplacer to make my life easier. Anyway the idea is I'm going to create one article with a few keywords like PRODUCTNAME that will be replaced based on the value held in the <title> tags.

Rereplacer just uses regular expressions to replace given strings with other strings. I'm fairly certain I can take just the text within the <title> tags and replace the word PRODUCTNAME with it. Does anyone know how that could be done?

Just for simplicity's sake, let's assume this is the format of the answer I'm looking for:

To replace: "regexReplaceStr"
Replace with: "replacementStr"

Thanks for any help.

P.S. I wouldn't normally ask such a skiddie question, but my boss is quite impatient and if I didn't I'd be looking for a new job.
 
Old 01-08-2011, 07:48 PM   #2
AlucardZero
Senior Member
 
Registered: May 2006
Location: USA
Distribution: Debian
Posts: 4,824

Rep: Reputation: 615
This doesn't need a regular expression, and I have no clue how Rereplacer is expecting input.

If I was using sed it would be: sed "s/<title>regexReplaceStr/<title>replacementStr/g"

Also not sure what you mean by skiddie (script kiddie) .. you DDoSing someone?
 
Old 01-08-2011, 08:16 PM   #3
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 948
First match on Google search for rereplacer is the Joomla ReReplacer description, complete with links to examples and regular expression cheatsheets; there's even a link to the forum, with a separate category for ReReplacer. Is this the software you're using? If so, why did you ask here?

There's even the inverse problem (changing title based on text in a div) in a post the developer responded to. Adapted for your question, the search pattern would be
Code:
(<title>)(.*)(</title>.*)PRODUCTNAME
and the replacement
Code:
\1\2\3\2
with 'Search area' set to 'Everywhere'. This should replace PRODUCTNAME everywhere with whatever you have in your "title" element.
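To try the pattern outside ReReplacer, here is a minimal sketch using Python's re module. The sample page and the product name "Widget 3000" are made up for illustration, and re.DOTALL is an assumption on my part: it lets "." cross line breaks, which ReReplacer may or may not do by default.

```python
import re

# Search pattern and replacement exactly as given in the post.
# re.DOTALL (an assumption here) lets "." match newlines too.
PATTERN = re.compile(r"(<title>)(.*)(</title>.*)PRODUCTNAME", re.DOTALL)
REPLACEMENT = r"\1\2\3\2"

# Hypothetical page used only for this demonstration.
page = (
    "<html><head><title>Widget 3000</title></head>\n"
    "<body><p>Buy the PRODUCTNAME today.</p></body></html>"
)

result = PATTERN.sub(REPLACEMENT, page)
print(result)
```

The title element is left untouched, and the keyword in the body is replaced with the title text.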
Nominal Animal

Last edited by Nominal Animal; 03-21-2011 at 01:42 AM.
 
Old 01-09-2011, 09:44 AM   #4
TheCrow33
Member
 
Registered: Aug 2009
Posts: 81

Original Poster
Rep: Reputation: 8
Thank you, Nominal Animal. I had indeed found that post on the ReReplacer website, which is the only way I knew this could be done with a regular expression. The only reason I did not post on their website is that it seemed to me that just signing up for their forums costs money (they ask for a "billing address" in the registration process). And I figured you guys would be quicker to respond, and free. Anyway after I found that post I tried to manipulate that regex to do what I wanted, but like I said I don't have much knowledge of regular expressions and couldn't successfully do it. And quite frankly that regex confused the hell out of me.

AlucardZero: no, I'm not DDoSing anyone, and quite frankly that seems a stupid question. The original question was skiddie-like (yes, script kiddie) because I was asking for someone else to do the work and hand me an answer. I also did not expect you to know how ReReplacer expects input; that is why I gave the format in which I expected an answer.
 
Old 01-09-2011, 11:21 AM   #5
AlucardZero
Senior Member
 
Registered: May 2006
Location: USA
Distribution: Debian
Posts: 4,824

Rep: Reputation: 615
We have a different definition of skiddie then.
 
Old 01-09-2011, 01:25 PM   #6
TheCrow33
Member
 
Registered: Aug 2009
Posts: 81

Original Poster
Rep: Reputation: 8
Quote:
Originally Posted by AlucardZero
We have a different definition of skiddie then.
I don't think it's our definitions that differ, but rather I was referring to that question being skiddie'ish in nature. I would expect any skiddie to ask for an answer handed to them on a silver platter with no explanation of how it works (as I did in this post), and without doing any actual work to get a working solution. That's all I was referring to.
 
Old 01-09-2011, 01:59 PM   #7
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 948
Quote:
Originally Posted by TheCrow33
Anyway after I found that post I tried to manipulate that regex to do what I wanted, but like I said I don't have much knowledge of regular expressions and couldn't successfully do it. And quite frankly that regex confused the hell out of me.
Well, it would have saved time if you had told us that. Let me explain how it works, by splitting into pieces. But first, some tips:
  • Characters listed in square brackets [ ] are alternatives. Ranges like A-Z are supported. If the first character is a caret, ^, the set is inverted: any character except those listed matches.
  • Period . matches any character. (In many engines it does not match a newline unless a "dot matches newline" option is turned on.)
  • Asterisk * means any number (zero or more) of the preceding character or subexpression.
  • Question mark ? means zero or one of the preceding character or subexpression.
  • Parentheses () define a subexpression which can be referred to in the replacement. First subexpression is referred to as \1, second \2 and so on.
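  • Subexpressions can be nested. In PCRE-style engines every parenthesized subexpression is numbered by the position of its opening parenthesis, so nested ones take a number too and shift the numbers of those that follow.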
  • Within subexpressions, vertical bar | separates alternative subexpressions. For example, (abc|def) matches either abc or def.
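The tips above can be checked quickly in any regex engine. Here is a small sketch in Python's re module; the test strings are arbitrary, and other engines may differ in details such as how "." treats newlines.

```python
import re

# [A-C] is a set of alternatives; a leading ^ inverts the set.
in_set = bool(re.fullmatch(r"[A-C]x", "Bx"))
not_in_set = bool(re.fullmatch(r"[^A-C]x", "Bx"))

# . stands for a single character (Python excludes newline by default).
dot = bool(re.fullmatch(r"a.c", "a-c"))

# * allows zero or more of the preceding item; ? allows zero or one.
star_zero = bool(re.fullmatch(r"ab*", "a"))
star_many = bool(re.fullmatch(r"ab*", "abbb"))
optional = bool(re.fullmatch(r"colou?r", "color"))

# | separates alternatives inside a subexpression.
either = bool(re.fullmatch(r"(abc|def)", "def"))

# \1 and \2 in the replacement refer to the first and second subexpression.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")

print(in_set, not_in_set, dot, star_zero, star_many, optional, either)
print(swapped)
```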
Let's examine the expression (<title>)(.*)(</title>.*)PRODUCTNAME:
  • The first subexpression, (<title>) starts the match with the title tag: <title>. Since it's also the first subexpression in parentheses, we can refer to it in the replacement as \1.
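    If you wish to be careful, you can use (<[Tt][Ii][Tt][Ll][Ee](>|[\n\t\v\f\r ][^>]*>)) instead, to match uppercase title tags and title tags with attributes.
    (Note the inner subexpression: it will match either an immediate >, or a whitespace character followed by anything up to the first >. In PCRE-style engines this inner subexpression is numbered too, as \2, which shifts the later references by one; writing it as a non-capturing group, (?:>|[\n\t\v\f\r ][^>]*>), keeps the numbering unchanged.)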
  • The second subexpression, (.*) matches anything (or nothing). It is a greedy match, so it will contain anything up to but not including the last match of the following subexpression. If there were no following subexpression or characters, it would match till the end of the document.
  • The third subexpression, (</title>.*) is the tricky one. Not only does it match the title end tag, but also everything up to but not including the last match of the following subexpression. (Again, you might wish to use (</[Tt][Ii][Tt][Ll][Ee]>.*) instead.)
  • Finally, there is the target identifier, which we wish to replace: PRODUCTNAME .
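The more careful variants of the first and third subexpressions can be combined as sketched below in Python. Two things here are my assumptions rather than anything ReReplacer documents: re.DOTALL so that "." crosses line breaks, and a non-capturing group (?:...) for the inner alternative, because Python numbers nested capturing groups and the replacement \1\2\3\2 would otherwise point at the wrong pieces. The sample page is invented.

```python
import re

# Case-tolerant start tag that also allows attributes. The inner
# alternative is non-capturing so the title text stays in group 2.
CAREFUL = re.compile(
    r"(<[Tt][Ii][Tt][Ll][Ee](?:>|[\n\t\v\f\r ][^>]*>))"  # \1: start tag
    r"(.*)"                                              # \2: title text
    r"(</[Tt][Ii][Tt][Ll][Ee]>.*)"                       # \3: end tag + rest
    r"PRODUCTNAME",
    re.DOTALL,
)

# Hypothetical page with an uppercase title tag carrying an attribute.
page = '<TITLE lang="en">Widget 3000</TITLE>\n<p>About PRODUCTNAME.</p>'

result = CAREFUL.sub(r"\1\2\3\2", page)
print(result)
```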
The way this works is quite simple. The match starts at the beginning of the title element, and contains everything up to the end of the string to be replaced. (Specifically, the third subexpression will contain most of your HTML document.)
ReReplacer will replace each match with \1\2\3\2: the title start tag (\1), the title text (\2), then the title end tag plus most of the HTML document up to but not including the string to be replaced (\3), and finally the title text again (\2) in place of PRODUCTNAME.
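To see what each numbered piece actually holds, this sketch in Python prints the three captured subexpressions and then rebuilds the replacement by hand. The page content is invented, and re.DOTALL is again an assumption about how "." should behave.

```python
import re

# Hypothetical one-line page for illustration.
page = "<title>Widget 3000</title><p>The PRODUCTNAME is new.</p>"

match = re.search(r"(<title>)(.*)(</title>.*)PRODUCTNAME", page, re.DOTALL)
start_tag, title, rest = match.groups()

print(repr(start_tag))  # \1: the title start tag
print(repr(title))      # \2: the title text
print(repr(rest))       # \3: end tag plus everything before the keyword

# \1\2\3\2 spelled out: the keyword itself is dropped and the
# title text is written in its place.
rebuilt = start_tag + title + rest + title
final = page[:match.start()] + rebuilt + page[match.end():]
print(final)
```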

I don't know if ReReplacer applies this repeatedly or not. If you find that only the last occurrence of PRODUCTNAME is replaced with the title text, you simply need to copy this multiple times (to apply it multiple times), once for each possible occurrence of PRODUCTNAME.
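The "only the last occurrence" behaviour is easy to reproduce. In this Python sketch a single pass replaces just the final PRODUCTNAME, and a small loop repeats the substitution until nothing changes. The helper name replace_all and the max_passes cap are my own additions for illustration, not anything ReReplacer provides.

```python
import re

PATTERN = re.compile(r"(<title>)(.*)(</title>.*)PRODUCTNAME", re.DOTALL)
REPLACEMENT = r"\1\2\3\2"

def replace_all(page, max_passes=100):
    """Apply the substitution repeatedly until no keyword is left.

    max_passes is a safety cap (hypothetical) in case the title itself
    contains the keyword, which would otherwise never terminate.
    """
    for _ in range(max_passes):
        page, count = PATTERN.subn(REPLACEMENT, page)
        if count == 0:
            break
    return page

# Hypothetical page with two occurrences of the keyword.
page = "<title>Widget</title><p>PRODUCTNAME and PRODUCTNAME</p>"

one_pass = PATTERN.sub(REPLACEMENT, page)  # greedy .* reaches the last one
all_passes = replace_all(page)
print(one_pass)
print(all_passes)
```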

Note that PRODUCTNAME itself should never occur in the title. If it does, you'll get rather interesting but unwanted results.
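If the pages are generated by a script, a quick check for this case is cheap. The sketch below is Python; title_is_safe is a hypothetical helper, and it uses a lazy match .*? (not used elsewhere in this post) so that it stops at the first title end tag.

```python
import re

def title_is_safe(page, placeholder="PRODUCTNAME"):
    """Return True if the page has a title that does not contain the keyword."""
    match = re.search(r"<title>(.*?)</title>", page, re.DOTALL)
    return match is not None and placeholder not in match.group(1)

# Hypothetical pages for illustration.
good = "<title>Widget</title><p>PRODUCTNAME</p>"
bad = "<title>PRODUCTNAME Deluxe</title><p>PRODUCTNAME</p>"

print(title_is_safe(good), title_is_safe(bad))
```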

Hope this helps,
Nominal Animal

Last edited by Nominal Animal; 03-21-2011 at 02:00 AM.
 
  

