
jim.thornton 10-12-2012 11:38 AM

I need help searching for values in a file.
I am in the process of migrating a site. The site was running Joomla 1.0.x and there were about 84 pages of content. The site had defined a custom HTML Title for each page and it was saved into a db with the following structure:


INSERT INTO `jos_content` (`title`, `attribs`) VALUES
The `attribs` field contains a whole bunch of parameters in a format like this:


The attribute I'm looking for is: html_title

I have created the regex that will find the value that I want out of each entry:

However, I can't figure out how to parse the data. Currently I have exported those two fields only into a .sql file. So I don't currently have it in a mysql database.

Could someone please help me come up with some code that would create a text file with a list as follows:
title: html_title (with html_title being the only value extracted out of attribs)

I would appreciate any help I can get please.

sinu_nayak2001 10-12-2012 03:08 PM

Have you tried the 'cut' command, or awk?

jim.thornton 10-12-2012 03:54 PM

There are 84 instances of the title field and the attribs field.

I'm not sure how to create a script that will read the file and loop through each instance and then extract what I need.

grail 10-13-2012 12:19 AM

Hey Jim ... could you supply an example with a few lines and also have them include the required data (unlike your current example)?

This does seem like a fairly easy task but we need to better understand the data if we are to assist with a solution.

jim.thornton 10-13-2012 01:19 AM

Thanks for the reply. Don't worry about it. Usually my questions get answered in a few minutes because of the amount of traffic on this forum so when I went a lot of the day without a response, I figured one wasn't coming.

As a result I ended up doing it semi-manually. I opened the SQL file in a text editor that supports regular expressions. I then created a regex to find everything before html_title and replaced it with nothing (essentially deleting it) and then I created a regex to find everything after html_title and replaced it with nothing. I then just copy/pasted the results to where I needed them.
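
For anyone who would rather script the same two-step idea, here is a minimal Python sketch. The thread never shows real rows or the actual regex, so everything about the data here is an assumption: the sample rows are made up, each row is assumed to look like ('title', 'attribs') with no escaped quotes inside either field, and the attribs parameters are assumed to be key=value pairs separated by a newline (either a real one or the two characters backslash-n). The names SAMPLE_SQL, ROW_RE, HTML_TITLE_RE and extract_titles are hypothetical.

```python
import re

# Hypothetical sample data: the thread does not include real rows, so this
# layout is an assumption made purely for illustration.
SAMPLE_SQL = r"""
INSERT INTO `jos_content` (`title`, `attribs`) VALUES
('About Us', 'pageclass_sfx=\nhtml_title=About Our Company\nback_button='),
('Contact', 'pageclass_sfx=\nhtml_title=Get in Touch\nback_button=');
"""

# One row: ('title', 'attribs'). Assumes neither field contains a quote.
ROW_RE = re.compile(r"\(\s*'([^']*)'\s*,\s*'([^']*)'\s*\)")

# The html_title value runs up to a literal backslash-n, a real newline,
# or the end of the attribs field, whichever comes first.
HTML_TITLE_RE = re.compile(r"html_title=(.*?)(?:\\n|\n|$)")


def extract_titles(sql_text):
    """Return 'title: html_title' lines for every row that has an html_title."""
    lines = []
    for title, attribs in ROW_RE.findall(sql_text):
        match = HTML_TITLE_RE.search(attribs)
        if match:
            lines.append(f"{title}: {match.group(1)}")
    return lines


if __name__ == "__main__":
    print("\n".join(extract_titles(SAMPLE_SQL)))
```

To run it against the exported file instead of the sample, read the file's contents with open() and pass the text to extract_titles, then redirect the script's output to a text file. If the real export escapes quotes or lays out the attribs differently, both patterns would need adjusting.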

grail 10-13-2012 03:50 AM

Well I am not sure about others, but as the replies prior to mine were 0400 you might see why some of us had not seen the question yet :)

If you are happy with your solution then by all means mark it as SOLVED, but should you like an alternative for the future I would still like to see some data ;)

jim.thornton 10-13-2012 07:39 AM

Not sure what you mean by 0400???

As for the solution, I'll just mark it solved because I think the way I did it was easier than writing a script in the end anyway.

grail 10-13-2012 09:09 AM

0400 -- 4 a.m. :)

jim.thornton 10-13-2012 05:59 PM

Thanks... I thought you were talking permissions or something. As for the time, no worries, I wasn't complaining. It's just that most of my questions are answered within an hour but this one took about 8 or 9 hours. Hey... I'm happy to get any help, so I'm not complaining at all.

All times are GMT -5.