LinuxQuestions.org
Old 11-24-2009, 11:28 AM   #1
ludo33
Member
 
Registered: Feb 2009
Posts: 119

Rep: Reputation: 16
Extract Data using CURL


Hello folks,

I have a list of terms in a MySQL database. I need to extract the page contents from Wikipedia for each term and write the result to the MySQL table. I've been told that cURL is the way to go about this, but this is all new to me. Can anybody suggest any resources to help with this?

Thanks in advance.
 
Old 11-25-2009, 03:59 AM   #2
centosboy
Senior Member
 
Registered: May 2009
Location: london
Distribution: centos5
Posts: 1,137

Rep: Reputation: 116
Quote:
Originally Posted by ludo33
Hello folks,

I have a list of terms in a MySQL database. I need to extract the page contents from Wikipedia for each term and write the result to the MySQL table. I've been told that cURL is the way to go about this, but this is all new to me. Can anybody suggest any resources to help with this?

Thanks in advance.
Because Wikipedia redirects some search terms (for example, depending on their capitalisation), curl has to be told to follow redirects with -L. You would need something like:

Code:
curl -L "http://en.wikipedia.org/wiki/<string>"
Here <string> stands for the term. This would output the contents of the page to the terminal, so you would need to redirect it somewhere.
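
As a rough sketch of how that command could be applied to a whole list of terms, the helpers below build the URL for each term and save each page to its own file. The function names (wiki_url, fetch_term, fetch_all) and the idea of a terms file with one term per line are assumptions made for illustration, not something from this thread. Only spaces are converted (to underscores); other special characters would need proper URL encoding.

```shell
#!/bin/sh
# Hypothetical helper: turn a term into a Wikipedia page URL.
# Spaces become underscores, which is how page titles appear in URLs.
wiki_url() {
    printf 'http://en.wikipedia.org/wiki/%s\n' "$(printf '%s' "$1" | tr ' ' '_')"
}

# Hypothetical helper: fetch one term and save the page as <term>.txt.
# -L follows redirects, -s hides the progress meter, -o names the output file.
fetch_term() {
    curl -sL -o "$(printf '%s' "$1" | tr ' ' '_').txt" "$(wiki_url "$1")"
}

# Hypothetical helper: fetch every term listed in a file, one term per line.
# Empty lines are skipped.
fetch_all() {
    while IFS= read -r term; do
        [ -n "$term" ] && fetch_term "$term"
    done < "$1"
}
```

Calling fetch_all terms.txt (terms.txt being an assumed file name) would then leave one .txt file per term in the current directory. Note that these files contain the raw HTML of each page, not plain text.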

Last edited by centosboy; 11-25-2009 at 04:00 AM.
 
Old 11-25-2009, 06:11 AM   #3
ludo33
Member
 
Registered: Feb 2009
Posts: 119

Original Poster
Rep: Reputation: 16
Thanks for that

Quote:
Originally Posted by centosboy
This would output the contents of the page to the terminal, so you would need to redirect it somewhere.
This is the bit I am stuck on: how do I write the result to a MySQL table?
 
Old 11-25-2009, 09:27 AM   #4
centosboy
Senior Member
 
Registered: May 2009
Location: london
Distribution: centos5
Posts: 1,137

Rep: Reputation: 116
Quote:
Originally Posted by ludo33
This is the bit I am stuck on: how do I write the result to a MySQL table?
I made a slight mistake here...

What I should have said is something like this:

Use mysqlimport to import the data back into a database.

If, for example, the file dumped from curl is called madonna.txt, then mysqlimport would import it into a table called madonna (assuming the table has already been created), because it takes the table name from the file name.

The madonna text file would need some sort of parsing beforehand, so that its columns are tab-separated.

Assuming this has all been done, then this could work:

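Code:
# -L follows redirects; the page is saved to madonna.txt
curl -L http://en.wikipedia.org/wiki/madonna > madonna.txt
# some file parsing code here.............(tab separated columns)
# the table name (madonna) is taken from the file name
mysqlimport --local -h host -pxxx dbname madonna.txt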
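
The parsing step is left open above. One possible sketch, assuming the target table has just two columns (the term and the page content), is to write a single tab-separated row per page and escape anything in the content that mysqlimport would otherwise treat as a column or row separator. The function name tsv_row and the two-column layout are assumptions for illustration. With mysqlimport's default settings, tab separates columns, newline ends a row, and backslash is the escape character.

```shell
#!/bin/sh
# Hypothetical helper: print one mysqlimport-ready row for a page.
#   $1 = the term, $2 = file holding the downloaded page
# Output: term, a tab, then the file contents with backslashes, tabs and
# newlines escaped as \\, \t and \n so the whole page stays in one column.
tsv_row() {
    tab=$(printf '\t')
    printf '%s\t' "$1"
    # sed doubles backslashes first, then turns literal tabs into \t;
    # awk joins the lines with a literal \n and ends the row with a newline.
    sed -e 's/\\/\\\\/g' -e "s/$tab/\\\\t/g" "$2" |
        awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 } END { print "" }'
}
```

For example, if curl saved the page as madonna.html, then tsv_row madonna madonna.html > madonna.txt would produce a file that the mysqlimport command above could load into the madonna table. Extracting only the article text from the HTML is a separate problem that this sketch does not attempt.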

Last edited by centosboy; 11-25-2009 at 09:52 AM.
 
Old 11-29-2009, 02:17 AM   #5
ludo33
Member
 
Registered: Feb 2009
Posts: 119

Original Poster
Rep: Reputation: 16
Many thanks for your help; all seems to be working fine.

Again, thanks for your informed reply!
 
  

