LinuxQuestions.org
Old 01-12-2003, 09:12 PM   #1
vexer
Member
 
Registered: Jan 2003
Location: Sudbury Ontario, Canada
Distribution: Slackware
Posts: 388

Rep: Reputation: 30
Search Bots


These days I'm too lazy to look for stuff on Google and whatnot, so what I wanted to know is how I would go about creating a search-engine search bot: basically, capturing all the links from different search engines that contain keywords I specify.


Anyone have any thoughts or possibly code examples?



(PS: This will in no way be used for malicious purposes such as flooding or any other type of script-kiddie activity. I'm pro grc.com)


-vex
 
Old 01-13-2003, 09:21 AM   #2
Mik
Senior Member
 
Registered: Dec 2001
Location: The Netherlands
Distribution: Ubuntu
Posts: 1,316

Rep: Reputation: 47
Retrieving the links returned by a search engine would probably be very easy. Most search engines let you put the search terms in a single URL, like:
http://www.google.com/linux?q=word1+word2
All you have to do then is parse the output. A simple script with wget and egrep could probably do most of that.
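
For illustration, here is a rough Python sketch of the same idea (build the query URL, fetch the page, pull out the links). It uses only the standard library instead of wget and egrep. The function and class names are made up for this example, and it assumes the engine accepts a plain `q=` parameter as in the URL above; real engines differ in parameter names and result markup, so treat it as a starting point rather than a finished bot.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen


def build_query_url(base_url, keywords):
    """Build a URL of the form base_url?q=word1+word2 (assumes a 'q' parameter)."""
    return base_url + "?" + urlencode({"q": " ".join(keywords)})


def fetch(url):
    """Download a page and return it as text (network access required)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag in an HTML page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page itself.
                self.links.append(urljoin(self.page_url, value))


def extract_links(html, page_url):
    """Return all link targets found in the given HTML text."""
    parser = LinkExtractor(page_url)
    parser.feed(html)
    return parser.links
```

Something like `extract_links(fetch(url), url)` with `url = build_query_url(...)` would then give you the raw list of links, including navigation links you would still have to filter out.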

Your biggest problem would be coming up with some kind of rating system that determines which links are useful and which aren't. Just because a page contains the word you were searching for doesn't mean the page is useful to you at all. A bot that returns 10,000 links from 10 search engines isn't gonna reduce your work if you still have to go through each link to see which ones happen to be useful.
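
As one possible starting point for such a rating system, here is a small Python sketch. The heuristic is an assumption made up for this example, not an established method: each link gets one point per keyword hit in its snippet text, plus one point for every engine that returned it, and duplicate URLs from different engines are merged. The `(engine, url, snippet)` input format is hypothetical too.

```python
import re
from collections import defaultdict


def score_text(text, keywords):
    """Count how often each keyword appears as a whole word in the text."""
    words = re.findall(r"\w+", text.lower())
    return sum(words.count(keyword.lower()) for keyword in keywords)


def rank_links(results, keywords):
    """Rank URLs from several engines.

    results: iterable of (engine, url, snippet) tuples (hypothetical format).
    Returns a list of (score, url), best first.
    """
    keyword_hits = defaultdict(int)
    engines = defaultdict(set)
    for engine, url, snippet in results:
        keyword_hits[url] += score_text(snippet, keywords)
        engines[url].add(engine)
    # Assumed heuristic: keyword hits plus one point per engine listing the URL.
    ranked = [(keyword_hits[url] + len(engines[url]), url) for url in keyword_hits]
    # Sort by descending score, then by URL so the order is deterministic.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return ranked
```

You could then keep only the top 20 or so entries instead of reading through everything. A score like this only measures word overlap, so it narrows the list but does not tell you whether a page is actually useful.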
 
Old 01-13-2003, 03:20 PM   #3
lackluster
Member
 
Registered: Apr 2002
Location: D.C - USA
Distribution: slackware-current
Posts: 488

Rep: Reputation: 30
Look at search.cpan.org and search for "search", WWW::Search, or just WWW. You'll find it.
 
  

