10-27-2004, 10:48 PM   #1
LQ Newbie
Registered: Oct 2004
Posts: 1

Rep: Reputation: 0
I need a web crawler and indexer for Linux

Does anyone know of a good web crawler that also includes an indexer?

I've tried a few, Larbin for example, but it only retrieves URLs. I also need an indexer, because I want to crawl a small portion of the web to build an experimental search engine.

10-28-2004, 12:26 AM   #2
Registered: Dec 2003
Location: Houston
Distribution: Knoppix,lenova yoga 3, Samsung s6 -android
Posts: 307

Rep: Reputation: 30
Try Grub
Grub is a distributed internet crawler/indexer designed to run on multi-platform systems, interfacing with a central server/database. It is also used by LookSmart for its peer-to-peer search and indexing.
10-28-2004, 01:11 AM   #3
Registered: Oct 2004
Location: Northville, MI
Distribution: Slackware
Posts: 65

Rep: Reputation: 15
There are a lot of search engine projects out there: Harvest, htdig, etc. A large list of Unix-based ones (commercial as well as open source) can be found on (specific page linked here). But if you're just looking to index your website so that visitors can search it, check out siteLevel. siteLevel powers site search engines for tens of thousands of sites worldwide, and they also provide very good technical support. There's a free service offering, as well as many affordable paid options. It's an ASP (application service provider) model, so there's no software to install; however, if you're looking for a server-side solution to install, check out the first link.
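
For a small experimental engine like the one described in the first post, the indexing step that a URL-only crawler leaves out can be tried in a few lines of Python. This is only a rough sketch, not how Grub, Harvest, or htdig work: it assumes the crawler has already fetched the pages into a dict of URL to HTML (the names `pages`, `build_index`, and `search` are made up for illustration), and it indexes all text in the page, including script and style contents, with no ranking or stemming.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text content of one HTML page, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.text_parts = []

    def handle_data(self, data):
        self.text_parts.append(data)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: term -> set of URLs containing that term.

    `pages` is a hypothetical dict of URL -> HTML string, standing in
    for whatever the crawler has already downloaded.
    """
    index = defaultdict(set)
    for url, html in pages.items():
        extractor = TextExtractor()
        extractor.feed(html)
        extractor.close()
        for term in tokenize(" ".join(extractor.text_parts)):
            index[term].add(url)
    return index


def search(index, query):
    """Return the set of URLs that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results
```

Calling `build_index` on the fetched pages and then `search(index, "search engine")` returns the URLs whose text contains both words. A real setup would also need to store the index on disk and respect robots.txt while crawling, which the projects mentioned above already handle.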



All times are GMT -5.