Hi,
I am faced with a little challenge: I have to index a whole site and make it searchable by terms. Every link has to be followed, and every part of the website should be charted. The site is PHP and CGI driven, so simply downloading the main directories does not work.
Where would you start with a project like this? Are there any good web spiders that follow every single link on a website and then download the entire site (something similar to what Google does, maybe)?
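To show the kind of spider I mean: the core is just "extract the links from each page, stay on the same host, and work through a queue breadth-first". A minimal sketch in Python (all names and the loop structure are my own illustration, not any particular tool):

```python
import urllib.request
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse

class LinkExtractor(HTMLParser):
    """Collect href targets from <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def same_host_links(base_url, html):
    """Resolve relative links and keep only those on base_url's host."""
    parser = LinkExtractor()
    parser.feed(html)
    host = urlparse(base_url).netloc
    out = []
    for href in parser.links:
        url, _fragment = urldefrag(urljoin(base_url, href))
        if urlparse(url).netloc == host:
            out.append(url)
    return out

def crawl(start_url, limit=1000):
    """Breadth-first crawl: fetch pages, enqueue unseen same-host links."""
    seen = {start_url}
    queue = deque([start_url])
    pages = {}  # url -> raw HTML
    while queue and len(pages) < limit:
        url = queue.popleft()
        try:
            with urllib.request.urlopen(url) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # skip pages that fail to fetch
        pages[url] = html
        for link in same_host_links(url, html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Because the spider follows rendered links rather than reading directories, it picks up PHP/CGI pages just like static ones. In practice an off-the-shelf tool such as GNU wget in mirror mode (`wget --mirror`) may already cover the downloading half.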
Basically, what I am trying to do is play Google for a single website, but index ALL of it, not just the few percent Google would. The search interface can be anything, but I'd prefer web-based over everything else.
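For the "index and make it searchable" half, one option I'd consider is SQLite's FTS5 full-text index: feed it the crawled pages, and any web front end can then run queries against it. A minimal sketch, assuming the crawl produced a `{url: text}` dict (the table and column names here are my own invention):

```python
import sqlite3

def build_index(pages):
    """Load {url: text} pages into an in-memory SQLite FTS5 index."""
    db = sqlite3.connect(":memory:")  # use a filename for a persistent index
    db.execute("CREATE VIRTUAL TABLE page USING fts5(url, body)")
    db.executemany("INSERT INTO page(url, body) VALUES (?, ?)",
                   pages.items())
    return db

def search(db, terms):
    """Return URLs of matching pages, best match first."""
    rows = db.execute(
        "SELECT url FROM page WHERE page MATCH ? ORDER BY rank",
        (terms,))
    return [url for (url,) in rows]
```

A real setup would strip HTML tags before indexing so markup doesn't pollute the term list, but the ranking and the term search come for free from FTS5.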
Is there a better way to do this?
Any other suggestions?
THANK YOU so much!
PS: it is only one medium-sized site, so bandwidth and storage will not be a problem.