Problem with Nutch, Tomcat, or Java
Hoping you all might be able to help this newbie.
I recently installed Debian Etch, the latest JVM, Tomcat 5.5, and Nutch 0.9. I was able to crawl, index, and search the database from the terminal, but I get an error when using the Nutch search page at localhost:8180 (the default port for Tomcat on Debian). Performing a search query returns the following error:
Code:
HTTP Status 500 -
Any ideas on what this means and how to fix it? Java and Tomcat were installed in their default locations. Nutch was installed in /usr/local/nutch/, with the index located at /usr/local/nutch/crawl. Thanks in advance.
|
To me it looks more like a matter of permissions. Are you doing this as root or as a regular user? Is there any password protection on the database?
|
Thanks for your reply, jay73. Everything in the terminal was done as root (installing packages, running the crawl, etc.). The search page, of course, was just opened normally in a browser, both from the local network and from outside.
Nothing was done to password protect the database. I just followed the Nutch tutorial verbatim for our first crawl.
Edit: Not sure if this is relevant. Nutch typically uses port 8080. Since installation I've never been able to get port 8080 open/listening. Ports 80 (apache2) and 8180 (tomcat) are fine, but every port scan (nmap) has shown 8080 as closed. Could that be the problem?
|
That would be one explanation. I'm not familiar with that kind of application, but I'm sure it has a configuration file somewhere that lets you assign a different port. It's usually as easy as tracking down the relevant line in the file and manually replacing it with a different port.
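Before editing any config, it may help to confirm which ports actually accept connections. Below is a minimal sketch in Python (my choice of language, not something from the Nutch tutorial); the helper name port_is_open is made up for illustration, and the port list is just the three ports mentioned in this thread. As far as I know, 8080 is simply Tomcat's stock default, which the Debian package moves to 8180, so a closed 8080 may only mean that nothing is bound there.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper for illustration; not part of Nutch or Tomcat.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Ports mentioned in this thread: apache2, the port nmap shows closed,
    # and the port Tomcat is currently answering on.
    for port in (80, 8080, 8180):
        state = "open" if port_is_open("127.0.0.1", port) else "closed"
        print(f"port {port}: {state}")
```

This only says whether something is listening, not which program it is, so it complements nmap rather than replacing it.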
|
I located the server.xml file for Tomcat and changed port 8180 to 8080 (then restarted Tomcat). I was able to access Nutch's search page at localhost:8080, and a search now gives "less" of an error, which I guess is progress.
Code:
type Exception report
So now I need to investigate the Java security settings a little more.
|
Progress. I stumbled across this tutorial about Nutch and Debian Etch. It has some code that is supposed to be added to the policy file located at /etc/tomcat5.5/policy.d/04webapps.policy. The code is:
Code:
grant codeBase "file:/usr/share/tomcat5.5-webapps/-" {
Thanks for your help, jay73. You were right, it was about permissions.
Edit: Solved. The tutorial above also mentions setting a searcher.dir property (pointing to the Nutch crawl directory, /usr/local/nutch/crawl) in a nutch-site.xml file. I found the file in the /usr/share/tomcat5.5-webapps/ROOT/WEB-INF/classes/ folder and added:
Code:
<name>searcher.dir</name>
Hopefully this will help anyone else trying to get Nutch running on Debian Etch. Thanks!
|
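For anyone who wants to double-check the searcher.dir setting, here is a rough Python sketch. It assumes nutch-site.xml uses the usual layout of property elements, each with a name and a value child. The function names read_property and check_searcher_dir are made up for illustration. It only checks filesystem access for whichever user runs it, so run it as the user Tomcat runs under rather than as root; it says nothing about the Java security policy file.

```python
import os
import sys
import xml.etree.ElementTree as ET


def read_property(config_path: str, name: str):
    """Return the <value> of the named <property>, or None if absent.

    Assumes the <property><name>..</name><value>..</value></property> layout.
    """
    root = ET.parse(config_path).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value else None
    return None


def check_searcher_dir(config_path: str) -> str:
    """Describe whether searcher.dir is set and readable by the current user."""
    path = read_property(config_path, "searcher.dir")
    if path is None:
        return "searcher.dir is not set"
    if not os.path.isdir(path):
        return f"{path} is not a directory"
    # Reading a directory's contents needs both read and execute permission.
    if not os.access(path, os.R_OK | os.X_OK):
        return f"{path} is not readable by the current user"
    return f"{path} looks usable"


if __name__ == "__main__":
    # Pass the path to nutch-site.xml as the first argument.
    if len(sys.argv) > 1:
        print(check_searcher_dir(sys.argv[1]))
```

In this setup the argument would be the nutch-site.xml under /usr/share/tomcat5.5-webapps/ROOT/WEB-INF/classes/.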