LinuxQuestions.org

said76 05-05-2015 10:50 PM

Get /robots.txt in my apache log
 
Hi,

I can't access my webmail site. To find out why, I checked the Apache access log and found this:

66.249.67.252 - - [06/May/2015:10:34:37 +1000] "GET /robots.txt HTTP/1.1" 200 26
66.249.67.240 - - [06/May/2015:11:11:46 +1000] "GET /robots.txt HTTP/1.1" 200 26

I have a feeling this has to do with Googlebot. Could anyone share their thoughts on how to fix this?

My system runs Ubuntu Server 12.04.5 32-bit with Apache 2.4.12.

Thank you in advance

bathory 05-06-2015 02:06 AM

This has nothing to do with your problem. It's legitimate traffic from Googlebot, which is trying to index your site.
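
To see why those two lines look harmless, here is a minimal Python sketch that splits one of the posted access-log lines into fields. It assumes the log uses Apache's common log format (which the posted lines appear to follow); the names `LOG_PATTERN` and `parse_access_line` are made up for this example.

```python
import re

# Apache "common" log format: host ident user [time] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_access_line(line):
    """Return the fields of one access-log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # Apache writes "-" when no response body was sent
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

# One of the lines quoted in the first post
line = '66.249.67.252 - - [06/May/2015:10:34:37 +1000] "GET /robots.txt HTTP/1.1" 200 26'
entry = parse_access_line(line)
print(entry["path"], entry["status"], entry["size"])
```

The status field is 200, so the server answered the request for /robots.txt successfully with a 26-byte response. Nothing in these lines points to a failure.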
Here is how a robots.txt file is described:
Quote:

A robots.txt file is a file at the root of your site that indicates those parts of your site you don’t want accessed by search engine crawlers. The file uses the Robots Exclusion Standard, which is a protocol with a small set of commands that can be used to indicate access to your site by section and by specific kinds of web crawlers (such as mobile crawlers vs desktop crawlers).
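
As a rough illustration of how a crawler reads such a file, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content below is hypothetical (the actual file on the server is not shown in this thread), and the `/webmail/` path is only an assumed example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real file on the server is not shown in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /webmail/
"""

parser = RobotFileParser()
# parse() takes the file's lines, so no network request is made here
parser.parse(ROBOTS_TXT.splitlines())

# Ask whether a crawler identifying as "Googlebot" may fetch each path
print(parser.can_fetch("Googlebot", "/webmail/login"))
print(parser.can_fetch("Googlebot", "/index.html"))
```

With these assumed rules, the first check is refused and the second is allowed. Note that robots.txt only advises crawlers; it does not block browsers, so it would not explain a webmail page that fails to load for you.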

If you can't access your webmail URL, check the Apache error_log for errors.
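
If the error log is long, a small filter can help. The sketch below is only a starting point: it assumes the default Apache 2.4 error-log layout, where each line carries a `[module:level]` tag, and the sample lines are made up for illustration. `serious_lines` is a hypothetical helper name.

```python
import re

# Apache 2.4 writes the module and severity as "[module:level]" in each error-log line.
LEVEL_PATTERN = re.compile(r"\[(?P<module>[^:\]\s]*):(?P<level>[a-z0-9]+)\]")
SERIOUS_LEVELS = {"emerg", "alert", "crit", "error"}

def serious_lines(lines):
    """Return only the lines logged at level 'error' or worse."""
    selected = []
    for line in lines:
        match = LEVEL_PATTERN.search(line)
        if match and match.group("level") in SERIOUS_LEVELS:
            selected.append(line.rstrip("\n"))
    return selected

# Made-up sample lines for illustration; they are not from the poster's server.
sample = [
    "[Wed May 06 10:30:01.000000 2015] [mpm_prefork:notice] [pid 1000] "
    "AH00163: Apache/2.4.12 (Ubuntu) configured -- resuming normal operations",
    "[Wed May 06 10:35:12.000000 2015] [core:error] [pid 1001] "
    "[client 192.0.2.10:50000] AH00037: Symbolic link not allowed or link "
    "target not accessible: /var/www/webmail",
]

for line in serious_lines(sample):
    print(line)
```

To run it against a real log, pass an open file object to `serious_lines`. On Ubuntu the error log is typically /var/log/apache2/error.log, but the path depends on the `ErrorLog` directive in your configuration.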

Regards


All times are GMT -5.