LinuxQuestions.org
Old 01-16-2013, 01:24 AM   #1
neonsignal
Senior Member
 
Registered: Jan 2005
Location: Melbourne, Australia
Distribution: Debian Bookworm (Fluxbox WM)
Posts: 1,391
Blog Entries: 54

Rep: Reputation: 360
two hourly web page request of unknown origin


I'm wondering about the origin of some page requests to my website. I don't think they are malicious, but I'm curious as to the source or the reason.

The server is running nginx, and the website has only a small set of pages.

The requests of interest come in at two-hourly intervals, in pairs: one from a Chinese source, followed about half a minute later by one from a Japanese source. The bot only does a 'GET' of the root web page, not any other pages, and identifies as a Mozilla browser.

My first guess is that it might be the Baidu spider, but it puzzles me why it would do frequent checks on a small low-traffic site that hardly ever changes, and only check a single page. Any thoughts?

A typical section of the log is as follows (excluding normal traffic):
Code:
202.46.59.140 - - [15/Jan/2013:07:52:43 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.195 - - [15/Jan/2013:07:53:15 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.53.163 - - [15/Jan/2013:09:53:17 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.196 - - [15/Jan/2013:09:53:51 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.53.74 - - [15/Jan/2013:11:55:24 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.195 - - [15/Jan/2013:11:55:59 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.55.28 - - [15/Jan/2013:13:51:55 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.131 - - [15/Jan/2013:13:52:25 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.56.143 - - [15/Jan/2013:15:55:00 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.194 - - [15/Jan/2013:15:55:37 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.51.124 - - [15/Jan/2013:17:54:25 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.131 - - [15/Jan/2013:17:55:00 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.56.128 - - [15/Jan/2013:20:01:25 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.131 - - [15/Jan/2013:20:02:15 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.58.22 - - [15/Jan/2013:21:51:50 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.195 - - [15/Jan/2013:21:52:20 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.56.160 - - [15/Jan/2013:23:56:28 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.132 - - [15/Jan/2013:23:57:08 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.50.136 - - [16/Jan/2013:01:50:51 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.195 - - [16/Jan/2013:01:51:19 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.58.32 - - [16/Jan/2013:03:55:36 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.196 - - [16/Jan/2013:03:56:13 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.53.134 - - [16/Jan/2013:05:55:13 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.195 - - [16/Jan/2013:05:55:49 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.59.194 - - [16/Jan/2013:07:53:33 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.194 - - [16/Jan/2013:07:54:06 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.62.116 - - [16/Jan/2013:09:52:07 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.131 - - [16/Jan/2013:09:52:38 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.53.134 - - [16/Jan/2013:11:55:39 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.131 - - [16/Jan/2013:11:56:16 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.54.34 - - [16/Jan/2013:13:52:34 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.132 - - [16/Jan/2013:13:53:05 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
202.46.56.50 - - [16/Jan/2013:15:53:59 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
119.63.193.131 - - [16/Jan/2013:15:54:34 +1100] "GET / HTTP/1.1" 200 1812 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
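
The two-hourly, paired pattern can be checked directly from the timestamps. The following is a minimal Python sketch, not a general log parser: it assumes the log layout shown above, uses only the first six entries (trimmed to the fields it needs, with the referer and user agent dropped), and the function names `parse` and `gaps` are made up for illustration.

```python
from datetime import datetime

# First six entries of the log above, trimmed to the fields used here.
LOG_LINES = [
    '202.46.59.140 - - [15/Jan/2013:07:52:43 +1100] "GET / HTTP/1.1" 200 1812',
    '119.63.193.195 - - [15/Jan/2013:07:53:15 +1100] "GET / HTTP/1.1" 200 1812',
    '202.46.53.163 - - [15/Jan/2013:09:53:17 +1100] "GET / HTTP/1.1" 200 1812',
    '119.63.193.196 - - [15/Jan/2013:09:53:51 +1100] "GET / HTTP/1.1" 200 1812',
    '202.46.53.74 - - [15/Jan/2013:11:55:24 +1100] "GET / HTTP/1.1" 200 1812',
    '119.63.193.195 - - [15/Jan/2013:11:55:59 +1100] "GET / HTTP/1.1" 200 1812',
]


def parse(line):
    """Return (client IP, timezone-aware timestamp) for one log line."""
    ip = line.split()[0]
    stamp = line.split("[", 1)[1].split("]", 1)[0]
    return ip, datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")


def gaps(lines):
    """Return (delay within each pair, gap between successive pairs), in seconds."""
    hits = [parse(line) for line in lines]
    first = [t for ip, t in hits if ip.startswith("202.46.")]
    second = [t for ip, t in hits if ip.startswith("119.63.")]
    pair_delay = [(b - a).total_seconds() for a, b in zip(first, second)]
    cycle = [(b - a).total_seconds() for a, b in zip(first, first[1:])]
    return pair_delay, cycle


pair_delay, cycle = gaps(LOG_LINES)
print("seconds between the two requests of a pair:", pair_delay)
print("seconds between successive pairs:", cycle)
```

For this sample the delay within a pair stays close to half a minute and the gap between pairs is a little over two hours, which matches the description above.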

Last edited by neonsignal; 01-16-2013 at 03:05 AM.
 
Old 01-16-2013, 08:23 AM   #2
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3600
Quote:
Originally Posted by neonsignal
I'm wondering about the origin of some page requests to my website.(..) it might be the Baidu spider,
Any self-respecting spider uses its own UA, and most have a set of IP ranges that their requests originate from. That, I think, is about all you can read from those log entries. Anything else must be learned from other resources, or may amount to speculation.
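
As a rough illustration of the UA point, here is a small Python sketch that checks whether a user-agent string names a crawler. The token list is an assumption for illustration and is far from exhaustive, and `declared_crawler` is a hypothetical helper, not part of any library.

```python
# Assumed, non-exhaustive list of tokens that well-known crawlers put in their UA.
CRAWLER_TOKENS = ("Baiduspider", "Googlebot", "bingbot")


def declared_crawler(user_agent):
    """Return the first known crawler token found in the UA, or None."""
    ua = user_agent.lower()
    for token in CRAWLER_TOKENS:
        if token.lower() in ua:
            return token
    return None


# The UA string from the log entries in the first post.
logged_ua = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
print(declared_crawler(logged_ua))
```

The logged UA contains none of these tokens, so the requests cannot be attributed to a crawler on the strength of the UA alone.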
 
Old 01-16-2013, 05:11 PM   #3
neonsignal
Senior Member
 
Registered: Jan 2005
Location: Melbourne, Australia
Distribution: Debian Bookworm (Fluxbox WM)
Posts: 1,391

Original Poster
Blog Entries: 54

Rep: Reputation: 360
Quote:
Originally Posted by unSpawn
Anything else must be learned from other resources or may amount to speculation.
I am interested in your speculation too! Or pointers to resources.

Quote:
Originally Posted by unSpawn
Any self-respecting Spider uses its own UA
Indeed.

Well, I've just used whois.domaintools.com to look up the IP ranges. It gives 202.46.32.0-202.46.63.255 as ShenZhen Sunrise Technology, which I see is an electronics company associated with Baidu (I'm not sure of the exact connection, but Baidu has invested a significant amount in their R&D centre in ShenZhen). The other range, 119.63.192.0-119.63.199.255, is listed as Baidu Japan.
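
For reference, the two whois ranges can be turned into CIDR blocks and matched against logged addresses with Python's standard `ipaddress` module. This is only a sketch: the range boundaries and owner names are the ones quoted above, while `range_to_cidrs` and `owner_of` are illustrative helper names.

```python
import ipaddress

# Ranges and owner names as reported by the whois lookup quoted above.
WHOIS_RANGES = {
    "ShenZhen Sunrise Technology": ("202.46.32.0", "202.46.63.255"),
    "Baidu Japan": ("119.63.192.0", "119.63.199.255"),
}


def range_to_cidrs(first, last):
    """Express an inclusive address range as a list of CIDR blocks."""
    networks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return [str(net) for net in networks]


def owner_of(ip):
    """Return the whois owner whose range contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for name, (first, last) in WHOIS_RANGES.items():
        if ipaddress.ip_address(first) <= addr <= ipaddress.ip_address(last):
            return name
    return None


for name, (first, last) in WHOIS_RANGES.items():
    print(name, range_to_cidrs(first, last))
print(owner_of("202.46.59.140"), "/", owner_of("119.63.193.195"))
```

Each range collapses to a single block (a /19 and a /21 respectively), and both addresses of the first logged pair fall inside them.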

So it looks like it is the (self-deprecating) Baidu spider.

The access patterns still seem a bit strange compared to the Google spider (the latter regularly scans the whole website, spread over many days).

Last edited by neonsignal; 01-16-2013 at 05:21 PM.
 
Old 01-16-2013, 06:24 PM   #4
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3600
I'd say just throttle both 119.63.193.0/24 and 202.46.59.0/24 down and be done with it...
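
Before settling on which blocks to throttle, it may be worth checking how many of the logged addresses each candidate set covers. The Python sketch below does that with the standard `ipaddress` module. The address list is the set of distinct client IPs from the log in the first post; the /19 and /21 blocks are the CIDR form of the whois ranges given in post #3; `covered` is an illustrative helper name.

```python
import ipaddress

# Distinct client addresses taken from the log in the first post.
LOGGED = [
    "202.46.59.140", "202.46.53.163", "202.46.53.74", "202.46.55.28",
    "202.46.56.143", "202.46.51.124", "202.46.56.128", "202.46.58.22",
    "202.46.56.160", "202.46.50.136", "202.46.58.32", "202.46.53.134",
    "202.46.59.194", "202.46.62.116", "202.46.54.34", "202.46.56.50",
    "119.63.193.195", "119.63.193.196", "119.63.193.131",
    "119.63.193.194", "119.63.193.132",
]

# The two /24 blocks suggested above.
SUGGESTED = [ipaddress.ip_network(n) for n in ("119.63.193.0/24", "202.46.59.0/24")]
# CIDR form of the whois ranges from post #3.
WHOIS = [ipaddress.ip_network(n) for n in ("119.63.192.0/21", "202.46.32.0/19")]


def covered(ips, networks):
    """Return the addresses that fall inside at least one of the networks."""
    return [
        ip for ip in ips
        if any(ipaddress.ip_address(ip) in net for net in networks)
    ]


print("suggested /24 blocks cover", len(covered(LOGGED, SUGGESTED)), "of", len(LOGGED))
print("whois ranges cover", len(covered(LOGGED, WHOIS)), "of", len(LOGGED))
```

On this sample, 119.63.193.0/24 catches every Japanese-side address, but 202.46.59.0/24 catches only two of the sixteen distinct 202.46.x.x addresses, so the wider whois range looks like the safer match for that side.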
 
  

