LinuxQuestions.org
01-05-2005, 02:22 AM   #1
vharishankar
Senior Member
 
Registered: Dec 2003
Distribution: Debian
Posts: 3,178
Blog Entries: 4

Rep: Reputation: 138
MSN bot sucking up my bandwidth


I have been going through the stats for my site using Analog and Webalizer in cPanel and found something interesting. I have some questions about it and would be glad if anybody could answer them.

The MSN bot has been sucking up a lot of bandwidth on my site. For example, on a single visit it used about 10 MB (that is, 10,000+ kilobytes) of bandwidth. Although this may not sound like much, I'm not sure how much bandwidth the MSN bot uses on my site per month in total, because it seems to be a frequent visitor. Do you think this is normal?

Is it OK to ban the MSN bot from my site completely? The problem is that it appears to be indexing my site for MSN Search, but so far my site doesn't even show up on an MSN search results page. Even when I type the full URL, MSN Search doesn't find my site.

Has anybody else who owns a website had this problem of MSN bots using up a lot of bandwidth? My host provides an Apache 1.x/Linux web server.
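
For anyone wanting a monthly figure instead of a per-visit one, here is a minimal sketch that totals response bytes per crawler straight from the raw access log. It assumes the host writes Apache's combined log format (the thread does not say which format is in use), and the function name `bytes_by_agent` and the sample log lines are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?P<bytes>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"$'
)


def bytes_by_agent(lines, needles=("msnbot", "googlebot")):
    """Sum response bytes per crawler.

    A request is attributed to a crawler when the crawler's name appears
    (case-insensitively) anywhere in the User-Agent field.
    """
    totals = defaultdict(int)
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match or match.group("bytes") == "-":
            # Skip lines that do not parse and responses without a body
            continue
        agent = match.group("agent").lower()
        for needle in needles:
            if needle in agent:
                totals[needle] += int(match.group("bytes"))
    return dict(totals)


# Hypothetical sample lines, not taken from any real log
sample = [
    '192.0.2.1 - - [05/Jan/2005:02:00:00 -0500] "GET /index.html HTTP/1.0" 200 5120 "-" "msnbot (hypothetical sample)"',
    '192.0.2.1 - - [05/Jan/2005:02:00:05 -0500] "GET /about.html HTTP/1.0" 200 2048 "-" "msnbot (hypothetical sample)"',
    '192.0.2.1 - - [05/Jan/2005:02:00:09 -0500] "GET /about.html HTTP/1.0" 304 - "-" "msnbot (hypothetical sample)"',
    '192.0.2.2 - - [05/Jan/2005:02:01:00 -0500] "GET /index.html HTTP/1.0" 200 1024 "-" "Googlebot (hypothetical sample)"',
    '192.0.2.3 - - [05/Jan/2005:02:02:00 -0500] "GET /index.html HTTP/1.0" 200 300 "-" "Mozilla/5.0 (hypothetical browser)"',
]

totals = bytes_by_agent(sample)
for name, total in sorted(totals.items()):
    print(f"{name}: {total / 1024:.1f} KB")
```

To run it against a real log, pass an open file instead of `sample`, for example `bytes_by_agent(open("access_log"))`, where the file name is whatever the host actually uses. Lines whose request field contains an escaped quote will not match this simple pattern and are skipped.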
 
01-05-2005, 02:58 AM   #2
scuzzman
Senior Member
 
Registered: May 2004
Location: Hilliard, Ohio, USA
Distribution: Slackware, Kubuntu
Posts: 1,851

Rep: Reputation: 47
Take a look here. Specifically, here:
Quote:
How do I submit my web page to be indexed in MSN Search?
MSNBot is not contributing directly to MSN Search at this time. Please visit the MSN Search submit a site page.
and here
Quote:
Why is MSNBot trying to access a robots.txt file that is not on my server?
The robots.txt file is used by webmasters to prevent web crawlers from downloading some or all of the information on their websites. For information on how to create a robots.txt file, see The Robot Exclusion Standard. If you want to prevent the "File not found" error messages from appearing in your server log, create an empty file named robots.txt.
also here
Quote:
How do I prevent MSNBot from crawling some or all of my website?
The robots.txt file is used to prevent web crawlers from accessing a web site. The format of the robots.txt file is specified in The Robot Exclusion Standard. MSNBot analyzes all instances where the User-Agent is specified as either "msnbot" or "*". Based on this, MSNBot crawls only the web pages that allow it to do so.
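
As a rough illustration of the FAQ answer above, the sketch below holds a candidate robots.txt that shuts out `msnbot` and leaves every other crawler unrestricted, then checks it with Python's standard `urllib.robotparser`. That parser is only a stand-in for how a well-behaved crawler reads the file; it is not MSNBot's own logic, and the `allowed` helper and the example.com URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Candidate robots.txt: block msnbot everywhere, allow all other crawlers.
# An empty "Disallow:" line means nothing is disallowed.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given user agent may fetch url under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical URL used only for the check
print(allowed("msnbot", "http://www.example.com/index.html"))     # False
print(allowed("Googlebot", "http://www.example.com/index.html"))  # True
```

The text in `ROBOTS_TXT` is what would go into a file named robots.txt at the top level of the site. It only helps with crawlers that choose to honor it.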

Last edited by scuzzman; 01-05-2005 at 02:59 AM.
 
01-05-2005, 03:09 AM   #3
vharishankar
Senior Member
 
Registered: Dec 2003
Distribution: Debian
Posts: 3,178

Original Poster
Blog Entries: 4

Rep: Reputation: 138
Yes. I read their promotional page.

But in typical Microsoft fashion, they want us to "submit" our site manually for it to be included in their search, which is a euphemism for "advertise". I'm going to ban that robot from my site anyway. It uses up my paid-for bandwidth, and my site does not even get listed in their search engine.

The MSN bot was in the top 5 of my site stats, both by number of visits and by KB transferred. I can live without MSN Search, but I will not pay for its visits with bandwidth that I want to conserve for genuine visitors.
 
01-05-2005, 03:50 AM   #4
scuzzman
Senior Member
 
Registered: May 2004
Location: Hilliard, Ohio, USA
Distribution: Slackware, Kubuntu
Posts: 1,851

Rep: Reputation: 47
I'd ban it until it comes out of Beta...
 
01-05-2005, 11:13 PM   #5
rksprst
Member
 
Registered: Jan 2004
Distribution: OS X 10.4
Posts: 172
Blog Entries: 1

Rep: Reputation: 30
I had the following stats:
Quote:
Googlebot 119.91 MB Dec 2004 - 13:28
MSNBot 715.38 KB Dec 2004 - 14:57
AskJeeves 119.58 KB Dec 2004 - 19:11
Inktomi Slurp 101.43 KB Dec 2004 - 13:07
Alexa (IA Archiver) 75.61 KB 05 Dec 2004 - 06:24
I posted on my web host's forum that Google used up 120 MB of bandwidth, and another user said it's similar for his site, so I think that's normal.

Edit: By the way, who do you use for web hosting? I can't imagine that 10 MB is that much. I pay $4 a month and get 10 GB of bandwidth.

Last edited by rksprst; 01-05-2005 at 11:16 PM.
 
01-06-2005, 12:18 AM   #6
vharishankar
Senior Member
 
Registered: Dec 2003
Distribution: Debian
Posts: 3,178

Original Poster
Blog Entries: 4

Rep: Reputation: 138
I agree that 10 MB is not much. But I'm talking about 10 MB on a single visit! Is this really possible?
 
  


All times are GMT -5.
