02-04-2004, 01:49 PM, #1
pnh73 (Member)
Distributed Webhosting


Hmmm, I had a rather strange/bizarre, or even clever, idea the other day. I have been using Linux to run web hosting for friends, and an idea occurred to me when a friend asked me how he could set up his own hosting.

I had the idea that a group of fairly competent Linux users with DSL-or-better connections could set up servers in their homes and run a distributed web hosting system.

I was just wondering: how hard would that be to set up? It would be similar to having a server farm/cluster, except that each machine would be in a different house/location.

Do you guys think it could work? Are there any examples of this having already been done? What would the recommended setup be: separate machines for www/SFTP, mail, DNS, etc., or a set of machines each running www/SFTP as mirrors, with mail and DNS distributed between them?

If this hasn't been done before, or if anyone is interested, we could set up an experiment and see how it could work.

Thanks for your time

Regards,
 
02-04-2004, 02:38 PM, #2
nielchiano (Member)
I'm interested in helping out!

The way I would deploy it (concentrating on www) is:

* Set up a bunch of (almost) identical web servers.
* Set up a DNS server (it can also be a web server, but it doesn't have to be).
* Set that DNS server to hand out the IPs of all the web servers in round-robin fashion.
* This load-balances the servers.

So, e.g. with 3 servers at 192.168.0.{1,2,3} and the DNS at 192.168.0.4: the DNS server gets a request for the web server's name and returns the 1st IP; on the next request it returns the 2nd IP, and so on; the 4th request gets the 1st IP back again.
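For what it's worth, round-robin like that is just several A records on one name; a minimal sketch of the relevant BIND zone file lines (the name and the RFC 1918 addresses are placeholders):

Code:
; three A records for the same name; BIND cycles the order
; of the answers between queries (see the rrset-order option)
www   IN  A  192.168.0.1
www   IN  A  192.168.0.2
www   IN  A  192.168.0.3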

DNS itself can't be distributed on an IP basis (it could be run on a cluster, but IMO each DNS lookup is a single quick process, so it wouldn't matter).

Mail, I think, can't be distributed either...
 
02-04-2004, 02:47 PM, #3
pnh73 (Original Poster)
How would we make sure the mirrors are kept up to date? If a user logs in over SFTP and uploads their site, it shouldn't end up being displayed on only every third request (if three servers are used). Keeping everything in sync on the fly would require a fair amount of traffic between hosts, but if a lot was changed it would take ages to mirror if you only synced every 12/24 hours, for example.

Ever used FreeBSD? I like it and it's what I run at the moment. If we can get a few more people I would consider setting up a SourceForge project or something similar.

Regards
---

P.S. If this system worked it could become quite a powerful way to host. I would guess you would need to set up a second DNS server to add redundancy...
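A second DNS server is mostly just a slave zone pulling from the first; a rough named.conf sketch for the secondary, with made-up names and IPs:

Code:
// on the backup nameserver: fetch the zone from the primary
// via zone transfer, and keep answering if the primary dies
zone "example.org" {
    type slave;
    masters { 192.168.0.4; };
    file "slaves/example.org.db";
};

You would also add an NS record for the secondary in the zone itself and list it with the registrar.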

 
02-04-2004, 02:52 PM, #4
chort (Senior Member)
Well, that method is essentially just a mirroring system. It's not really distributing the content, it's duplicating it.

If you wanted to be really crazy, you could have each system host certain sites, but NFS-mount all the other sites (on the other hosts) through an IPsec tunnel to each of the other hosts. This would accomplish distributing the content, but I would think you're going to have a real problem with performance.
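As a rough sketch of that layout (hostnames and paths invented), host1's /etc/fstab might carry read-only mounts of the sites that physically live elsewhere, with the traffic riding the IPsec tunnel:

Code:
# sites physically hosted on host2, mounted read-only on host1
host2:/var/www/sites  /var/www/host2  nfs  ro,hard,intr  0  0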

If you wanted to go with the above example of mirroring the same content to multiple hosts and serving it via DNS round-robin, then you would just have to rsync each host periodically (say, every hour) to make sure they all have the same content. Of course, you would need to figure out how to handle situations like a file being deleted from one host: how do you tell whether it was really deleted on purpose? The other hosts will think that file is just missing, so they will try to replace it.
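The hourly sync itself is simple enough; a hypothetical cron entry on each mirror might look like the following (note that --delete is exactly where the "deleted on purpose?" ambiguity bites):

Code:
# /etc/cron.d/mirror-www: pull the document root from a peer hourly
# --delete propagates removals, so an accidental delete spreads too
0 * * * *  root  rsync -az --delete -e ssh peer1.example.org:/var/www/sites/ /var/www/sites/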

So you might go with a central repository where all the edits happen, which gets rsync'd out to the serving hosts every hour or every few hours. Now you have a single point of failure (the repository), which can pretty much nuke your site (especially if it gets compromised, because the changes will replicate to the other hosts). Another problem is how to let end-users edit their files. If you have them log into the repository, that's again a single point of failure, and it means letting people log into what should be your most secure box: not a good idea. On the other hand, if you have users log into the different remote hosts via DNS round-robin, you'll need to replicate all your user accounts to every host. You'll also need to replicate their authentication credentials, and regular UNIX password files won't work very well for that, so now you're looking at Kerberos or RADIUS.

I could go round and round with circular reasoning forever, but the point is that webhosting is a whole lot more complicated than it seems.
 
02-04-2004, 03:14 PM, #5
pnh73 (Original Poster)
Quote:
Originally posted by chort
I could go round and round with circular reasoning forever, but the point is that webhosting is a whole lot more complicated than it seems.
It seems that distributed webhosting comes down to striking a successful balance between redundancy, security, and performance.
 
02-04-2004, 03:30 PM, #6
david_ross (Moderator)
Another thing that you may need to consider is the replication of SQL databases. A forum, for example, would not work well in this setup without realtime updates, which are likely to be a strain on bandwidth.
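For MySQL that usually means master/slave replication, with all writes funneled to one master; a minimal sketch of the my.cnf pieces (hostnames and credentials invented):

Code:
# on the master
[mysqld]
server-id = 1
log-bin   = mysql-bin

# on each slave
[mysqld]
server-id = 2

# then once, on each slave:
#   CHANGE MASTER TO MASTER_HOST='db1.example.org',
#       MASTER_USER='repl', MASTER_PASSWORD='secret';
#   START SLAVE;

That spreads the read load, but writes still hit a single box, so it helps performance more than it removes the single point of failure.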
 
02-04-2004, 04:30 PM, #7
pnh73 (Original Poster)
Ah, crumbs, I forgot about those. Though we could have a separate DB server, backed up and mirrored to another, to add redundancy.
 
02-04-2004, 05:22 PM, #8
chort (Senior Member)
Quote:
Originally posted by pnh73
It seems that distributed webhosting comes down to striking a successful balance between redundancy, security, and performance.
Redundancy yes, performance maybe*, security no. Your entire system is only as strong as the weakest host, since each host has to trust the others for data replication (plus your authentication would need to be replicated, unless you want it to be a total nightmare). This means that a cracker only has to get lucky (or smart) with one host and they can own your entire distributed network.

*Performance is largely tied to how well you can distribute the network load, and DNS round-robin is notoriously bad at that. What happens if several clients simultaneously connect to a host that only has 128K/s up? What happens if one client has a really large download that takes, say, 30 minutes (meanwhile round-robin keeps throwing other connections at the same box)? DNS round-robin cannot control the number of simultaneous connections, nor can it account for the varying bandwidth capacity of different hosts.

Really, there is nothing new going on here; you're just looking at it from the distributed standpoint right away. Most hosting companies start with a single location, then have to figure out how to make it redundant later on. You're just skipping the first mistake. The rest of the picture is unchanged: you need a lot of complex equipment, intelligent load balancers, database replicators, virtual private networks, etc. It's no different from, for instance, Rackspace.com or NTT/Verio webhosting.
 
  

