LinuxQuestions.org
Linux - Server This forum is for the discussion of Linux Software used in a server related context.

02-15-2013, 06:24 AM   #1
Mr. Alex
Senior Member
 
Registered: May 2010
Distribution: No more Linux. Done with it.
Posts: 1,238

Rep: Reputation: Disabled
What software do server farms use to be able to pull out dead HDDs?


Hello people!

Server farms like Google's and other big ones have tons and tons of HDDs working together, and many HDDs die per day. If I'm not mistaken, IT workers there just pull dead HDDs out of the server racks and replace them with new ones on the fly, with no need to turn the servers off. Apparently they don't even care which disks their data is stored on. It's like RAID but much more sophisticated than, say, RAID5 or RAID6: one piece of data is stored on several HDDs spread all over a server farm, which may cover several huge buildings or even a network of several buildings... At least that's how I understand it. Is that so? And if yes, what software do they use for this approach (to detect dead HDDs, find their location and replace them on the fly)?

Last edited by Mr. Alex; 02-16-2013 at 12:22 PM.
 
02-15-2013, 07:10 AM   #2
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,360

Rep: Reputation: 2751
I believe you're broadly correct, but they store the same info in more than one place, and no one worries if a fact does go missing; Google search is not supposed to be definitive, much less ACID compliant.
The Google spider bot will find it again ....

Google apps (Docs, email, etc.) are different.

There are any number of ways of checking for bad disks, usually via SNMP, and some systems will notify you when a disk dies.

They probably don't bother for this app, but you can get HW RAID that allows hot swapping of disks, and the OS never even notices.
 
02-15-2013, 07:41 AM   #3
whizje
Member
 
Registered: Sep 2008
Location: The Netherlands
Distribution: Slackware64 current
Posts: 594

Rep: Reputation: 141
The disk must support hotplug; SATA disks are supposed to be hot-pluggable. The disk subsystem (controller and backplane) must support hotplug as well. mdraid also supports hotplug. For a disk swap to be transparent to the OS and the data, RAID (or some other form of redundancy) is required.
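
To make the mdraid part a bit more concrete, here is a minimal sketch of how a monitoring script might spot a degraded array by reading /proc/mdstat. The function name `degraded_arrays` and the sample text are made up for illustration, and the parsing assumes the usual `[n/m] [UU_]` status format with arrays named `mdN`; real output varies between kernel versions, so treat it as a starting point only.

```python
import re

# Matches a status fragment such as "[2/1] [U_]" on an array's second line.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
# Matches an array header line such as "md0 : active raid1 ..." (assumed naming).
ARRAY_RE = re.compile(r"^(md\d+)\s*:")


def degraded_arrays(mdstat_text):
    """Return the names of md arrays that have at least one missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            # "U" means the member is up, "_" means it is missing or failed.
            if "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


# Hypothetical sample in the style of /proc/mdstat, not taken from a real host.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid1 sda2[0]
      976630336 blocks [2/1] [U_]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a real machine you would pass in `open("/proc/mdstat").read()` instead of the sample. Once a degraded array is found, the usual mdadm sequence is to mark the member failed and remove it (`mdadm /dev/md1 --fail /dev/sdb2 --remove /dev/sdb2`), swap the disk, and add the replacement (`mdadm /dev/md1 --add /dev/sdb2`); the device names here are only examples.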
 
02-15-2013, 09:32 AM   #4
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 26,655

Rep: Reputation: 7970
Quote:
Originally Posted by Mr. Alex
Hello people!
Server farms like Google's and other big ones have tons and tons of HDDs working together, and many HDDs die per day. If I'm not mistaken, IT workers there just pull dead HDDs out of the server racks and replace them with new ones on the fly, with no need to turn the servers off. Apparently they don't even care which disks their data is stored on. It's like RAID but much more sophisticated than, say, RAID5 or RAID6: one piece of data is stored on several HDDs spread all over a server farm, which may cover several huge buildings or even a network of several buildings... At least that's how I understand it. Is that so? And if yes, what software do they use for this approach (to detect dead HDDs, find their location and replace them on the fly)?
Yes, and when you say there's one piece of data spread out over multiple drives... that IS what RAID is. And they don't use software (per se) to do this; they use SANs. The 'disk' is presented to a server via a Fibre Channel host bus adapter (HBA). What that disk is depends on how the SAN administrator presents it. It could be one part of one disk, one whole disk, or 20 whole disks split into four arrays of 5. The operating system will 'see' one disk. That's it. All the hot-plug and failover happens in the SAN... the OS never knows.

You can buy (relatively) cheap hardware RAID systems and define an array with a hot-spare drive. The OS will notice that a drive has failed, but the system will keep running. You can then swap out the failed drive, and it will rebuild the array and put the spare back into its previous state.
 
02-15-2013, 02:38 PM   #5
whizje
Member
 
Registered: Sep 2008
Location: The Netherlands
Distribution: Slackware64 current
Posts: 594

Rep: Reputation: 141
Google servers, according to Wikipedia:
Quote:
Servers are commodity-class x86 PCs running customized versions of Linux. The goal is to purchase CPU generations that offer the best performance per dollar, not absolute performance. How this is measured is unclear, but it is likely to incorporate running costs of the entire server, and CPU power consumption could be a significant factor.[2] Servers as of 2009-2010 consisted of custom-made open-top systems containing two processors (each with 2 cores[3]), a considerable amount of RAM spread over 8 DIMM slots housing double-height DIMMs, and two SATA hard drives connected through a non-standard ATX-sized power supply.[4] According to CNET and to a book by Hennessy, each server has a novel 12-volt battery to reduce costs and improve power efficiency.[3][5]
And if you want to have a look at google datacenters
 
02-16-2013, 12:11 PM   #6
btmiller
Senior Member
 
Registered: May 2004
Location: In the DC 'burbs
Distribution: Arch, Scientific Linux, Debian, Ubuntu
Posts: 4,290

Rep: Reputation: 378
In terms of detecting when a hard drive might fail, the utilities in the smartmontools package can be used, in particular the online self-test commands.
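
As a rough illustration of the smartmontools route, the sketch below wraps `smartctl -H` and classifies its health line. The helper names `parse_health` and `check_device` are invented for this example, and the parsing assumes the usual "overall-health self-assessment" (ATA/SATA) and "SMART Health Status" (SCSI/SAS) lines; other drives or smartctl versions may word things differently.

```python
import subprocess


def parse_health(smartctl_output):
    """Classify the text printed by `smartctl -H` as 'ok', 'failing' or 'unknown'."""
    for line in smartctl_output.splitlines():
        # ATA/SATA drives report an overall-health self-assessment line.
        if "overall-health self-assessment test result" in line:
            return "ok" if line.rstrip().endswith("PASSED") else "failing"
        # SCSI/SAS drives report a "SMART Health Status" line instead.
        if line.startswith("SMART Health Status:"):
            return "ok" if line.split(":", 1)[1].strip() == "OK" else "failing"
    return "unknown"


def check_device(device):
    """Run `smartctl -H` on one device (needs root and smartmontools installed)."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes to flag problems
    )
    return parse_health(result.stdout)


if __name__ == "__main__":
    # "/dev/sda" is just an example device name.
    print(check_device("/dev/sda"))
```

The self-tests themselves are started with `smartctl -t short /dev/sda` (or `-t long`), and the results can be read back later with `smartctl -l selftest /dev/sda`.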

There are also distributed file systems (e.g. PVFS, Lustre) that automatically distribute data and metadata over multiple servers in a cluster. Google is using something like this, but they've written their own (if you search for GoogleFS, you should be able to find some high level descriptions of it).
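
To show the general idea of "nobody cares which disk holds the data", here is a toy sketch of replica placement and re-replication. It is not how GoogleFS, PVFS or Lustre actually implement it; the function names, the server names and the replication factor of 3 are all assumptions made for the example.

```python
import random

REPLICAS = 3  # assumed replication factor for this sketch


def place_chunks(chunk_ids, servers, rng):
    """Assign every chunk to REPLICAS distinct servers chosen at random."""
    return {chunk: set(rng.sample(servers, REPLICAS)) for chunk in chunk_ids}


def handle_server_failure(placement, failed, live_servers, rng):
    """Drop the failed server and copy each affected chunk to another live server.

    Returns the list of chunks that lost a replica on the failed server.
    """
    repaired = []
    for chunk, holders in placement.items():
        if failed not in holders:
            continue
        holders.discard(failed)
        # Only servers that do not already hold this chunk are candidates.
        candidates = [s for s in live_servers if s not in holders]
        if candidates:
            holders.add(rng.choice(candidates))
        repaired.append(chunk)
    return repaired


if __name__ == "__main__":
    rng = random.Random(0)
    servers = [f"server{i}" for i in range(6)]  # hypothetical server names
    placement = place_chunks(range(10), servers, rng)

    # Pretend server2 (or its disk) just died.
    live = [s for s in servers if s != "server2"]
    fixed = handle_server_failure(placement, "server2", live, rng)
    print(f"re-replicated {len(fixed)} chunks")
```

Because every chunk still has other copies while the repair runs, the dead disk or server can be pulled and replaced whenever convenient, and the replacement simply becomes another candidate for future chunks.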
 
02-16-2013, 12:21 PM   #7
Mr. Alex
Senior Member
 
Registered: May 2010
Distribution: No more Linux. Done with it.
Posts: 1,238

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by btmiller
Google is using something like this, but they've written their own (if you search for GoogleFS, you should be able to find some high level descriptions of it).
Thanks for the information.
There was some news about Google using ext4 (probably with their own modifications). They switched from ext2 straight to ext4, skipping ext3.

And, besides, I'm not only talking about Google. What about Twitter, Tumblr, Facebook, MySpace, Blekko, Wikipedia, LinkedIn, PayPal, IBM, Intel, AMD...?

Last edited by Mr. Alex; 02-16-2013 at 12:36 PM.
 
02-16-2013, 02:33 PM   #8
jefro
Moderator
 
Registered: Mar 2008
Posts: 21,993

Rep: Reputation: 3628
I'd think something more like ZFS is used.
 
  


All times are GMT -5.
