What software do server farms use to be able to pull out dead HDDs?
Linux - Server: This forum is for the discussion of Linux software used in a server-related context.
Hello people!
Server farms like Google's and other big ones have tons and tons of HDDs working together, and many HDDs die there every day. If I'm not mistaken, IT workers just pull dead HDDs out of the server racks and replace them with new ones on the fly, with no need to turn the servers off. Apparently they don't even care which disks their data is stored on. It's like RAID, but much more sophisticated than, say, RAID5 or RAID6: one piece of data is stored on several HDDs spread all over the server farm, which may cover several huge buildings or even a network of several buildings... At least that's how I understand it. Is it so? And if yes, what software do they use for this approach (to detect dead HDDs and their location, and to replace them on the fly)?
I believe you're broadly correct: they store the same info in more than one place, and no-one worries if a fact does go missing; Google search is not supposed to be definitive, much less ACID compliant.
The Google spider bot will find it again...
Google apps (docs, email etc) are different.
There are any number of ways of checking for bad disks, usually via SNMP, and some systems will notify you when a disk dies.
They probably don't bother for this app, but you can get hardware RAID that allows hot-swapping of disks, and the OS never even notices.
The disk must support hotplug; SATA disks are supposed to be hotpluggable. The disk subsystem must also support hotplug, and mdraid supports it as well. For the OS to keep running through a swap, RAID is required.
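As a rough sketch of what that swap looks like with Linux mdraid (device names like /dev/md0, /dev/sdb1, and /dev/sdc1 are placeholders, and these commands need root and real block devices):

```shell
# Mark the dying member as failed and remove it from the array
mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm --manage /dev/md0 --remove /dev/sdb1

# Physically hot-swap the drive (SATA/SAS hotplug), partition the
# replacement, then add it back; md rebuilds onto it automatically
mdadm --manage /dev/md0 --add /dev/sdc1

# Watch the rebuild progress
cat /proc/mdstat
```

The array keeps serving reads and writes (in degraded mode) the whole time, which is the "no need to turn the servers off" part of the question.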
Yes, and when you say there's one piece of data spread out over multiple drives... that IS what RAID is. And they don't use software (per se) to do this; they use SANs. The 'disk' is presented to a server via a Fibre Channel host bus adapter (HBA). What that disk actually is depends on how the SAN administrator presents it: it could be part of one disk, one whole disk, or 20 whole disks split into four arrays of 5. The operating system will 'see' one disk. That's it. All the hot-plug and failover happens in the SAN... the OS never knows.
You can buy (relatively) cheap hardware RAID systems and define an array with a hot-spare drive; the OS will notice that a drive has failed, but the system will keep running. You can then swap out the failed drive, and it will rebuild the array and put the spare back into its previous state.
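The same hot-spare idea exists in software RAID too. As a hedged example (device names are placeholders; this destroys any data on those disks), creating an md array with a standby spare looks like:

```shell
# RAID5 over three disks plus one hot spare: if any member fails,
# md immediately starts rebuilding onto the spare with no manual
# intervention, and the failed disk can be swapped at leisure
mdadm --create /dev/md0 --level=5 --raid-devices=3 \
      /dev/sdb1 /dev/sdc1 /dev/sdd1 \
      --spare-devices=1 /dev/sde1
```

With hardware RAID the equivalent is configured in the controller's BIOS or vendor tool, and the OS sees only the single logical disk.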
Servers are commodity-class x86 PCs running customized versions of Linux. The goal is to purchase CPU generations that offer the best performance per dollar, not absolute performance. How this is measured is unclear, but it is likely to incorporate the running costs of the entire server, and CPU power consumption could be a significant factor.[2] Servers as of 2009-2010 consisted of custom-made open-top systems containing two processors (each with two cores[3]), a considerable amount of RAM spread over 8 DIMM slots housing double-height DIMMs, and two SATA hard drives connected through a non-standard ATX-sized power supply.[4] According to CNET and to a book by Hennessy, each server has a novel 12-volt battery to reduce costs and improve power efficiency.[3][5]
In terms of detecting when a hard drive might fail, the utilities in the smartmontools package can be used, in particular the online self-test commands.
There are also distributed file systems (e.g. PVFS, Lustre) that automatically distribute data and metadata over multiple servers in a cluster. Google is using something like this, but they've written their own (if you search for GoogleFS, you should be able to find some high level descriptions of it).
Thanks for information.
There was some news about Google using ext4 (probably with their own modifications); they switched from ext2 without ever using ext3.
And, besides, I'm not only talking about Google. What about Twitter, Tumblr, Facebook, MySpace, Blekko, Wikipedia, Linkedin, PayPal, IBM, Intel, AMD...?