Running Ubuntu 8.10 Server, headless, but I can reach it with PuTTY.
Hardware is an Asus motherboard with an AMD 6400 dual-core processor and 5 HDDs: an 80 GB HDD holding the operating system, and four 250 GB HDDs in a RAID5 array.
Apparently, two of the HDDs in the RAID5 array decided to crash simultaneously.
My FIRST task is to figure out WHICH two of my drives crashed.
Here is what I DO know about my drives:
/dev/sda, case slot 1, sata slot 5, 80 gb, model# WDC-WD800JD-00MS, serial# WMAM9CRJ6471
/dev/sdb, case slot 2, sata slot 6, 250 gb, model# WDC-WD2500AAKS-0, serial# WMART1755547
/dev/sdc, case slot 3, sata slot 1, 250 gb, model# WDC-WD2500AAKS-0, serial# WMART1760390
/dev/sdd, case slot 4, sata slot 3, 250 gb, model# WDC-WD2500AAKS-0, serial# WMAT15924203
/dev/sde, case slot 5, sata slot 2, 250 gb, model# WDC-WD2500AAKS-0, serial# WMAT15923873
I know that two of my drives have crashed because when I try to assemble the array, it returns the message:
Quote:
/dev/md/0 assembled from 2 drives - not enough to start the array.
HOW can I tell which of my four 250 GB HDDs have failed? I do still have PuTTY working & can get to the server, but it appears that the OS drive is the only one functioning properly.
Research revealed the "fdisk -l" command, which I've run, and received the following results:
Quote:
root@RCH-SERVER:/etc# fdisk -l
Disk /dev/sda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x41413535
Device Boot Start End Blocks Id System
/dev/sda1 1 30401 244196001 fd Linux raid autodetect
Disk /dev/sdb: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0002d3a2
Device Boot Start End Blocks Id System
/dev/sdb1 1 30401 244196001 fd Linux raid autodetect
Disk /dev/sdc: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00010f8f
Device Boot Start End Blocks Id System
/dev/sdc1 1 30401 244196001 fd Linux raid autodetect
Disk /dev/sdd: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0005380d
Device Boot Start End Blocks Id System
/dev/sdd1 * 1 9327 74919096 83 Linux
/dev/sdd2 9328 9729 3229065 5 Extended
/dev/sdd5 9328 9729 3229033+ 82 Linux swap / Solaris
Disk /dev/sde: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000b3fcd
Device Boot Start End Blocks Id System
/dev/sde1 1 30401 244196001 fd Linux raid autodetect
root@RCH-SERVER:/etc#
So, maybe I don't have a HDD failure, but I'm definitely unsure of what's going on, and I need to do something, but don't know what. Thanks.
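One thing worth noting in the fdisk output: the 80 GB OS disk shows up as /dev/sdd, while the notes above have it as /dev/sda, so the sdX letters evidently no longer match the documented slots. A rough sketch of one way to tie the current device names back to the drive serial numbers, assuming udev has populated /dev/disk/by-id with ata-* links (the usual layout, but not verified on this server; the function name map_disks is just made up for illustration):

```shell
#!/bin/sh
# Print "kernel-name  by-id-name" for each whole disk.
# The directory argument exists only so the function can be tried on a
# test tree; without it, the assumed default /dev/disk/by-id is used.
map_disks() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/ata-*; do
        [ -L "$link" ] || continue            # skip if nothing matched the glob
        name=$(basename "$link")
        case "$name" in
            *-part[0-9]*) continue ;;         # skip per-partition links
        esac
        target=$(readlink "$link")            # e.g. ../../sdb
        printf '%s  %s\n' "$(basename "$target")" "$name"
    done | sort
}
```

Running `map_disks` with no argument should list each sdX next to a name that normally embeds the model and serial, which can then be checked against the labels on the drives.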
I'm still looking back into previous posts that might help me either solve this problem, or at least find commands that will help gather info so that somebody might be able to provide some insight into a solution. Here's another bit of info:
I found the command: "cat /proc/mdstat" & it returned the following info:
& here's yet another that I've found - the "lshw" command, which has returned the following info (I've deleted the massive amount of info that does not appear pertinent to the current situation, & have bolded the info which appears most important):
*-ide
description: IDE interface
product: SB700/SB800 IDE Controller
vendor: ATI Technologies Inc
physical id: 14.1
bus info: pci@0000:00:14.1
logical name: scsi4
logical name: scsi5
version: 00
width: 32 bits
clock: 66MHz
capabilities: ide msi bus_master cap_list emulated
configuration: driver=pata_atiixp latency=64 module=pata_atiixp
*-cdrom
description: DVD reader
product: ROM
vendor: 16X DVD-
physical id: 0
bus info: scsi@4:0.0.0
logical name: /dev/cdrom
logical name: /dev/dvd
logical name: /dev/scd0
logical name: /dev/sr0
version: 107G
capabilities: removable audio dvd
configuration: ansiversion=5 status=nodisc
*-disk:0 (This is my HDD that's dedicated to the OS)
description: ATA Disk
product: WDC WD800JD-00MS
vendor: Western Digital
physical id: 1
bus info: scsi@5:0.0.0
logical name: /dev/sdd
version: 10.0
serial: WD-WMAM9CRJ5825
size: 74GiB (80GB)
capabilities: partitioned partitioned:dos
configuration: ansiversion=5 signature=0005380d
So, my 80 gb HDD is a different serial number from what I documented, but that's not such a big deal. I purchased several of the 80 gb drives and set two of them up with the Linux OS on them. They are swappable, so I documented one of them, but not the other, but they are identical, except that one was installed & configured & then removed, while the other was installed & configured & left in to be used. No upgrades have been done, no changes in configuration have been done, no new software has been installed.
In fact, we left town for a week & I turned off the server. When we got back yesterday, I turned on the whole system (DSL modem, DSL router, the file server in question, workstations, & the network printer, a Xerox Phaser solid ink printer that connects via ethernet through its built-in print server). It worked just fine for a full day before starting to fail: first it would not let me access the drive to MAKE new folders (for the pics we took), and then it did not mount the drive properly when I rebooted.
And, it appears that all the HDDs are present & accounted for, especially the 4 drives that are 250 gb each and RAID5'd. However, two are in state=unknown, so those are the ones that I need to get mounted. THIS COULD BE A CLUE, but I'm too tired to think about it right now.
I'm through looking at this for tonight, and will pick up on it again tomorrow.
I have a feeling that the solution is fairly simple. But, this plays into my theory that the simplest solutions require the most head-banging. I'm now going to go bang my head on my pillows!
Gosh, I'm flummoxed. I've tried loading Western Digital's diagnostics. I was able to burn the DOS based ISO to a bootable CD, but it gives me the message
Quote:
Unable to locate the License Agreement file, DLGLICE.TXT!!!
Please make sure that the License Agreement file is located
in the same path as DLGDIAG.EXE..."
But, it's right there on the disk. So, I can't even get the diagnostics to load.
If anyone has any ideas, I'd be most appreciative.
mdadm: /dev/sde does not appear to be an md device
& I tried:
root@RCH-SERVER:/dev# mdadm --detail md0
Quote:
mdadm: md device md0 does not appear to be active.
So, how do I make md0 active?
I'm off to work now, will be back in about 9-10 hours, and will continue addressing this issue.
Thank you very much for your thoughts. I'll read the link thoroughly after I get back, but upon first view, I'm not sure I can figure out what I need to know.
I'm having a similar problem, although one of my drives did crash. I get the same mdadm error saying only 2 drives are present (when I have 3/4) and the raid can't be assembled. I'll be watching this post to see if it can help me resolve my problem.
If you do, then run the diagnostic tool along with the command, and it will tell you which drive(s) are going.
No, it's a software raid, using mdadm. The problem is no longer knowing which HDDs have gone because they're both marked "state=unknown" (/dev/sdc1 & /dev/sde1).
The problem has become one of trying to get them manually mounted because they all appear to be spares.
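For anyone landing here with the same "assembled from 2 drives" message: a common first step is to read each member's md superblock with `mdadm --examine` and compare the Events counters, since members that dropped out earlier usually show a lower count. Below is a minimal sketch of that comparison. The function name member_events and the EXAMINE override are my own additions for illustration, and the parsing assumes the usual "Events : N" line in the examine output.

```shell
#!/bin/sh
# Print "device  events" for each member partition given as an argument.
# EXAMINE can be overridden so the function can be tried without root
# or real disks; by default it runs the real mdadm --examine.
EXAMINE="${EXAMINE:-mdadm --examine}"

member_events() {
    for dev in "$@"; do
        # Pull the value after "Events :" out of the superblock dump.
        ev=$($EXAMINE "$dev" 2>/dev/null |
             awk -F' : ' '/^ *Events :/ {print $2; exit}')
        printf '%s  %s\n' "$dev" "${ev:-no-superblock-read}"
    done
}
```

Usage would be something like `member_events /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sde1` as root, using whatever partition names fdisk currently reports. If the counts are close, `mdadm --assemble --force` is the option mdadm provides for accepting slightly stale members, but with a physically failing disk it is probably safer to image the disk first.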
Final update: The only solution, with the RAID5 array unmounted, was to connect just the OS HDD, the damaged drive, and a brand-new blank drive, download & install ddrescue, and then run through multiple copy passes to get every possible bit of data off the old/damaged drive. I copied from the damaged drive with it in every possible position, short of putting it in the freezer (freezing a drive can sometimes move things around inside just enough to get a good reading).
Once I had that done, I reassembled the RAID5 array, still unmounted, had it recreate the data (fdisk with an option, I believe), and then mounted it. After that, I copied ALL the data to a 1 TB drive in my XP box, and found that there were 3 pictures that would not copy because their bytes were set to ZERO. There were 3 other pics that had the exact same data. So, from what I can tell, out of 12,000 pictures PLUS a total of 250 GB of data, I lost less than 10 MB in six files. Not bad.
My choices were to use ddrescue or SpinRite, and ddrescue is a Linux tool, so that's the way I went.
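The exact ddrescue commands used aren't recorded in this thread, so the following is only a sketch of a typical two-pass copy, following the pattern in the GNU ddrescue manual. The device names and map file are placeholders, and the helper names rescue_plan and run are made up here. By default it only prints the commands; setting RUN=1 would actually execute them, which overwrites the destination disk.

```shell
#!/bin/sh
# Echo the command unless RUN=1, in which case execute it.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi
}

# Two-pass copy of a failing disk onto a blank one.
# $1 = source disk, $2 = destination disk, $3 = map (log) file.
rescue_plan() {
    src="$1"; dst="$2"; map="$3"
    # Pass 1: copy the easily readable areas, skipping the slow scraping phase (-n).
    run ddrescue -f -n "$src" "$dst" "$map"
    # Pass 2: retry the bad areas up to 3 times (-r3) using direct disc access (-d).
    run ddrescue -f -d -r3 "$src" "$dst" "$map"
}
```

For example, `rescue_plan /dev/SRC /dev/DST rescue.map` prints the two commands for review. Keeping the same map file across passes is what lets ddrescue resume and retry only the areas it has not yet recovered.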