Old 06-22-2011, 12:10 PM   #1
fornax
LQ Newbie
 
Registered: Jun 2011
Posts: 15

Rep: Reputation: Disabled
Softraid with mdadm -- cannot access disk anymore


Hi,
seems like I killed my softraid.
First, my initial setup: 2x 1.5TB SATA disks -> RAID 1 with mdadm -> LVM -> multiple LVs as LUKS partitions.
This worked fine for a while, even though I made a bad mistake: I created the raid out of the whole
disks (instead of creating partitions on them first). I didn't notice because it worked...

Now one disk failed, but I could still access the other disk, which I moved to another server.
mdadm recognized it during boot; after vgscan --mknodes; vgchange -ay I saw the LUKS partitions in /dev/mapper/ and could open them via luksOpen and mount them.
That went well several times, and I stopped using the disk afterwards to avoid killing it as well.
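For reference, the sequence that worked each time was roughly this (a sketch from memory; the mapping name and mount point are just examples, the LV name is from my setup):
Code:
# activate the VG so the LV device nodes appear
vgscan --mknodes
vgchange -ay
# open one of the LUKS-encrypted LVs and mount it (raid_upload as an example)
cryptsetup luksOpen /dev/raid/raid_upload upload_crypt
mount /dev/mapper/upload_crypt /mnt/upload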

Just today, when I wanted to move the data to my new raid, this no longer works.

First of all, dmesg reports the wrong size (500 GB instead of 1.5 TB):
Code:
[    1.943127] sd 3:0:0:0: [sdc] 976817134 512-byte logical blocks: (500 GB/465 GiB)
[    1.943153] sd 3:0:0:0: [sdc] Write Protect is off
[    1.943155] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[    1.943178] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    1.949054]  sdc:
[    1.949219] sd 3:0:0:0: [sdc] Attached SCSI disk
[   15.048051] device-mapper: table: 254:1: sdc too small for target: start=384, len=1572864000, dev_size=976817134
[   15.056441] device-mapper: table: 254:2: sdc too small for target: start=1572864384, len=314572800, dev_size=976817134
[   15.064770] device-mapper: table: 254:3: sdc too small for target: start=1887437184, len=314572800, dev_size=976817134
[   15.073133] device-mapper: table: 254:4: sdc too small for target: start=2202009984, len=134217728, dev_size=976817134
[   15.081571] device-mapper: table: 254:5: sdc too small for target: start=2336227712, len=104857600, dev_size=976817134
[   15.090038] device-mapper: table: 254:6: sdc too small for target: start=2441085312, len=52428800, dev_size=976817134
[   15.098465] device-mapper: table: 254:7: sdc too small for target: start=2493514112, len=419430400, dev_size=976817134
[   30.014485] device-mapper: table: 254:1: sdc too small for target: start=384, len=1572864000, dev_size=976817134
[   30.022915] device-mapper: table: 254:2: sdc too small for target: start=1572864384, len=314572800, dev_size=976817134
[   30.031366] device-mapper: table: 254:3: sdc too small for target: start=1887437184, len=314572800, dev_size=976817134
[   30.039796] device-mapper: table: 254:4: sdc too small for target: start=2202009984, len=134217728, dev_size=976817134
[   30.048219] device-mapper: table: 254:5: sdc too small for target: start=2336227712, len=104857600, dev_size=976817134
[   30.056677] device-mapper: table: 254:6: sdc too small for target: start=2441085312, len=52428800, dev_size=976817134
[   30.065090] device-mapper: table: 254:7: sdc too small for target: start=2493514112, len=419430400, dev_size=976817134
mdadm does not set up anything.
Code:
root@auriga:~# mdadm --examine /dev/sdc
mdadm: No md superblock detected on /dev/sdc.
However, LVM recognizes all the LVs (which should be inside the raid!). For example, part of the lvdisplay output:
Code:
 --- Logical volume ---
  LV Name                /dev/raid/raid_upload
  VG Name                raid
  LV UUID                GbXFkZ-YY0M-XcTs-oYYZ-hm02-TEqN-xWGewe
  LV Write Access        read/write
  LV Status              suspended
  # open                 0
  LV Size                50.00 GiB
  Current LE             12800
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           254:5
Not surprisingly, when I create the nodes and try to open these LVs with cryptsetup, it fails with code 15.
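For reference, this is roughly how the mismatch can be seen from a shell (just an illustration, device name as in the dmesg output above):
Code:
# the LVs exist but stay suspended because their tables no longer fit on the shrunken disk
dmsetup info
# what the kernel currently thinks the disk provides, in 512-byte sectors
blockdev --getsz /dev/sdc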

What can I try to get my data back? Any help is greatly appreciated, since this IS my backup.
 
Old 06-22-2011, 01:09 PM   #2
never say never
Member
 
Registered: Sep 2009
Location: Indiana, USA
Distribution: SLES, SLED, OpenSuse, CentOS, ubuntu 10.10, OpenBSD, FreeBSD
Posts: 195

Rep: Reputation: 37
Not sure how much help I can be, but I do have a few questions.

1. Have you changed the hardware the drive is connected to (from when you could get mdadm to recognize the drive)?

2. Does the BIOS report the correct size for the drive?

3. Any other changes made between the time you could mount the drive and now?

I assume from your questions that there is no backup. If that is the case, my first course of action would be to clone the drive, either with dd or Clonezilla, so that you have a fallback.
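Something along these lines would do it (just a sketch; replace sdX/sdY with your real source and spare devices, check them with fdisk -l first, and the spare must be at least as large as the source):
Code:
# clone the suspect disk onto a spare before experimenting any further
dd if=/dev/sdX of=/dev/sdY bs=1M conv=noerror,sync
# GNU ddrescue handles read errors more gracefully, if it is installed:
# ddrescue -f /dev/sdX /dev/sdY rescue.log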
 
Old 06-22-2011, 01:33 PM   #3
fornax
LQ Newbie
 
Registered: Jun 2011
Posts: 15

Original Poster
Rep: Reputation: Disabled
1. No, exactly the same. The enclosure was switched off for a while. It doesn't make a difference whether it's switched on during boot or not.
2. Unfortunately the onboard controller BIOS screen goes by too fast to see anything, and the usual keys won't pause it either. The "regular" BIOS only reports the disk ID for the boot sequence.
3. Nothing that could have an impact, like a kernel update. The last time it worked was the last time I used the machine (I switched it off after that).

Do you know where init gets its information about the disk size? It's hard to find anything about this on Google.
Backing up the whole disk sounds like a good idea, but doesn't dd stop at the last block of the 500 GB?

Edit: I just tried another board; its BIOS reports 500 GB, too. Is there some way to change this (with hdparm, maybe)?

Last edited by fornax; 06-23-2011 at 05:53 AM. Reason: new info
 
Old 06-23-2011, 07:38 AM   #4
never say never
Member
 
Registered: Sep 2009
Location: Indiana, USA
Distribution: SLES, SLED, OpenSuse, CentOS, ubuntu 10.10, OpenBSD, FreeBSD
Posts: 195

Rep: Reputation: 37
I replied to this, but I don't see my reply here this morning, which is odd. Anyway...

Is this an external drive, or mounted in an external enclosure?

Connected by PATA, SATA, eSATA, or USB?

You are correct that you can't clone the drive until your system sees it correctly. I would look up your motherboard and find out how to access the BIOS. Another thing you could try is to see whether the drive manufacturer has a diagnostics image you can boot to check that the drive is in good shape (be careful: some of that software is not very careful about not overwriting anything on the drive).

I just saw your edit that another motherboard sees the drive as 500 GB too. I think you really need to go to the drive manufacturer and see if they have diagnostics software. That would be my next step.
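One more thing you could check from Linux is whether a Host Protected Area (HPA) is what is clipping the size. A rough sketch, assuming your hdparm is new enough to support -N and that sdc is still the clipped disk:
Code:
# show current vs. native max sectors; if the two numbers differ, an HPA is hiding part of the disk
hdparm -N /dev/sdc
# removing the HPA with "hdparm -N p<native_sector_count> /dev/sdc" is possible,
# but record all values first and only try it if you can afford to lose the disk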
 
Old 06-23-2011, 08:05 AM   #5
fornax
LQ Newbie
 
Registered: Jun 2011
Posts: 15

Original Poster
Rep: Reputation: Disabled
It's a Samsung HD154UI (both disks), internal SATA. I use those removable 3.5" drive trays; I don't know the correct English term, here they are called "Wechselrahmen".
I booted the Samsung EStool now; the normal check says "your hdd has no errors".
The "Drive Information" screen is interesting:
Current Size: 476961MB (LBA : 976817133)
Native Size : 1430799MB (LBA : 2930277168)

There is a menu where I can select "SET MAX ADDRESS" -> "RECOVER NATIVE SIZE".
That sounds like exactly what I need; should I give it a try? I have never messed with my disks this way, I'm a bit scared...

For completeness, here is the hdparm output of both disks (sdc is the really broken one, sdd is the one I'm trying to save):
Code:
root@leyanda:~# hdparm -g /dev/sdc
/dev/sdc:
 geometry      = 182401/255/63, sectors = 2930275055, start = 0
root@leyanda:~# hdparm -g /dev/sdd
/dev/sdd:
 geometry      = 60804/255/63, sectors = 976817134, start = 0
 
Old 06-23-2011, 09:55 AM   #6
never say never
Member
 
Registered: Sep 2009
Location: Indiana, USA
Distribution: SLES, SLED, OpenSuse, CentOS, ubuntu 10.10, OpenBSD, FreeBSD
Posts: 195

Rep: Reputation: 37
I agree that you probably need to "RECOVER NATIVE SIZE". However, I don't know exactly what that does. I don't know if it will clear the drive or simply re-assign the correct drive geometry.

Is the drive still under warranty? If so I would call Samsung support and ask them what they recommend.

How important is the data on the drive?

Do you still have the failed drive? If so, do you know why that drive failed? Have you tried to recover from that drive in any way? I have used drive recovery software (SpinRite) successfully on failed drives before, but I have also had it kill off drives with mechanical issues.

Since there is no backup: if the data is critical, I would speak to a data recovery company; if it's personal-use type stuff, I would try changing the drive geometry with the manufacturer's software, figuring that if it works I'm golden, and if not I didn't have the data anyway. MAKE SURE TO RECORD ALL VALUES BEFORE AND AFTER RUNNING THE DRIVE SOFTWARE.
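Something like this would capture everything worth keeping (just a sketch; adjust the device name, and smartctl needs the smartmontools package):
Code:
# save the drive's identity, geometry and SMART data before touching anything
hdparm -I /dev/sdX  > sdX-before.txt
hdparm -g /dev/sdX >> sdX-before.txt
smartctl -a /dev/sdX >> sdX-before.txt
# repeat into sdX-after.txt after running the vendor tool, then diff the two files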
 
Old 06-23-2011, 10:25 AM   #7
fornax
LQ Newbie
 
Registered: Jun 2011
Posts: 15

Original Poster
Rep: Reputation: Disabled
The data is important, but not important enough to justify a professional recovery.
I tried the native size recovery, and it did the trick!
At least I was able to mount a 750G LV from the disk and can copy the files off it; I think the other LVs will work as well.
The strange part: LVM detects the VG directly on the disk, even though the disk should be a softraid member. The md superblock is still missing, so it is not assembled into an md device.

Thanks a lot for your help! I owe you a beer or something...
 
Old 06-23-2011, 11:39 AM   #8
never say never
Member
 
Registered: Sep 2009
Location: Indiana, USA
Distribution: SLES, SLED, OpenSuse, CentOS, ubuntu 10.10, OpenBSD, FreeBSD
Posts: 195

Rep: Reputation: 37
Great! Glad you were able to recover your data.

I don't want to say this with certainty, but I think the reason you were able to mount the LVs is that, as you said, you assigned the entire drive to the RAID 1 array instead of partitioning it first. If the drive had been partitioned, or if it had been a different RAID level, the LVs created on it probably would not have mounted.
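If you are curious, you can even see why LVM finds it: the LVM label normally sits within the first few 512-byte sectors of the PV, and with the older md metadata formats the raid superblock lives at the end of the device, so with a whole-disk member the PV label is right at the start of the disk. A quick, read-only way to check (using sdd as an example):
Code:
# dump the first four sectors of the disk and look for the LVM2 label
dd if=/dev/sdd bs=512 count=4 2>/dev/null | strings | grep -a LABELONE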

Anyway, glad you got your data. Next step: a backup and recovery plan.

Just an FYI, I use two 2.5" drives to back up my data. I keep one drive at work (I used to keep it at my parents' place) and the other in a document safe at home. I rotate them every week or so, so even if my house burns down I am safe.
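For what it's worth, the copy itself is nothing fancy; roughly something like this (paths made up for illustration):
Code:
# mirror the data onto whichever backup drive is currently at home
rsync -a --delete /data/ /mnt/backupdrive/data/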

I learned many years ago that RAID (no matter what level) is not a backup solution.
 
  

