Raid failure - Need help
Yesterday, after restarting the server, we made the horrifying discovery that our home folder had been rolled back by almost half a year!
We immediately contacted our server provider (Hetzner), and they told us that the RAID array md127 didn't start up correctly and that we should reload it manually. I wanted to get help from this forum since I am new to this whole subject and there is a high risk of losing all our data if this is approached the wrong way.
To our problem: the output of "cat /proc/mdstat": http://puu.sh/2dtXE (screenshot)
We can see that md127 is marked _U. As far as I understand, that means sda4 can't be loaded but sdb4 is loaded.
Taking a closer look at md127 with "mdadm -D /dev/md127": http://puu.sh/2du19 (screenshot)
We see that device number 0 has been removed and number 1 is running. I would also have given you /etc/raidtab, but for some reason it is missing on our root filesystem!
As mentioned above, I don't really know how to approach such a case: do I just reactivate the RAID with commands, copy it over to another one, or do we even have to get the disk swapped? I would be very thankful for any help and advice you can give me. I am scared of losing our important data, which is why I am asking here :S I hope you understand :(
Thanks in advance
ravand |
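For readers following along, the `_U` reading of `/proc/mdstat` described above can be sketched as a small filter. This is a minimal sketch, not output from the poster's server: the function name `degraded_arrays` is hypothetical, and the sample text is made up to mirror the situation (the screenshots are not reproduced here). It assumes the usual mdstat layout, where a status field such as `[UU]` or `[_U]` shows one character per member slot and `_` marks a missing member.

```shell
#!/bin/sh
# degraded_arrays: read /proc/mdstat-style text on stdin and print the
# name of every array whose member map (e.g. [_U]) contains a "_".
# The function name is hypothetical, chosen for this sketch.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/     { name = $1 }     # remember the current array name
        /\[[U_]*_[U_]*\]/ { print name }    # "_" marks a missing member slot
    '
}

# Made-up sample input, not the real server output.
sample='Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      12582784 blocks super 1.2 [2/2] [UU]

md127 : active (auto-read-only) raid1 sdb4[1]
      1839089920 blocks super 1.2 [2/1] [_U]

unused devices: <none>'

printf '%s\n' "$sample" | degraded_arrays   # prints: md127
# On a live system: degraded_arrays < /proc/mdstat
```

An array listed by this filter is running with fewer members than it was created with; it says nothing about which member holds the newer data.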
1. How about /etc/mdadm.conf?
2. What distro and version?
Code:
uname -a
This is not a good idea: if one disk goes bad, all RAID sets would be affected.
md0 = sda1, sdb1
md1 = sda2, sdb2
md2 = sda3, sdb3
and I suspect md3 should be = sda4, sdb4.
What you appear to have is that md3 has split into two single-disk RAID1 sets: md3 and md127. Can you check the conf file, or otherwise find out how the RAID sets were built, e.g. ask your provider? |
1. I didn't find mdadm.conf in /etc, but there is one in /etc/mdadm/; it doesn't say much, though :/
Quote:
2. The kernel is: Quote:
But I got it working with "lsb_release -a": Quote:
|
2. Actually, there's no '-' in my 'cat' cmd; it's deliberate so that it usually works on most distros.
3. It certainly looks like it, given the other RAID arrays and their numbering. That's why it's important to find out if they are two halves of the same RAID set, or if you've broken two sets. My money is on the former. Whoever built the sets should know... and you NEED to know before you try fixing anything. Incidentally, if you can avoid using those two and ideally unmount them, that should stop any further drift in content. |
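One way to check whether two partitions are halves of the same set is to compare the array UUID recorded in each member's superblock, as printed by `mdadm --examine`. The sketch below is hedged: `array_uuid` and `same_array` are hypothetical helper names, the sample text is invented, and the `Array UUID :` label is what version-1.x metadata prints (older metadata formats may label the field differently, which this sketch does not handle).

```shell
#!/bin/sh
# array_uuid: read `mdadm --examine <partition>` text on stdin and print
# the value of the "Array UUID" field (version-1.x metadata layout).
array_uuid() {
    sed -n 's/^[[:space:]]*Array UUID[[:space:]]*:[[:space:]]*//p'
}

# same_array: succeed if two examine dumps report the same, non-empty UUID.
same_array() {
    a=$(printf '%s\n' "$1" | array_uuid)
    b=$(printf '%s\n' "$2" | array_uuid)
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Invented sample dumps; real ones would come from, for example:
#   mdadm --examine /dev/sda4
#   mdadm --examine /dev/sdb4
dump_a='     Array UUID : 0a1b2c3d:4e5f6071:8293a4b5:c6d7e8f9
           Name : rescue:3'
dump_b='     Array UUID : 0a1b2c3d:4e5f6071:8293a4b5:c6d7e8f9
           Name : rescue:3'

if same_array "$dump_a" "$dump_b"; then
    echo "same array"
else
    echo "different arrays"
fi
```

`mdadm --examine` also prints an `Events` counter and an `Update Time` for each member; if the UUIDs match, differing values there may hint at which half is stale, but treat that as something to verify rather than a conclusion.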
Quote:
Do you have any other way of finding out what everything looked like before the incident, since the conf files don't tell us anything for some reason :/ Would the provider know?
EDIT: I might have found something that could support your assumption. When typing "mdadm --detail --scan >> /etc/mdadm/mdadm.conf" I get the following, plus one error message:
mdadm.conf: Quote:
Quote:
Also, here is a screenshot of the md3 details. Both md127 and md3 refer to the name "rescue:3"; do you think that is a hint of a split? http://puu.sh/2dxdd |
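Appending `mdadm --detail --scan` output with `>>` can leave duplicate `ARRAY` lines in the config if the command is run more than once. A cautious sketch is to print only the lines not already present and review them before appending anything. `new_array_lines` is a hypothetical helper, the file paths are temporary stand-ins, and the `ARRAY` lines are invented examples rather than this server's real output.

```shell
#!/bin/sh
# new_array_lines SCAN_FILE CONF_FILE
# Print lines from SCAN_FILE that do not already appear verbatim in CONF_FILE.
new_array_lines() {
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file
    grep -Fxv -f "$2" "$1"
}

tmp=$(mktemp -d)
# Invented stand-ins for `mdadm --detail --scan` output and the existing config.
cat > "$tmp/scan" <<'EOF'
ARRAY /dev/md/0 metadata=1.2 name=rescue:0 UUID=11111111:11111111:11111111:11111111
ARRAY /dev/md/3 metadata=1.2 name=rescue:3 UUID=33333333:33333333:33333333:33333333
EOF
cat > "$tmp/conf" <<'EOF'
ARRAY /dev/md/0 metadata=1.2 name=rescue:0 UUID=11111111:11111111:11111111:11111111
EOF

new_array_lines "$tmp/scan" "$tmp/conf"   # prints only the /dev/md/3 line
rm -rf "$tmp"
```

On a real system the scan file would be produced with `mdadm --detail --scan > some-temp-file`, so that nothing is written to the config until the new lines have been read.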
EDIT: Sorry, I didn't mean to spam. Either I or the webpage seemed to be lagging, so I may have accidentally clicked post several times.
|
I unmounted md127 to see what would happen and restarted the server. The md127 entry was gone, and so was the md127 file in /dev/. The /home directory is empty now.
Is that normal? Or did we screw up here? We also get this error: Quote:
|
1. Check the partitions again
Code:
cat /proc/mdstat
Code:
fdisk -l
3. Do ask your provider how they set it up.
4. Hope you have a backup. |
It seems md127 has reappeared after another restart, but the home directory is still empty.
1. I noticed that after typing the command, md3 and md127 both say "(auto-read-only)". What does that mean? Quote:
Quote:
Quote:
2. fdisk -l gives the following: Quote:
Quote:
4. Hmm... more or less. We had most of our backups in the home directory, which has been rolled back by 5 months (I honestly don't understand why 5 months), and we only have backups that are 2-3 months old on external devices. We are in a rather problematic situation.
EDIT: By the way, if we can't manage to get the RAIDs mounted again, do you know any way of extracting or dumping the contents of a RAID1 device to a directory, or isn't that possible? We are planning on reformatting the whole system IF we can get the files back. |
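On the question of getting files out: if an array can be assembled and mounted at all, one cautious approach is to mount it read-only and copy the tree elsewhere before changing anything. The sketch below only demonstrates the copy step on temporary directories; `copy_tree` is a hypothetical helper, and the mount point `/mnt/rescue` and target `/backup/home` in the comments are assumed names, not paths from this server.

```shell
#!/bin/sh
# copy_tree SRC DST: copy everything under SRC (including dotfiles) into DST,
# preserving permissions, timestamps and symlinks; ownership is preserved
# when run as root.
copy_tree() {
    mkdir -p "$2" && cp -a "$1"/. "$2"/
}

# Demonstration on throwaway directories.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/user/docs"
echo "important" > "$src/user/docs/note.txt"

copy_tree "$src" "$dst/home-copy"
cat "$dst/home-copy/user/docs/note.txt"   # prints: important
rm -rf "$src" "$dst"

# On the real machine the source would be a read-only mount, for example:
#   mount -o ro /dev/md3 /mnt/rescue
#   copy_tree /mnt/rescue /backup/home
```

Copying to a disk that is not part of either array keeps the rescued files safe if the arrays are later rebuilt or reformatted.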
Code:
mdadm -S /dev/md127 |
OK, I have done that; now I get the following:
Quote:
|
After editing fstab, do a
Code:
mount -a |
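Since `mount -a` only acts on what `/etc/fstab` lists, it can help to confirm which device the file names for a mount point before running it. This is a minimal sketch, assuming the standard whitespace-separated fstab columns (device, mount point, type, and so on); `fstab_device_for` is a hypothetical helper and the sample entries are invented, not this server's fstab.

```shell
#!/bin/sh
# fstab_device_for MOUNTPOINT: read fstab-style text on stdin and print the
# device field of every non-comment entry whose mount point matches.
fstab_device_for() {
    awk -v mp="$1" '
        /^[[:space:]]*#/    { next }        # skip comment lines
        NF >= 2 && $2 == mp { print $1 }    # column 2 is the mount point
    '
}

# Invented sample, not the poster's file.
sample='# <device>  <mount point>  <type>  <options>  <dump>  <pass>
/dev/md2   /       ext4   defaults   0 0
/dev/md3   /home   ext4   defaults   0 0'

printf '%s\n' "$sample" | fstab_device_for /home   # prints: /dev/md3
# On a live system: fstab_device_for /home < /etc/fstab
```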
This is what I get for mount -a:
Quote:
Here is the dmesg tail: Quote:
The md3 details give this: Quote:
|
try fsck /dev/md3
|