[SOLVED] RAID repair - now won't boot - other mounting problems
This could be related to /dev/md1, but it could also be that the system takes an inordinate amount of time to boot due to repeated errors reading the partition table of /dev/sdb. I wouldn't give up just yet.
However, if at any time an mdadm command was entered that could be interpreted as "resync the array using data from /dev/sdbx", all bets are off.
Is there no possibility for out-of-band management or remote access to this server?
This is a dedicated server from OVH. I have been using a netboot rescue mode to do everything so far. I just don't understand why I can't mount a drive like I used to be able to. What do you mean by console? The rescue mode is a netboot; that's all I get. I can boot other kernels over netboot as well, but the server never comes online.
Clearly, the file system has been damaged. Exactly how that happened is another matter.
When the 2nd drive failed, the md subsystem correctly booted /dev/sdb3 out of /dev/md3. Then you added it back, which is unfortunate, but by itself that shouldn't have caused data corruption. Unless, as I've said, something convinced md that /dev/sdb3 rather than /dev/sda3 contained the authoritative part of the mirror set.
What puzzles me is that md didn't also remove /dev/sdb1 from /dev/md1. OK, so the defective sectors of /dev/sdb may not have been located in the area occupied by /dev/sdb1, but then why the boot problems?
If you can boot the server using netboot-rescue (PXE?), you should run fsck on /dev/md1. If you can repair the boot partition, the server should boot as normal and you can proceed trying to fix /dev/md3.
OK, when I try to run fsck on md1 (the boot area) I get:
root@rescue:~# fsck -fc /dev/md1
fsck from util-linux-ng 2.17.2
fsck: fsck.swap: not found
fsck: Error 2 while executing fsck.swap for /dev/md1
Same for md3.
OK, this is potentially a major issue.
fsck tries to auto-detect the file system, and concludes that this must be a swap partition. I don't know how broken a filesystem has to be for fsck to reach that conclusion, but my guess is that the damage to the superblock must be pretty severe.
You need to think long and hard about whether you have something valuable on /dev/md1 or not, and if it can be recovered from backups. If you do and you don't have a backup, go no further. Otherwise, you should proceed as outlined below.
You can force fsck to treat a partition as containing a certain file system, and that's what you need to do here. If your filesystem was/is ext4, run fsck -t ext4 /dev/md1. Whatever you do, do not specify the wrong filesystem.
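Before forcing a type, it's worth asking blkid what it thinks is on the device, and running the check read-only (-n) before letting fsck write anything. Here is a minimal sketch of that pattern on a scratch image file rather than the real /dev/md1 (test.img is a made-up name; assumes e2fsprogs and util-linux are installed, and uses ext3 since that's what the partition was):

```shell
# Create a 16 MiB scratch image and format it ext3
# (-F: the target is a regular file, not a block device)
dd if=/dev/zero of=test.img bs=1M count=16 status=none
mke2fs -q -F -t ext3 test.img

# Ask what is actually on it before trusting fsck's auto-detection
blkid -o value -s TYPE test.img    # ext3

# Dry run first: -n answers "no" to every question, so nothing is written
fsck -t ext3 -n test.img

# Only then repair for real, forcing the type verified above
# (on the server this would be: fsck -t ext3 -y /dev/md1)
fsck -t ext3 -y test.img
```

On the real array, the -n dry run tells you how bad the damage looks before you commit to letting fsck modify anything.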
I think I might have declared the entire drive, or sda3 or sdb3, as a swap partition with some swapon -a command, but that still shouldn't wipe the data there, should it? It was an ext3 partition.
Are you kidding? How does one accidentally declare two data partitions as swap space?
mkswap most certainly writes data to the drive, potentially destroying the superblock. Fortunately for you, there are multiple copies of the superblock spread across the disk. Also, unless you also "accidentally" activated the partitions with swapon, nothing else was written to the partitions.
I don't think you have much to lose by doing an fsck on /dev/md1, but again, only you know what it contains (or used to contain).
Edit: I misread "swapon" as "mkswap". Sorry, your partitions are irreparably damaged. Time to dig out the backups.
Oh, and BTW: RAID is not a form of backup. It's an insurance against the inevitable failure of hard drives. It does not protect you from any other kind of potentially data-destroying event, of which there are legion.
I ran dumpe2fs /dev/md1 | grep superblock and got this output, which I assume is good.
Code:
root@rescue:~# dumpe2fs /dev/md1 | grep superblock
dumpe2fs 1.41.12 (17-May-2010)
Primary superblock at 0, Group descriptors at 1-3
Backup superblock at 32768, Group descriptors at 32769-32771
Backup superblock at 98304, Group descriptors at 98305-98307
Backup superblock at 163840, Group descriptors at 163841-163843
Backup superblock at 229376, Group descriptors at 229377-229379
Backup superblock at 294912, Group descriptors at 294913-294915
Backup superblock at 819200, Group descriptors at 819201-819203
Backup superblock at 884736, Group descriptors at 884737-884739
Backup superblock at 1605632, Group descriptors at 1605633-1605635
Backup superblock at 2654208, Group descriptors at 2654209-2654211
Backup superblock at 4096000, Group descriptors at 4096001-4096003
Backup superblock at 7962624, Group descriptors at 7962625-7962627
root@rescue:~#
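That list of backup superblocks is exactly what e2fsck's -b option consumes. A safe pattern is a read-only trial (-n) with a backup block first, then the real repair. A hedged sketch on a scratch image (test.img is a made-up name; a 1 KiB block size puts the first backup at the classic 8193, whereas the 4 KiB filesystem above puts it at 32768):

```shell
# Scratch ext3 image with a fixed 1 KiB block size, so the first backup
# superblock lands at the classic offset 8193 (with 4 KiB blocks, as on
# /dev/md1 above, it would be 32768 instead)
dd if=/dev/zero of=test.img bs=1M count=16 status=none
mke2fs -q -F -t ext3 -b 1024 test.img

# List primary and backup superblock locations, as dumpe2fs did above
dumpe2fs test.img | grep -i superblock

# Read-only trial using a backup superblock: -n answers "no" to everything
# (a nonzero exit status here is expected noise, since -n forbids the
#  rewrite of the primary superblock that -b normally triggers)
e2fsck -n -b 8193 test.img || true

# If the trial output looks sane, repair for real from that backup
# (on the server: e2fsck -b 32768 -y /dev/md1, or any other block number
#  dumpe2fs reported there; exit status 1 just means "errors corrected")
e2fsck -y -b 8193 test.img || true
```

The point of the -n pass is that you see what e2fsck would do with that backup superblock before anything is written to the array.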
Should I continue with that guide? What is next? I'm at my wits' end.
UPDATE: Following the list down that page, I used 819200 as my superblock number and now I can mount the drive (well, md1).
The command-line switches "-p" and "-y" make fsck.ext3 repair damage automatically, answering "yes" to all questions (note that e2fsck accepts only one of -p, -n, or -y at a time, so pick one). Needless to say, this is very dangerous.
Specify an alternate superblock with the -b parameter.
Edit: Like this: fsck.ext3 -b 7962624 -y /dev/md1
When I tried to run it I got the output below. Or should I first run the fsck.ext3 command you sent me, then cancel it after a bit?
Code:
root@rescue:/mnt# dumpe2fs /dev/md3 | grep superblock
dumpe2fs 1.41.12 (17-May-2010)
dumpe2fs: Bad magic number in super-block while trying to open /dev/md3
Couldn't find valid filesystem superblock.
root@rescue:/mnt#
Could I use this (sample commands from that guide)?
Quote:
$ dumpe2fs /dev/sda6
- got just the “Bad magic number in super-block while trying to open…” message
used
$ mke2fs -n /dev/sda6
got the 2nd super block same as in the example.
Used:
$ fsck -b 32768 /dev/sda6
to fix.
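That guide's approach should carry over: mke2fs -n is a dry run that only prints where the superblocks would go, without writing anything, provided you give it the same geometry (in particular the same block size) as the original mkfs. A sketch demonstrating on a scratch image that -n really leaves the bytes untouched (test.img is a made-up name):

```shell
# Scratch ext3 image with a fixed 1 KiB block size
dd if=/dev/zero of=test.img bs=1M count=16 status=none
mke2fs -q -F -t ext3 -b 1024 test.img

# Checksum before: mke2fs -n must not change a single byte
before=$(md5sum < test.img)

# Dry run: prints "Superblock backups stored on blocks: 8193" but writes
# nothing (on the server: mke2fs -n /dev/md3, with the SAME block size
# the original mkfs used, or the reported offsets will be wrong)
mke2fs -F -n -t ext3 -b 1024 test.img | grep -A1 'Superblock backups'

# Prove the image is untouched
after=$(md5sum < test.img)
[ "$before" = "$after" ] && echo "image untouched"
```

Once mke2fs -n gives you candidate block numbers for /dev/md3, you can feed them to e2fsck -b one at a time, as was done for md1.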