LinuxQuestions.org
Linux - Server This forum is for the discussion of Linux Software used in a server related context.

Old 07-28-2017, 04:48 AM   #1
bloupbloup
LQ Newbie
 
Registered: Jul 2017
Posts: 13

Rep: Reputation: Disabled
Server didn't boot - boot partition on RAID 1


My web hosting company set up two RAID arrays:
an mdadm RAID1 with the /boot copied across 4 partitions: sda2 / sdb2 / sdc2 / sdd2
an mdadm RAID 5 across 4 partitions: sda3 / sdb3 / sdc3 / sdd3

sda started to get more and more bad sectors, so I opened a ticket. My web hosting company hot-swapped the hard drive. Both arrays were rebuilt and ended up in a clean state.

Then I rebooted the server and the GRUB bootloader didn't start (nothing at all).

So I booted GRML (a Debian-based rescue system), went to /etc/fstab and replaced the UUID mapped to /boot with the boot partition of the second drive (/dev/sdb2 /boot). It made no difference: the server still didn't boot. I reverted to the original fstab.

Next I tried to reinstall GRUB with grub2-install and grub2-mkconfig, but it made no difference. It still didn't boot.

I asked the web hosting company to redo their clean install. Then I ran mdadm --fail /dev/md0 /dev/sda2 to simulate a failure of the boot partition on the failing hard drive. The server rebooted successfully.

Then I also ran mdadm --fail /dev/md0 /dev/sda3 to simulate the failure of the whole hard drive, and the server rebooted too.

Then I changed /etc/fstab once again to /dev/sdb2 /boot and it didn't boot. I guess that was because the partition is a Linux RAID member.

I wonder whether there was a problem with the initramfs after the hard drive was replaced. Does anybody have an idea? Is it possible that the UUID change of the hard drive had some bad consequences?

Do you think that RAID1 and boot don't get along well and that only hardware RAID is the right way to do it?

Last edited by bloupbloup; 08-02-2017 at 04:04 AM.
 
Old 07-28-2017, 05:21 AM   #2
TenTenths
Senior Member
 
Registered: Aug 2011
Location: Dublin
Distribution: Centos 5 / 6 / 7
Posts: 3,475

Rep: Reputation: 1553
Quote:
Originally Posted by bloupbloup View Post
Do you think that RAID1 and boot don't get along well and that only hardware RAID is the right way to do it?
I had similar problems with a hosting company. Personally, I don't use software RAID at any level. Proper hardware RAID for me every time.
 
1 members found this post helpful.
Old 07-28-2017, 05:35 AM   #3
bloupbloup
LQ Newbie
 
Registered: Jul 2017
Posts: 13

Original Poster
Rep: Reputation: Disabled
After it failed to boot, the hosting company didn't feel concerned, even though they had provided the setup. It showed that the 4 mirrored boot partitions were totally useless.
Plus, the benefit of LVM is nowhere to be seen.
 
Old 07-28-2017, 05:48 AM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,125

Rep: Reputation: 4120
I've never used a hosting company ... so value this how you will.

mdadm is for partitions - and is fine for the purpose. Arguably better than hardware RAID for commodity disk. Grub plays with it just fine - if configured properly.
The issue with (BIOS/MBR) disks is that the boot record in the MBR is not handled by the RAID replication. It has to be generated properly for each disk in the RAID1 set. Your provider appears not to have bothered - or tested.
And yes, the initrd also has to have enough smarts to continue booting with a degraded RAID. Not rocket science, but if grub isn't set up properly, the initrd never gets control.
 
Old 07-28-2017, 09:47 AM   #5
Pearlseattle
Member
 
Registered: Aug 2007
Location: Zurich, Switzerland
Distribution: Gentoo
Posts: 999

Rep: Reputation: 142
I do use a hosting company. I have always used RAID1 for the boot partition, and the server has always booted, even after multiple HDD failures.
Maybe I misunderstood something - in any case here are my thoughts:

1)
Quote:
mdadm RAID1 with the /boot copied across 4 partitions: sda2 / sdb2 / sdc2 / sdd2
You would not "copy" boot to those partitions.
You would just create e.g. a "/dev/md2" RAID1 (which uses sda2 / sdb2 / sdc2 / sdd2) and your "/dev/md2" would be mounted on the "/boot" directory => anything that you put in there (e.g. grub.cfg, grub libraries, kernel, initrd, etc...) would be mirrored automatically across all 4 HDDs.

2)
Quote:
i went to /etc/fstab and i replaced the UUID mapped to the boot by the second boot partition from the second drive...
In "/etc/fstab" you should not point to any HDD partition, but to the RAID1 that you have created and which is mounted on the "/boot" directory - in this example it would point to "/dev/md2".

3)
You would run "grub2-install" 4 times, once against each of the 4 disks (e.g. "grub2-install /dev/sda" + "grub2-install /dev/sdb" + ...) => this way each disk has an MBR and it won't matter which disk is chosen for the boot by the BIOS of the server.

4)
Not 100% sure if this is absolutely necessary, but I have always done it this way: A) compiled the RAID-modules in the kernel (not in initrd) and B) I set the grub config to load the kernel with the parameter "domdadm", which forces the kernel to search & assemble all RAIDs at a very early stage.

Cheers

EDIT:
You have to create at least the raid1 of the boot partition (don't remember if that's needed as well for the root part) using v0.90 metadata.
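
Point 3 above can be sketched as a small loop. This is only a minimal sketch, assuming a BIOS/MBR setup, the four disk names from the first post, and a distribution where the tool is called grub2-install (elsewhere it may be grub-install). DISKS, DRY_RUN and install_grub_on_all are names made up for this example; by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: install the GRUB boot code on every member disk of the /boot RAID1.
# DISKS and DRY_RUN are illustrative variables, not part of any tool.
DISKS="${DISKS:-/dev/sda /dev/sdb /dev/sdc /dev/sdd}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

install_grub_on_all() {
    for disk in $DISKS; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "grub2-install $disk"
        else
            # Stop at the first failure so a broken disk is noticed.
            grub2-install "$disk" || return 1
        fi
    done
}

install_grub_on_all
```

Run it once as-is to review the commands, then with DRY_RUN=0 as root. The same loop is worth repeating after any drive replacement, since the new drive starts without boot code.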

Last edited by Pearlseattle; 07-28-2017 at 09:51 AM.
 
1 members found this post helpful.
Old 07-29-2017, 02:30 PM   #6
bloupbloup
LQ Newbie
 
Registered: Jul 2017
Posts: 13

Original Poster
Rep: Reputation: Disabled
If somebody has just replaced a drive, I think they should reinstall GRUB2 before rebooting:
grub2-install /dev/*** (the replaced drive)
grub2-mkconfig -o /boot/grub2/grub.cfg
and also rebuild the initramfs image:
dracut -f

After a reboot it is more complicated, because from a rescue OS the arrays first have to be assembled and mounted, and then you have to work inside a chroot.
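
The rescue-OS route could look roughly like the sketch below. The device names are assumptions, not taken from the server in question: /dev/md0 for the /boot array, /dev/md1 for the root filesystem, and /mnt as the mount point. If root sits on LVM, the volume group has to be activated first and the logical volume mounted instead. BOOT_MD, ROOT_DEV, TARGET and print_chroot_plan are hypothetical names, and the script only prints the commands.

```shell
#!/bin/sh
# Sketch: print the commands needed to enter the installed system
# from a rescue OS. BOOT_MD, ROOT_DEV and TARGET are assumed names;
# adjust them to the real layout before running anything by hand.
BOOT_MD="${BOOT_MD:-/dev/md0}"
ROOT_DEV="${ROOT_DEV:-/dev/md1}"
TARGET="${TARGET:-/mnt}"

print_chroot_plan() {
    echo "mdadm --assemble --scan"
    echo "mount $ROOT_DEV $TARGET"
    echo "mount $BOOT_MD $TARGET/boot"
    # The chroot needs the kernel's virtual filesystems to be visible.
    for fs in dev proc sys; do
        echo "mount --bind /$fs $TARGET/$fs"
    done
    echo "chroot $TARGET /bin/bash"
}

print_chroot_plan
```

Once inside the chroot, the grub2-install, grub2-mkconfig and dracut commands behave as they would on the running system.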


1/ Check that GRUB is installed in the MBR. The following command reads the first sector:
dd bs=512 count=1 if=/dev/sda 2>/dev/null | strings
The output should contain GRUB. If a drive is missing GRUB, reinstall it with grub2-install /dev/***

2/ Check /boot/grub2/grub.cfg. This file contains the UUID of the array.
Compare those UUIDs with the output of blkid.
In blkid, you will find the same UUID for all the partitions that belong to the same array.
In grub.cfg, this UUID appears after the --hint parameter. The UUIDs should match. If the one in grub.cfg does not match, you can edit it, because if the UUID is wrong, your server will not boot.

A second UUID follows the --hint value: it is the UUID of the RAID array that is mounted on /boot.

3/ On top of that, the initramfs image in /boot contains an /etc/mdadm.conf with 2 UUIDs, which should match the /dev/md* arrays as reported by blkid.
You have to unpack the image to a folder to read mdadm.conf.

4/ In my case, I can see only 2 explanations for why it didn't boot: a) GRUB was not in the first sector of the drive and there was no boot sequence in the BIOS to try the other drives; b) the UUID of an array was changed somewhere after the drive replacement.
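
The check from step 1/ can be repeated over all member disks with a small helper. This is a sketch under assumptions: mbr_has_grub and check_disks are names invented here, and grep -a is used in place of strings so the script does not depend on binutils. It works on disk images as well as on real devices.

```shell
#!/bin/sh
# Sketch: report whether the first 512-byte sector of each given disk
# (or disk image) contains the string "GRUB".
mbr_has_grub() {
    # LC_ALL=C and -a make grep treat the binary sector as plain bytes.
    dd if="$1" bs=512 count=1 2>/dev/null | LC_ALL=C grep -aq 'GRUB'
}

check_disks() {
    for disk in "$@"; do
        if mbr_has_grub "$disk"; then
            echo "$disk: GRUB found in first sector"
        else
            echo "$disk: no GRUB in first sector"
        fi
    done
}

# Example (needs root for real disks):
#   check_disks /dev/sda /dev/sdb /dev/sdc /dev/sdd
```

A freshly swapped drive would be expected to show up as "no GRUB in first sector" until grub2-install has been run against it.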

Last edited by bloupbloup; 07-30-2017 at 07:08 AM.
 
Old 08-02-2017, 03:55 AM   #7
bloupbloup
LQ Newbie
 
Registered: Jul 2017
Posts: 13

Original Poster
Rep: Reputation: Disabled
solution

I have found the solution using a VM and it is really simple.

I just copied the first 512 bytes of the second drive (512 bytes = the MBR) to the first drive, and it booted even when the /boot partition of the first drive was deleted.
dd if=/dev/sdb of=/dev/sda bs=512 count=1
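
One detail about that dd: on an MBR disk the first sector holds 446 bytes of boot code, then the 64-byte partition table and a 2-byte signature. Copying all 512 bytes also overwrites the target's partition table, which is harmless only when both drives are partitioned identically. The sketch below copies just the boot code; copy_boot_code is a name made up for this example, and it can be tried on image files first. GRUB's boot code in the MBR loads a larger core image stored elsewhere on the disk, so on a brand-new drive grub2-install remains the more complete fix.

```shell
#!/bin/sh
# Sketch: copy only the 446 bytes of boot code from one disk (or image)
# to another, leaving the target's partition table and signature alone.
copy_boot_code() {
    src="$1"
    dst="$2"
    # conv=notrunc keeps the rest of a regular file intact;
    # it makes no difference for block devices.
    dd if="$src" of="$dst" bs=446 count=1 conv=notrunc 2>/dev/null
}

# Example on real disks (destructive, needs root):
#   copy_boot_code /dev/sdb /dev/sda
```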

Then, after rebooting into the system, I typed:
mdadm --add /dev/mdx /dev/sda2 (to re-add the boot partition to the array)
Then I added the drive to the RAID 5:
mdadm --add /dev/mdx /dev/sda3

After the sync:

grub2-install /dev/sda
and
grub2-mkconfig -o /boot/grub2/grub.cfg
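
To know when the sync has finished, the progress lines in /proc/mdstat can be checked. A small sketch, with md_sync_in_progress as a hypothetical helper name; it takes an optional file argument so it can be tried on a saved copy of mdstat. mdadm also has a built-in option for this, mdadm --wait /dev/mdx, which returns once the rebuild is done.

```shell
#!/bin/sh
# Sketch: tell whether any md array is still rebuilding, by looking for
# progress lines in an mdstat-style file (default: /proc/mdstat).
md_sync_in_progress() {
    grep -Eq '(recovery|resync|reshape) *=' "${1:-/proc/mdstat}"
}

# Example: wait until the rebuild is done before reinstalling GRUB.
#   while md_sync_in_progress; do sleep 30; done
#   grub2-install /dev/sda
```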


This shows how important it is to back up your MBR.

I think that the BIOS of many servers has no boot order covering all the installed SATA drives. The BIOS tries to boot from the first drive, but since there is no boot code in the MBR after the drive has been replaced, it does not work.

It is a mistake to modify fstab, because the UUIDs in fstab aren't the UUIDs of the drives but the UUIDs of the RAID arrays.
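
To illustrate that last point: the /boot entry in fstab should name the array (by UUID or as /dev/mdX), never a member partition such as /dev/sdb2. The sketch below prints what an fstab file mounts on /boot and flags a raw partition. boot_source and check_boot_entry are hypothetical helper names, and the /dev/sd* pattern is an assumption that only covers SATA/SCSI-style device names.

```shell
#!/bin/sh
# Sketch: show which device an fstab file mounts on /boot and warn if it
# looks like a raw disk partition instead of the RAID array.
boot_source() {
    # Field 1 is the device, field 2 the mount point; skip comment lines.
    awk '$1 !~ /^#/ && $2 == "/boot" { print $1 }' "$1"
}

check_boot_entry() {
    src="$(boot_source "$1")"
    case "$src" in
        /dev/sd*) echo "warning: /boot points at member partition $src" ;;
        "")       echo "no /boot entry found" ;;
        *)        echo "ok: /boot is mounted from $src" ;;
    esac
}

# Example:
#   check_boot_entry /etc/fstab
```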

Last edited by bloupbloup; 08-02-2017 at 04:07 AM.
 
Old 08-02-2017, 04:36 AM   #8
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,125

Rep: Reputation: 4120
To truly test your theory, you need to zero out /boot, not merely delete it.
 
  


All times are GMT -5. The time now is 11:50 AM.
