I recently purchased a Dell Precision 530 workstation and installed the White Box Enterprise Linux 4 distribution (a RHEL4 variant). Since I have two SCSI drives, I installed the system using software RAID, and I'm also running LVM. Everything worked fine for approximately 6 weeks, and the system was rebooted several times without problems. Now, however, when I attempt to boot, the system runs through the SCSI adapter checking all of the LUNs (which is normal) and then I get the "error loading operating system" error. The only system changes have been the kernel modifications required for Oracle.
I am able to boot from the linux cd (linux rescue) and the operating system on the hard disk(s) is found. I do have a separate /boot partition, but can't remember if grub was required during the install. When the system was booting properly, I did get the option to choose which kernel I wanted to boot from, but I don't know if that was part of grub or just the OS.
I've googled this to death, but have mostly found issues related to dual-booting with XP. Since I'm using SCSI drives, RAID, and LVM, I haven't come close to finding issues related to my situation. I'm ready to re-install, but I'd really like to figure this out instead of caving in.
'Error loading operating system' is not a GRUB message. It is printed before GRUB ever runs, by the BIOS or by generic boot code in the MBR, which means GRUB is gone from your MBR. I think booting from the rescue disk and rewriting your MBR would solve your problem. Reinstalling the OS is really not necessary.
Ok. When I boot from the rescue cd and chroot /mnt/sysimage, I can see all of the files in /boot/grub and that all looks good: grub.conf is linked to menu.lst, etc. After noodling around some more, I found a similar issue which led me to the following commands:
Keep in mind that the system has only two SCSI drives, sda and sdb, and there are two RAID 1 devices. A 'df' shows that /boot is mounted on the RAID device /dev/md1. /dev/md0 appears to be the RAID device for all of the logical volumes that make up the rest of the system. I've tried to run 'grub-install' on /dev/md0, /dev/sda, /dev/sda1, etc., but it always returns with "/dev/md1 does not have any corresponding BIOS drive".
Is grub-install the correct command to use? I did not make a backup of the MBR as I've seen done in some of my findings. Not sure what else to try.
I tried to copy the MBR from another linux server of mine:
From good server:
Copied the MBR
# dd if=/dev/hda of=/tmp/hdambr bs=512 count=1
This created a backup of the MBR in /tmp/hdambr. I then copied this file over to the 'troubled' Linux server as /root/hdambr. First, I attempted to restore just the boot code, which is contained in the first 446 bytes of the MBR:
# dd if=/root/hdambr of=/dev/md0 bs=1 count=446
Then I tried to reboot, but it still didn't recognize the MBR. The partition table is located in bytes 447 - 510, and the last two bytes (511 and 512) hold the boot signature, 0x55 0xAA, which marks the sector as a valid MBR. So I then tried to copy just these two bytes over:
I tried rebooting, but nothing. I'm not sure this command actually put the two bytes (511, 512) in the correct positions on /dev/md0. The skip option tells dd how far to skip on input, but I'm not sure where on the "of" device (/dev/md0) the two bytes would have been written. I then tried copying over the entire 512 bytes of the MBR and updating the partition table, but no luck.
So I decided to re-install the OS. The installer recognized all of the RAID partitions and LVM volumes, so I decided against reformatting them, in hopes that laying down a new MBR would take care of everything. It didn't. I then reformatted both SCSI drives via the SCSI BIOS and re-installed the OS. Everything is working well now. (I should have formatted the drives before installing the OS the first time. Lesson learned.)
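On the question of where dd writes when only skip is given: skip moves the read position in the input, while seek moves the write position in the output. Without seek, the bytes land at offset 0 of the output. The sketch below shows this on two scratch files rather than on real devices; the names src and dst are made up for the demonstration, and conv=notrunc is there so the scratch output file is not truncated.

```shell
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Scratch stand-ins: "src" plays the saved MBR, "dst" plays the target.
dd if=/dev/zero of="$tmp/src" bs=512 count=1 2>/dev/null
dd if=/dev/zero of="$tmp/dst" bs=512 count=1 2>/dev/null
# Put the 0x55 0xAA signature at byte offset 510 of src.
printf '\125\252' | dd of="$tmp/src" bs=1 seek=510 conv=notrunc 2>/dev/null

# skip only: reads bytes at offset 510 of src, writes them at offset 0 of dst.
dd if="$tmp/src" of="$tmp/dst" bs=1 skip=510 count=2 conv=notrunc 2>/dev/null
skip_only_start=$(od -An -tx1 -N 2 "$tmp/dst" | tr -d ' \n')
skip_only_end=$(od -An -tx1 -j 510 -N 2 "$tmp/dst" | tr -d ' \n')
echo "skip only:     offset 0 = $skip_only_start, offset 510 = $skip_only_end"

# skip and seek: the same two bytes land at offset 510 of a fresh dst.
dd if=/dev/zero of="$tmp/dst" bs=512 count=1 2>/dev/null
dd if="$tmp/src" of="$tmp/dst" bs=1 skip=510 seek=510 count=2 conv=notrunc 2>/dev/null
seek_start=$(od -An -tx1 -N 2 "$tmp/dst" | tr -d ' \n')
seek_end=$(od -An -tx1 -j 510 -N 2 "$tmp/dst" | tr -d ' \n')
echo "skip and seek: offset 0 = $seek_start, offset 510 = $seek_end"
```

Note also that the BIOS reads the MBR from the first sector of the physical disk (for example /dev/sda), so a write to the start of /dev/md0 does not touch the sector the BIOS boots from.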
Check the partition table. See that it is correct. Windows operating systems require the "lba" flag.
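One way to check the partition table by hand is to look at a saved copy of the first sector. The following is only a sketch: mbr_summary is a made-up helper name, and the image it reads here is a scratch file built for the demonstration (one active entry of type 0xfd, Linux RAID autodetect), not data from the system in this thread. It relies on the standard MBR layout: four 16-byte entries starting at offset 446, with the status byte first and the type byte at offset 4 of each entry, followed by the two-byte signature at offset 510.

```shell
# Print the status and type bytes of the four primary partition entries
# in a saved 512-byte sector, plus the two-byte boot signature.
mbr_summary() {
    img=$1
    i=0
    while [ "$i" -lt 4 ]; do
        off=$((446 + i * 16))
        status=$(od -An -tx1 -j "$off" -N 1 "$img" | tr -d ' \n')
        ptype=$(od -An -tx1 -j $((off + 4)) -N 1 "$img" | tr -d ' \n')
        echo "entry $((i + 1)): status=0x$status type=0x$ptype"
        i=$((i + 1))
    done
    sig=$(od -An -tx1 -j 510 -N 2 "$img" | tr -d ' \n')
    echo "signature: $sig"
}

# Demonstration on a scratch image (hypothetical contents).
img=$(mktemp)
trap 'rm -f "$img"' EXIT
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null
printf '\200' | dd of="$img" bs=1 seek=446 conv=notrunc 2>/dev/null      # active flag
printf '\375' | dd of="$img" bs=1 seek=450 conv=notrunc 2>/dev/null      # type 0xfd
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null  # signature
mbr_summary "$img"
```

On a real system the image would come from something like the dd backup command shown earlier in the thread, pointed at the boot disk; 'fdisk -l' on the disk reports the same table in readable form.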
Once you reboot from your rescue disk, you can use a grub command to rewrite the master-boot-record (MBR). The message you're getting comes from the BIOS, which cannot find a bootable system. (Have you checked the BIOS settings to see where it is trying to boot from?)
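For a software RAID 1 setup like the one described here, a common approach is to skip grub-install and drive the GRUB shell directly, mapping each member disk to (hd0) in turn so that either disk can boot on its own. The sketch below assumes GRUB legacy (the version shipped with RHEL4-era systems) and assumes /boot is the first partition on each disk, which the thread does not confirm; adjust (hd0,0) if it is not. grub_mbr_commands is a made-up helper that only prints the commands, so nothing is written until the output is piped to grub.

```shell
# Build GRUB legacy shell commands that install GRUB into the MBR of each
# RAID 1 member disk. Assumption: /boot is partition 1 on every disk.
grub_mbr_commands() {
    for disk in "$@"; do
        printf 'device (hd0) %s\nroot (hd0,0)\nsetup (hd0)\n' "$disk"
    done
    printf 'quit\n'
}

# Dry run: print the commands for review.
grub_mbr_commands /dev/sda /dev/sdb

# To apply, as root inside the chroot from the rescue CD:
#   grub_mbr_commands /dev/sda /dev/sdb | grub --batch
```

Printing first and applying second keeps the destructive step explicit, which matters when the target is the MBR of a disk that still holds data.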