LinuxQuestions.org - Cloned drive fails to boot

- Red Hat (https://www.linuxquestions.org/questions/red-hat-31/)

- - Cloned drive fails to boot (https://www.linuxquestions.org/questions/red-hat-31/cloned-drive-fails-to-boot-938299/)

njstol

04-05-2012 10:04 AM

Cloned drive fails to boot

Greetings, I'm new here.....:cool:

I recently cloned a failing hard drive using systemrescueCD. It was a RH 5.3 installation on a Dell T3500. Since the computer was still within its 3-year warranty, Dell sent me a drive of the same size. After using ddrescue to copy the old drive to the new one, I compared the two in GParted: the block start/stop/sizes, descriptions, etc. of the 5 partitions are all identical (except that the new drive is slightly bigger, leaving some unallocated blocks at the end). I can mount the individual partitions, including VolGroup00 where my data resides, and everything looks perfect from that point of view. But the cloned drive does not boot!
Here's what happens when I power up...

GRUB and the boot process do start normally, and I get as far as the line

Red Hat nash version 5.1.19.6 starting

After about 20 seconds it spits out these errors:

INIT: cannot execute "/etc/X11/prefdm"
INIT: Id "x" respawning too fast: disabling for 5 minutes
(this line repeated multiple times)

and sits there... it repeats these lines if I let it sit for 5 minutes. If I hit the Enter key a few times, I get the following login prompt, which doesn't do anything:

RedHat Enterprise Client 5.3 (Tikanga)
Kernel 2.6.18-128.el5 on an x86_64
(none) login:

i.e., I can type in root, but it doesn't ask for a password; it just returns the same login line.

If I interrupt the initial boot to get the GRUB menu and append a 1 to the kernel line to start in single-user mode, I get the following, ending in these errors:

Red Hat nash version 5.1.19.6 starting
Reading all physical volumes. This may take awhile....
Found volume group "VolGroup00" using metadata type lvm2
2 logical volume(s) in volume group "VolGroup00" now active

INIT: cannot execute /etc/rc.d/rc.sysinit
INIT: entering runlevel 1
INIT: cannot execute /etc/rc.d/rc
INIT: no more processes left in this runlevel

(hitting enter in this case does not give me the login line)

Any ideas what's going on? :scratch:

jefro

04-05-2012 08:06 PM

Well, I can't say this is a great idea: "cloned a failing hard drive". What would you expect, given that data? You really ought to work with known good data.

My guess is your system was set up to identify the drives by some other means. You may think of the disk as /dev/sda or such, but parts of the system like the boot configuration or GRUB may refer to it as maxor-388239;00sd09f0-r or some similar disk-by-id name. (I know, bad example.)

If you have GRUB, then boot and see what it says when you try to edit the kernel line.
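One way to check this from a rescue CD is to look at how the clone's own config files name their devices. The following is only a rough sketch: `classify_fstab` and `grub_root_args` are made-up helper names, and the paths in the usage note assume the clone's root filesystem is mounted at /mnt/sysimage and its /boot partition at /mnt/boot, which may not match your setup.

```shell
#!/bin/sh
# classify_fstab FILE: print "<kind> <device> <mountpoint>" for each fstab entry,
# so entries tied to a specific disk (label, UUID, by-id path) stand out.
classify_fstab() {
    awk '
        /^[ \t]*#/ || NF < 2 { next }    # skip comments and blank lines
        {
            if      ($1 ~ "^LABEL=")        kind = "label"
            else if ($1 ~ "^UUID=")         kind = "uuid"
            else if ($1 ~ "^/dev/disk/by-") kind = "disk-by-x"
            else if ($1 ~ "^/dev/mapper/" || $1 ~ "^/dev/[^/]+/[^/]+$") kind = "lvm"
            else if ($1 ~ "^/dev/")         kind = "raw-device"
            else                            kind = "other"
            print kind, $1, $2
        }
    ' "$1"
}

# grub_root_args FILE: print the value of every root= kernel argument in a
# GRUB legacy config (lines such as "root (hd0,0)" are ignored).
grub_root_args() {
    awk '
        /^[ \t]*#/ { next }
        { for (i = 1; i <= NF; i++) if ($i ~ "^root=") print substr($i, 6) }
    ' "$1"
}
```

For example, `classify_fstab /mnt/sysimage/etc/fstab` and `grub_root_args /mnt/boot/grub/grub.conf`. Entries reported as `lvm` or `label` should survive a clone unchanged; anything reported as `disk-by-x` would point at the old physical drive.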

syg00

04-05-2012 08:46 PM

Well it's obviously booting, but have you fsck'd the filesystem(s)?
Absolutely a requirement after ddrescue. Do the fsck from a liveCD - one that has LVM support built in.

PTrenholme

04-05-2012 10:46 PM

You might try entering run level 3 instead of 1. In single user mode, the Unix System Resource (/usr) directory may not be mounted, so the programs in there may not be available. (That, I think, is not the case if you don't have /usr in its own partition, but, in any case, level 3 - if you know your "root" password - is usually easier to work with.)

A word of caution: I once managed to kill a new Fedora installation by running fsck on the physical partition that contained the logical volume group that the Fedora project had decided was a "good idea" for the new release. If you want to run fsck on a LV, run fsck /dev/mapper/VG-LV (or /dev/VG/LV), and (since you have the System Rescue CD available) run it from there after activating the LV, rather than trying to do it from within the cloned drive. (I thought that the documentation for ddrescue cautioned that fsck should be run on the image created. :scratch:)
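As a rough sketch of that sequence from a rescue CD, using the VolGroup00/LogVol00 names from this thread: the `run` and `check_lv` helpers are made up for illustration, and the script only prints the commands unless DRY_RUN=0 is set. The filesystem must not be mounted while fsck runs, and the swap LV should not be fsck'd at all.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 (as root, booted from a rescue/live CD) to really run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# check_lv VG LV: activate the volume group, then check the logical volume.
check_lv() {
    vg=$1
    lv=$2
    run vgscan                   # look for volume groups on all disks
    run vgchange -ay "$vg"       # activate the VG so its LVs appear under /dev
    run lvscan                   # list the LVs and their state
    run fsck -f "/dev/$vg/$lv"   # check the LV itself, never the raw PV partition
}

check_lv VolGroup00 LogVol00
```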

Note: After using the VG/LV defaults for several Fedora releases, I now just put each new Fedora release in a single partition. (I have a simple mirrored two drive system, so, for my needs, the LV program is real overkill. I understand that Red Hat wants to test the VG/LV stuff on as many different systems as possible, but I'm having enough "fun" using btrfs for one of my Fedora partitions.)

njstol

04-06-2012 08:52 AM

Thanks for the replies!

Indeed I did fsck all partitions (while in systemrescue), and the linux partition (/dev/sda3), if I remember correctly, could only be checked as /dev/VolGroup00/LogVol00... and it did find many errors on that partition. GRUB defines root=/dev/VolGroup00/LogVol00, which is how it is defined on another identical computer.

In any event, I really do need to get this computer operational, since it's the host for an important instrument here in Chemistry where I work (and that is the reason for cloning a failing drive in the first place: to recover not only the data, but the instrument's software package and parameters as well), and I have spent too much downtime trying to get this going. I do have a DVD of the ISO for RHEL5.4, which I think I'm going to give a shot. I'm hoping it will only do an upgrade of the system (it does say "to install or upgrade" as a choice when started), both repairing it and leaving the important directories intact. If not, I was able to make tar backups of various directories on the clone's /dev/VolGroup00/LogVol00 partition, which all look fine and dandy, so I can get back all my parameters in the event 5.4 does a clean install instead. I'll let you know the outcome....
Thanks again!

njstol

04-06-2012 10:40 AM

OK, so I took the plunge and started the RHEL5.4 installation-

The RHEL 5.4 installation does indeed give me the choice to only upgrade, and it found the 5.3 version and started, but quickly gave me the error:

Error enabling swap device VolGroup00/LogVol01: invalid argument
The /etc/fstab on your upgrade partition does not reference a valid swap partition

Upon OK'ing the error message the system rebooted, and I entered rescue mode and checked the fstab, which specifies on its last line:

/dev/VolGroup00/LogVol01 swap swap defaults 0 0

and the fstab entries are the same on my other system running RHEL5.3.
So is the problem that I don't have LV01 set up properly?
As noted above, when I attempted to boot into single-user mode, it did find two LVs in VG00, so do I need to somehow define my swap space?

PTrenholme

04-06-2012 01:32 PM

You may not need any swap space. If you're not running lots of processes and have a reasonable amount of memory, swap is seldom even used. And, if it is used, your system will not be very responsive. In any case, if the swap space is allocated but unusable, just reformat it as swap space again: it's very unlikely that the old swapped pages - if any - would be of any use in a freshly installed system. :)

On my system, with 4 64-bit processors, 12GB RAM and 3TB in two HDs (an HP p6710f I got for ~$500), I allocated 24GB as swap (the customary "twice RAM" amount), but the swap space has never been used except for one time I (inadvertently) hibernated the system.

njstol

04-06-2012 02:03 PM

I did an lvscan and it found two active LVs:
ACTIVE '/dev/VG00/LV00' [222.66GB] inherit
ACTIVE '/dev/VG00/LV01' [3.91GB] inherit

I'm thinking that 4GB is awfully large, and that maybe LV01 was actually assigned to the excess space at the end of the drive, which I think is unallocated.

So I did the following:

# swapoff /dev/VG00/LV01   (turns out swap was already off)
# lvresize -L 1000M /dev/VG00/LV01   (resizes to 1 GB... maybe too small after all?)
# mkswap /dev/VG00/LV01
# swapon /dev/VG00/LV01
# cat /proc/swaps; free   (which confirmed it was there and unused)
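The last check can also be scripted. This is a minimal sketch, not anything the installer does: `resolve` and `swap_active` are made-up helper names. It compares resolved paths because /proc/swaps may list an LV under its /dev/mapper name rather than the /dev/VG/LV name that was passed to swapon.

```shell
#!/bin/sh
# resolve PATH: follow symlinks where possible, else return PATH unchanged.
resolve() {
    readlink -f "$1" 2>/dev/null || printf '%s\n' "$1"
}

# swap_active DEVICE [FILE]: succeed if DEVICE is listed as active swap.
# FILE defaults to /proc/swaps; a different file can be passed for testing.
swap_active() {
    want=$(resolve "$1")
    while read -r name _rest; do
        [ "$name" = "Filename" ] && continue     # skip the header line
        if [ "$(resolve "$name")" = "$want" ]; then
            return 0
        fi
    done < "${2:-/proc/swaps}"
    return 1
}
```

For example: `swap_active /dev/VolGroup00/LogVol01 && echo "swap is on"`.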

Rebooted and voilà - it made it past this point and is currently upgrading (although it is taking its time reloading the bootloader)

Edit: damn, it seems to be stuck on loading the same file now for 10 min....not good :mad:

njstol

04-06-2012 02:49 PM

30 minutes have gone by and it is still supposedly installing the same 451kb file (rsyslog-2.0.6-1.el5.x86_64)
Not good!

Is there any way to abort the installation, short of pushing the power button? (I hate to think what that might do.)

EDIT: well... I powered off the computer, rebooted, went back into the upgrade option and tried again, this time choosing the second bootloader option, which leaves the bootloader alone (instead of the recommended choice of updating it). It loaded the image to disk, installed a bunch of files and paused briefly at the above file, but then proceeded, claimed the upgrade was successful and told me to reboot. While it did so I noticed a message about an error and a failed install, which scrolled off the top of the screen quickly, and the system rebooted into the old 5.3 version and stopped exactly as in my very first post above, as if nothing had changed.

I did check the upgrade.log generated in /root and it was 5MB of errors such as:
error:db 4 error (-30977) from db->get: DB_RUNRECOVERY: Fatal error, run database recovery
error: error (-30977) getting "gshadow" records from Basenames index
and so on, over and over, probably for every file possible!

SO it looks like I'm going to spend my Saturday doing the clean install of 5.4.....that should work, right?
:banghead:
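DB_RUNRECOVERY errors like these usually mean the RPM database under /var/lib/rpm is damaged. Something that is often tried before giving up on an upgrade is rebuilding that database from rescue mode. The sketch below is only that, a sketch: it assumes the installed system is mounted at /mnt/sysimage, the backup path is arbitrary, `run` and `rebuild_rpmdb` are made-up helper names, and with a filesystem that fsck found many errors on there is no guarantee it would have helped here. It prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# rebuild_rpmdb ROOT: back up and rebuild the RPM database of the system at ROOT.
rebuild_rpmdb() {
    root=$1
    backup=/tmp/rpmdb-backup.tar.gz                 # arbitrary backup location
    run tar -czf "$backup" -C "$root/var/lib" rpm   # keep a copy before touching anything
    run rm -f "$root"/var/lib/rpm/__db.*            # stale Berkeley DB environment files
    run rpm --root "$root" --rebuilddb              # regenerate the indexes from the Packages file
    run rpm --root "$root" -q kernel                # quick sanity query afterwards
}

rebuild_rpmdb /mnt/sysimage
```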

jefro

04-06-2012 07:52 PM

INIT: cannot execute /etc/rc.d/rc.sysinit

This tells me some setting some place can't find that file. It is not booting at all.
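A quick way to test that theory from a rescue CD is to look for the files init complained about on the mounted clone. This is a minimal sketch: `check_init_files` is a made-up helper, and /bin/bash and /bin/sh are added to the list only on the assumption that "cannot execute" can also mean the script's interpreter is missing or damaged.

```shell
#!/bin/sh
# check_init_files ROOT: report whether the files named in the INIT errors
# exist and are executable under ROOT (the mounted root filesystem).
# Returns non-zero if any of them is missing or not executable.
check_init_files() {
    root=$1
    status=0
    for f in /etc/rc.d/rc.sysinit /etc/rc.d/rc /etc/X11/prefdm /bin/bash /bin/sh; do
        if [ -x "$root$f" ]; then
            echo "ok $f"
        elif [ -e "$root$f" ]; then
            echo "not-executable $f"
            status=1
        else
            echo "missing $f"
            status=1
        fi
    done
    return $status
}
```

For example, with the clone's root LV mounted at /mnt/sysimage: `check_init_files /mnt/sysimage`.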

njstol

04-11-2012 09:11 PM

Quote:

Originally Posted by njstol (Post 4646666)

SO it looks like I'm going to spend my Saturday doing the clean install of 5.4.....that should work, right?
:banghead:

The clean install of RHEL 5.4 on the linux partition went fine and got the computer end of the system up and running.
I tar'd back the /home directory from the old drive, found the old passwd file and recreated all my user accounts with the same UIDs, so their old data is all recognized as their own. The system software installation went easier than expected: with a symbolic link to the original software directory in place, the important files were copied over when I reinstalled the new software into a different directory. So I'm up and running. Thanks to all who gave their suggestions here.
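For anyone repeating the account step, here is a rough sketch of one way to do it: generate useradd commands from the old passwd file, review them, then run them as root. `print_useradd` is a made-up helper; the default cut-off of 500 assumes the RHEL 5 convention that regular accounts start at UID 500, and groups are not handled (they would need recreating from the old group file). Comment fields containing double quotes would need hand editing.

```shell
#!/bin/sh
# print_useradd PASSWD [MIN_UID]: print one useradd command per regular
# account in an old passwd file, preserving UID, home directory and shell.
print_useradd() {
    awk -F: -v min="${2:-500}" '
        # Skip system accounts below MIN_UID and nobody-style accounts at the top.
        $3 >= min && $3 < 65534 {
            printf "useradd -u %s -d %s -s %s -c \"%s\" %s\n", $3, $6, $7, $5, $1
        }
    ' "$1"
}
```

For example, `print_useradd /mnt/old/etc/passwd > recreate-users.sh`, where /mnt/old is wherever the old drive's root filesystem happens to be mounted.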

Now on to fixing another boot problem with a different computer, which was actually caused (indirectly) by this problem.
I guess I'll post that elsewhere in this forum, as it's probably not a Red Hat-specific problem.

All times are GMT -5.