F18 kernel 3.8.2 upgrade hosed boot - dracut rescue mode only
I did the usual yum upgrade this morning. Upon reboot, I got as far as dracut's repair prompt. The journal ends with the same message that drops me into the dracut prompt - dracut can't find the root partition.
No system changes, previous upgrades worked, and I can still boot into 3.8.1 using the rest of the kernel upgrade packages. Multiple attempts at erasing and re-installing the kernel package led me to conclude that something changed in the kernel install script. My x86_64 system remains the same as when I installed F17: the first partition is /boot with grub, the second partition is an LVM volume group with separate / and /home logical volumes. Dracut can no longer find my root logical volume, the one with everything except /boot and /home, so there must be some assumption that /usr is somehow accessible from the grub boot partition.
The 'Fedora (3.8.1-201.fc18.x86_64)' and 'Fedora (3.8.2-206.fc18.x86_64)' sections are identical; insmod lvm is still in the prologue.
Is there a way to get dracut to find the LVM volume group containing the root partition?
Thank you for your help!
[The annoying 20-second delay getting a login prompt on a second tty is set aside for another day.]
Is this a normal "update" of the kernel,
or an "upgrade" from Fedora 17 to 18?
If an upgrade, then what did you use:
"preupgrade" or "fedup"?
Thank you for your help.
`yum -y upgrade` == normal update. I choose the 'upgrade' option to turn on the --obsoletes switch. It's usually the first chore of the day: log in as root on a tty console [default telinit 3], invoke the update, and if there are no changes, mount any additional drives/partitions I'll need, then log in to tty2 (after an annoying 20-sec wait) as user and launch XFCE4, then see what's on my todo list.
The list of updated packages for 11Mar13 was:
Upon reboot, the default kernel-3.8.1 entry was highlighted and the kernel-3.8.2 entry was above it. I usually boot into the new kernel and erase the previous one. My guess is that whatever script creates the initrd, after the file installs, /boot links, and grub update, neglected to include the root partition. There were no tools to view and mount any virtual volumes in the rescue mode, which was more focused on displaying the journal.
/sbin/new-kernel-pkg awks for the root device in /etc/fstab, where it usually finds /dev/mapper/vg_[`cat /etc/hostname | cut -d . -f 1`]-lv_root / ext4 ... It then passes --args="root=$rootdevice... to grubby. I thought dracut was the varmint, since that is the prompt where the boot process ceases, unable to find the root device, but after poking around the POSTIN script, I suspect /sbin/new-kernel-pkg may have changed. It has been finding the correct root partition for over a year. The new-kernel-pkg file was updated 1/4/2013, but the kernel upgrade to 3.8.1 last month went without a hitch. One of these days, I'll decompress and diff the two initrd files.
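The fstab lookup described above can be approximated in a few lines. This is only a sketch of the idea, not the code from the real script; `find_root_device` is a made-up helper name, and the optional file argument exists only so it can be tried on a copy of fstab.

```shell
# Sketch only: approximates the "awk for the root device in /etc/fstab" step.
# find_root_device is a hypothetical helper, not part of any Fedora package.
find_root_device() {
    fstab="${1:-/etc/fstab}"
    # Skip comment lines; print the device field of the entry mounted at "/".
    awk '$1 !~ /^#/ && $2 == "/" { print $1; exit }' "$fstab"
}

# Usage (prints something like /dev/mapper/vg_<hostname>-lv_root):
#   find_root_device
```

If this prints the wrong device, or nothing at all, the root= argument handed to grubby would presumably be wrong as well.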
Have you tried (from your 3.8.1 boot) running grub2-mkconfig -o /etc/grub2.cfg to create a new grub boot menu, ignoring the RH grubby tool's sometimes problematic intervention? I always do that after any kernel update. :)
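After regenerating the menu, it may be worth confirming what each kernel entry actually passes as root=. A minimal sketch, assuming the generated file is at /boot/grub2/grub.cfg (the usual Fedora BIOS location); `show_kernel_roots` is a hypothetical helper name.

```shell
# Sketch only: print "<kernel image> <root= argument>" for every
# linux/linux16/linuxefi line in a grub.cfg.
# show_kernel_roots is a hypothetical helper; the default path is an assumption.
show_kernel_roots() {
    cfg="${1:-/boot/grub2/grub.cfg}"
    awk '$1 ~ /^linux(16|efi)?$/ {
        root = "root=?"                      # shown when no root= is present
        for (i = 3; i <= NF; i++) if ($i ~ /^root=/) root = $i
        print $2, root
    }' "$cfg"
}

# Usage, as root:
#   grub2-mkconfig -o /boot/grub2/grub.cfg
#   show_kernel_roots
```

Both the 3.8.1 and 3.8.2 lines should show the same root= value if the menu entries really are identical.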
So you uninstalled or disabled, I am guessing, the "plymouth" GUI boot,
then use a text-only boot-up,
manually mount drives, instead of having udev do it on boot
(using udev rules or fstab?),
then manually launch XFCE (guessing xdm is set and XFCE is set as the default for "startx").
A lot of extra work.
/dev/sda1 /boot -- *
/dev/sda2 / -- LVM
You can try, as PTrenholme posted, rebuilding grub.cfg in grub2.
Fedora keeps three (it might have moved to five) old kernels as an emergency backup
-- and you WILL need them (a 99% guarantee) during the 13-month lifetime of a Fedora release
-- at least once.
My work is obviously different than other folks. I manually mount partitions because at any time I could have any of a dozen sata drives plugged in, depending on what I'm working on.
I don't want a gui boot. Root sometimes has other stuff to do in the console. Udev does not use the mount options I want.
My user login is also a gui. I have console chores to do there. Startx contains the "--nolisten tcp" server arg, don't know when some update is going to clobber that. Neither udev nor systemd can set up my HPC cluster via NFS shares when I need to do parallel programming.
After over a decade of habit, I invoked rpm -Uvh on the new kernel, now I can't get in and do anything. Grub2.cfg works just fine with both the old and new kernel menu entries identical (soft links in /boot helps). I erase the previous kernel if the current kernel boots and passes my test scripts. I usually roll my own, considering the gazillion modules. I don't want or need pcmcia, serial or parallel ports, samba, and all the other stuff that gets pulled in. Congenital indolence helps me to ignore all that.
I wrote my first program in FORTRAN circa 1966, kept it on a roll of yellow tape. In the late eighties I taunted the Mac users by telling them about how they trained a gorilla at the Yerkes Primate Lab to communicate by touching symbols on a screen. But I digress.
Principal Support Systems Analyst [retired]
University of Arizona
-- current fedora 18 update
bypassing the "yum" package manager and the normal update operations
Habits are hard to break, but yum triggers the scripts needed to configure the system for a new kernel and gets any needed prerequisites.
I would recommend using the standard Fedora 18 update repo.
For a manual rpm install:
-- you did blacklist "kernel", "kernel-devel", and "kernel-headers", along with every Xorg package, from being installed by the auto updates and yum?
-- you did read the rpm prerequisites needed for "kernel-3.8.2-206.fc18.x86_64.rpm" and download them also for manual install?
For no internet connection, on an isolated box:
find each and every prerequisite for the downloaded rpm and install them first
-- this does turn a 5 min. job into a 2 day job.
In case you do not know, Fedora updates the kernel and/or Xorg about once or twice a week.
I resorted to the rpm command only after uninstalling and installing the kernel package using yum. Please read the first message. I know how often updates are pushed to fc18, I check every day. This instance was the first time I've ever seen the dracut journal prompt. When I examined the files from that prompt, the only fs module was btrfs, no ext4. Lvm commands all yield not found (`lvm pvscan` `lvm lvscan` etc.).
The problem is systemd-udevd. Udev does not pick up the root logical volume. The boot command uses UUID to find the root partition. Suddenly, what worked before stopped working. I've made no changes to the system.
Oddly enough, I found a drive with F15 on it. Grub2 booted it without incident. /dev/mapper was populated. The only thing on this broken version that is in /dev/mapper is control. /sys shows seven loop devices, nothing else. Using a Linux From Scratch drive, /dev has a directory named vg_<hostname> that contains lv_root, lv_home, lv_swap, which all appear in /dev/mapper. There are about 40 devices in /dev/block. With systemd-udevd, none of those devices show up.
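For anyone landing at the same prompt, the /dev/mapper check described above can be scripted. A minimal sketch, assuming a POSIX shell is available in the rescue environment; `mapper_nodes` is a hypothetical helper name.

```shell
# Sketch only: list device-mapper nodes other than "control".
# mapper_nodes is a hypothetical helper; it defaults to /dev/mapper.
mapper_nodes() {
    dir="${1:-/dev/mapper}"
    for node in "$dir"/*; do
        [ -e "$node" ] || continue           # empty directory: glob did not match
        name=${node##*/}
        [ "$name" = control ] || printf '%s\n' "$name"
    done
}

# Usage: no output means no logical volumes have been activated.
#   mapper_nodes
# If the lvm binary did make it into the initramfs, these would be the
# next things to try by hand:
#   lvm vgscan
#   lvm vgchange -ay
#   lvm lvscan
```

On a healthy boot this should print names like vg_<hostname>-lv_root; on the broken one it prints nothing.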
Problem traced to IBK (Idiot Behind Keyboard). I deleted all of the custom files in /etc/udev/rules.d, /etc/modprobe.d, and /lib/modprobe.d. Et voilà! Also had to rebuild the rpmdb after yum and rpm got lost in the twilight zone upon invocation, granting me access to the keyboard only after I said the secret word: `ctl-z; kill -9 %1` Thank you everyone who tried to help me. You have mastered the art of patience when dealing with a complete idiot.
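A gentler variant of the same cleanup is to move the custom files aside instead of deleting them, so they can be put back one at a time to find the culprit. This is a sketch under assumptions: `set_aside` and the backup path are made up, and it should only be pointed at directories that hold local overrides, since it moves every regular file it finds.

```shell
# Sketch only: move every regular file from one directory into a backup
# directory. set_aside and the backup location are hypothetical.
set_aside() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for f in "$src"/*; do
        [ -f "$f" ] || continue              # skip subdirectories and empty globs
        mv -- "$f" "$dest"/ && echo "moved $f"
    done
}

# Usage, as root:
#   set_aside /etc/udev/rules.d /root/set-aside/udev-rules.d
#   set_aside /etc/modprobe.d   /root/set-aside/modprobe.d
```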
I can empathize with your desire to stick with what has worked for you. (I, too, wrote my first program on punched paper tape.) But there's some joy to be found in using new ways. Sometimes they're even better.
For example, the systemd replacement for init is really a vast improvement, but it took me a long time to find the systemadm tool that makes it understandable. (I was, in fact, writing a similar tool in gawk when I found it.)
Yet again, dracut can't load lvm when root is on a logical volume.
Obquote: "De gustibus non est disputandum" -(some Roman guy whose ancient works I had to recite in Latin class)
Personally, I don't turn on the computer, then sit there tapping the Enter key waiting for a prompt. I still don't know which parallel algorithm launches everything (openMP, MPI?). The kernel process table gets swamped, then each entry gets a time-slice in turn until they exit? Maybe the IO_WAIT states make it seem like things are zipping right along. Add 20 seconds to get a second console tty....
I have not come to praise systemd. I'm re-opening this thread because it happened again with today's kernel upgrade. Should the occasion arise, this is how I swatted that bug:
dracut --kver 3.8.5-201.fc18.x86_64 --force --show-modules --lvmconf --add lvm --add-drivers sata_nv initramfs-3.8.5-201.fc18.x86_64.img
You don't need --show-modules unless you want to make sure the right modules get tossed into the initramfs. Probably can get away with just "lvm" since that's the root. It's a mystery to me; lvm and the sata drivers are compiled into the kernel. You'd think those would get copied into the tmp/modules, but it just stops after loading btrfs and 7 loops. On the bright side, I didn't cross the palm of Red Hat with gold for the privilege of fixing their code.
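For the next kernel that does this, a small helper can print a pared-down version of that command for whatever version is needed. A sketch, assuming the image lives at /boot/initramfs-<version>.img; `dracut_rebuild_cmd` is a hypothetical name, and it only prints the command so it can be read before being run as root.

```shell
# Sketch only: print (do not run) a dracut command that rebuilds the
# initramfs for one kernel version with the lvm module forced in.
# dracut_rebuild_cmd is a hypothetical helper; the /boot path is an assumption.
dracut_rebuild_cmd() {
    kver="${1:-$(uname -r)}"
    printf 'dracut --force --add lvm --kver %s /boot/initramfs-%s.img\n' "$kver" "$kver"
}

# Usage:
#   dracut_rebuild_cmd 3.8.5-201.fc18.x86_64
```

Add --add-drivers back in if a storage driver such as sata_nv also turns out to be missing.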
If you want to see what's in a dracut generated initrd, the lsinitrd command can be quite useful.
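One way to use it is to pipe the listing through a small filter that reports which expected pieces are absent. A sketch only: `check_listing` is a hypothetical helper, and the strings to look for (ext4.ko, sbin/lvm) are guesses at what a working image would contain.

```shell
# Sketch only: read an lsinitrd listing on stdin and report, for each
# argument, whether that literal string appears anywhere in it.
# check_listing is a hypothetical helper name.
check_listing() {
    listing=$(cat)
    rc=0
    for want in "$@"; do
        if printf '%s\n' "$listing" | grep -qF -- "$want"; then
            echo "found: $want"
        else
            echo "MISSING: $want"
            rc=1
        fi
    done
    return "$rc"
}

# Usage:
#   lsinitrd /boot/initramfs-3.8.2-206.fc18.x86_64.img | check_listing ext4.ko sbin/lvm
```

Running it against both the 3.8.1 and 3.8.2 images is a quick stand-in for unpacking and diffing them.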
Did you check /etc/dracut.conf, /etc/dracut.conf.d/*, and /usr/lib/dracut/dracut.conf.d/* to see if you have, perhaps, excluded the lvm from the default dracut processing?
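A quick way to run that check is to grep those files for omit settings. A minimal sketch; `find_omits` is a hypothetical helper, and it only looks for the omit_dracutmodules and omit_drivers keys, so anything excluded some other way would not show up.

```shell
# Sketch only: print uncommented omit_dracutmodules / omit_drivers lines
# from the given dracut config files, prefixed with the file name.
# find_omits is a hypothetical helper name.
find_omits() {
    grep -HE '^[[:space:]]*omit_(dracutmodules|drivers)' "$@" 2>/dev/null
}

# Usage:
#   find_omits /etc/dracut.conf /etc/dracut.conf.d/*.conf \
#              /usr/lib/dracut/dracut.conf.d/*.conf
```

No output means nothing is being omitted through those two keys.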
Why on earth would I do that?! The same configuration has worked for over a year of kernel upgrades. Someone changed something that hosed what once worked effortlessly. If some goof changed something that required me to go edit some configuration files that used to work flawlessly, courtesy demands giving prior notice, or at least an RFC.
OK, I'll play once more. /etc/dracut.conf was created February 5, 2013. It has both mdadmconf and lvmconf set to yes. I didn't install it. As for rpmnew/rpmsave, been there; done that. It's a cron.weekly job. At the top of the page, you may notice that it was the kernel upgrade to 3.8.2 that presented the hosed initramfs. Here is the history since the date dracut.conf and dracut.conf.d/01-dist.conf were last changed:
2013FEB05 - the update installed kernel-3.7.5-201.fc18.x86_64; upon successful boot, I removed kernel-3.7.4-204.fc18.x86_64.
2013FEB09 - the update installed kernel-3.7.6-201.fc18.x86_64; upon successful boot, I removed kernel-3.7.5-201.fc18.x86_64.
2013FEB16 - the update installed kernel-3.7.7-201.fc18.x86_64; upon successful boot, I removed kernel-3.7.6-201.fc18.x86_64.
2013FEB18 - the update installed kernel-3.7.8-202.fc18.x86_64; upon successful boot, I removed kernel-3.7.7-201.fc18.x86_64.
2013FEB21 - the update installed kernel-3.7.9-201.fc18.x86_64; upon successful boot, I removed kernel-3.7.8-202.fc18.x86_64.
2013FEB29 - the update installed kernel-3.7.9-205.fc18.x86_64; upon successful boot, I removed kernel-3.7.9-201.fc18.x86_64.
2013MAR03 - the update installed kernel-3.8.1-201.fc18.x86_64; upon successful boot, I removed kernel-3.7.9-205.fc18.x86_64.
2013MAR11 - the update installed kernel-3.8.2-206.fc18.x86_64; no boota dis, stuck in dracut with only btrfs loaded, no ext4, no lvm, no mdadmconf, no joy.
I finished the day using a distribution that has not presented any boot errors since 1994.
2013MAR12 - using kernel-3.8.1-201 that I did not remove yet, since 3.8.2 didn't boot, I wrote the first entry of this thread. I didn't have time to do a complete post-mortem, I mistakenly blamed anything I could think of that would produce such errant behaviour.
Nota bene: the same dracut confs that had been working through seven prior kernel upgrades remained in situ, as did everything else in /etc. Nothing of consequence was installed or upgraded between Mar03 and Mar11, only some python and perl modules, yum, mesa, fftw, lapack, etc.
I cannot afford to devote any more non-billable hours to this matter.