ramdisk corrupted, only after booting with it (RHEL53)
I'm creating a ramdisk image (of a small custom root filesystem with
Red Hat Enterprise Linux 5.3) using genext2fs. I'm net booting that
image using PXE and using it as a permanent file system (rather than as
a temporary one; it never pivots to a hard drive boot).
That's been working fine for months (I thought), but I just found out that the ramdisk, after booting, has a small amount of corruption on it.
When I run fsck it shows about 10 i_block errors, one i_size error, and a multiply-referenced block. It's able to repair the errors, but they will be there again every time the ramdisk image reboots.
I've verified that the ramdisk image itself is not corrupt by mounting it and testing it. It always looks fine -- no corruption. But when I use it to boot, when the boot is finished the mounted root file system is corrupt (again, not totally corrupt, just containing these 12 or so repairable errors). This is true when booting normally via PXE, AND I also booted the image from the local disk using GRUB just to rule out any problems with the network boot (since the boot host is Solaris).
Either way, if it boots, it's corrupt. If it's mounted without boot (using mount -o loop), it's fine. So it seems there is something about the RHEL53 boot process that is causing the corruption.
This occurs on multiple machines, so I don't think it's hardware.
Here are the parameters I'm using for the PXE boot in pxelinux.cfg/default:
Could it be as simple as a mismatch between the ramdisk_size kernel argument and the actual size of the ramdisk? What is the size of the file? How did you arrive at 256000 as the specified size? Are there any undesirable symptoms, or is it benign, so far?
Thanks for the suggestion. The size of the ramdisk (262144000 bytes) matches the kernel argument (256000 KB).
Maybe it would behave better if I make it a round 256 MB. I can try that later today.
There are few symptoms yet (just a screenful of error messages when my script calls ldconfig, but the ldconfig works okay), but having corruption doesn't bode well for the future (this is going to be running embedded and deployed far afield where I can't keep watch on it).
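For anyone checking the same arithmetic: ramdisk_size is given in KiB (units of 1024 bytes), so the numbers above can be reproduced with a few lines of shell. This is only a sketch; the helper name bytes_to_kib is made up for illustration, and it rounds up so that a trailing partial KiB would still be covered.

```shell
#!/bin/sh
# Hypothetical helper: convert a size in bytes to the KiB value that
# ramdisk_size expects, rounding up to the next whole KiB.
bytes_to_kib() {
    echo $(( ($1 + 1023) / 1024 ))
}

# The image size reported in the post.
bytes_to_kib 262144000

# A "round" 256 MiB image, for comparison.
bytes_to_kib $(( 256 * 1024 * 1024 ))
```

The first call prints 256000, matching the kernel argument in use, and the second prints 262144, which is the value a round 256 MiB image would need. For a real image the byte count could come from something like `wc -c < rootfs.img`, where rootfs.img is a placeholder name.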
I'm not sure of the ramifications (pun somewhat intended) of not supplying a ramdisk_size argument, but I know that it is not always supplied. Perhaps try booting the kernel without that argument. I think it may be a matter of efficiency for the kernel to be advised of the size, but it will still figure out the size on its own, and perhaps accurately enough that the errors go away (if it is indeed the problem).
Is it a custom ramdisk, or one provided with the distribution? If custom, how was it created? Do you have the possibility of loading the ramdisk & kernel with a different bootloader? Maybe even a different version of the PXE bootloader? My thinking is that since it is the bootloader that 'installs' the ramdisk, it may be part of the problem.
I tried removing the ramdisk_size from the pxelinux.cfg/default file, and it was not happy. Massive errors, trying to read past end of disk, and a kernel panic. So I put that back in.
We created the custom ramdisk with a long script that copies the essential pieces from a full install into a working directory, deletes unneeded stuff, adds in needed /usr/ files, creates the users and a bunch of other stuff, then runs genext2fs to make the working directory into a filesystem image.
I did try booting it both with PXE (network) and with GRUB (local), with the same result.
Hmm. Isn't an init ramdisk normally in the form of a compressed cpio archive? From the description of genext2fs, it is creating an image with an ext2 filesystem embedded. I don't know what distinction this creates for the kernel, but it sounds like the two entities must be different. In fact I'm a bit surprised it works at all when I think about it. I suppose if the kernel has a built-in ext2 driver module, it would understand the image, but perhaps something else doesn't know how to deal with ext2 in RAM. Sorry to be so imprecise; I'm kind of thinking out loud here.
--- rod.
Good catch. Yes, I think the kernel looks in the ramdisk and sees it's an ext2 filesystem and does have a driver for that. But it's definitely worth a try to create the ramdisk as a cpio archive instead. Maybe there are bugs in that driver, or in some executables that are (perhaps) trying to access the filesystem in a non-standard way.
Do you know of a good tool to generate a cpio archive? I'm about to google it, of course.
The kernel can handle both an initial ramdisk and initial ramfs. It can get either a disk image filesystem from the bootloader (creates a ramdisk) or a compressed cpio image (creates a ramfs). The latter is the preferred method and is often used even with the wrong terminology attached (e.g. they say "ramdisk" and you get "ramfs"). Additionally, the kernel can be built with the compressed cpio image built-in. If both a kernel initramfs image and a boot loader provided initramfs image are there, they will be merged (been there, done that, a couple years ago) before the first user-space program is run ("/init" for an initramfs; "/linuxrc" belongs to the older initrd scheme). This is handy for tweaking files otherwise built into the kernel initramfs image without having to rebuild a kernel every time, when doing testing.
The initramfs approach, with a compressed cpio image, is the preferred method these days. It has fewer issues (size does not need to be coordinated) and more flexibility. It's the way to go for any new projects.
The kernel build uses a special cpio tool that uses a control file to specify what files go into the image, where to get them from, and what their metadata is (how a non-root user can create a root-owned file inside that image). It also has ways to create device nodes directly into the image. It is used for making the cpio image that gets built into the kernel (which, BTW, is always there, even if empty, and the kernel always goes through the motions to extract it to the initial ramfs). Grab the latest kernel source tarball and extract it. Files to read include:
If you want to dig into the kernel logic in how it handles this and starts the user space, change into the "init" subdirectory and read those source files.
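To tell which of the two kinds of image a given file is, the leading magic bytes are usually enough. The sketch below is an illustration only: the function names hex_at and image_kind are made up, and it checks just three signatures (gzip at offset 0, cpio "newc" at offset 0, and the ext2 superblock magic 0xEF53 stored little-endian at byte offset 1080). A gzip result says nothing about what is inside the compressed stream.

```shell
#!/bin/sh
# Hypothetical helper: print COUNT bytes of FILE starting at OFFSET
# as lowercase hex with no separators.
hex_at() {   # usage: hex_at FILE OFFSET COUNT
    dd if="$1" bs=1 skip="$2" count="$3" 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

# Hypothetical helper: classify an image file by its magic bytes.
image_kind() {
    if [ "$(hex_at "$1" 0 2)" = "1f8b" ]; then
        echo "gzip"            # compressed; the contents are not examined
    elif [ "$(hex_at "$1" 0 6)" = "303730373031" ]; then
        echo "cpio-newc"       # ASCII "070701"
    elif [ "$(hex_at "$1" 1080 2)" = "53ef" ]; then
        echo "ext2"            # superblock magic 0xEF53, little-endian
    else
        echo "unknown"
    fi
}
```

Calling `image_kind some-image-file` prints one of the four labels. An image made with genext2fs would be expected to report ext2, and a gzip-compressed cpio archive would report gzip until it is decompressed.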
I think you can easily roll your own. I do it by creating an on-disk directory structure containing all of the components I want, and then:
Code:
#! /bin/sh
#
#
# Example usage:
# trmkinitrd initrd-2.6.10
#
# A complementary script 'truninitrd' performs the
# reverse operation of this script.
#
# $Author:$
# $Date:$
#
# $Revision:$
#
# ======================================================
if [ $# -eq 0 ] || [ ! -d "$1" ]; then
    echo "Initrd builder"
    echo ""
    echo "Run this script by specifying the name of a "
    echo "directory containing initrd image data as the first argument."
    echo "Usage:"
    echo "trmkinitrd initrd.2.x.y"
    exit 1
fi
# Strip any trailing slash so the output file name is well formed
img=${1%/}
echo "Creating cpio style image ${img}.img"
# Archive from inside the directory so that member paths are relative
# to the image root (./init rather than initrd.2.x.y/init)
( cd "$1" && find . | cpio --create --format='newc' ) > "${img}.img" || exit 1
echo "gzip compressing ${img}.img"
gzip "${img}.img"
mv "${img}.img.gz" "${img}.img"
# echo "copying to tftp server root"
# sudo cp ${img}.img /tftpboot
echo "Complete"
The 'truninitrd' script:
Code:
#! /bin/sh
#
#
# Example usage:
# truninitrd /boot/initrd-2.6.10.img
#
# A complementary script 'trmkinitrd' performs the
# reverse operation of this script.
#
# $Author:$
# $Date:$
#
# $Revision:$
#
# ======================================================
if [ $# -eq 0 ]; then
echo "Initrd unroller"
echo ""
echo "Run this script by specifying the name of a "
echo "compressed initrd image as the first argument."
echo "The source image will be unrolled into a directory"
echo "structure in the present working directory. "
echo "The source image will be left unmodified."
echo "May require root privileges to create dev/ directories."
echo "Usage:"
echo "truninitrd /boot/initrd.2.x.y-zzzz.img"
exit 1
fi
img=`basename "$1" .img`
echo "Preparing directory structure ${img}"
if [ ! -d "$img" ]; then
    mkdir "$img"
    echo "New base directory $img created"
else
    rm -rf "$img"/*
    echo "Removed old directory contents"
fi
echo "Uncompressing image and de-archiving cpio archive image"
# Stream the decompressed archive straight into cpio so that no copy
# of the archive is left behind inside the unpacked tree;
# -d creates leading directories where needed
gunzip -c "$1" | ( cd "$img" && cpio -i -d ) || exit 1
echo "Complete"
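One thing worth checking on any image built this way is where "init" ends up inside the archive, since the kernel looks for /init at the top of the unpacked initramfs and otherwise falls back to mounting the device named by root=. The helper below is a hypothetical sketch (the name has_top_level_init is made up): it reads a member listing on stdin, such as the output of `cpio -t`, and succeeds only if init sits at the top level.

```shell
#!/bin/sh
# Hypothetical helper: read archive member names on stdin, one per line,
# and succeed only if "init" is at the top level of the archive.
has_top_level_init() {
    grep -q -x -F -e 'init' -e './init'
}

# Listing of an archive made from inside the tree: init is at the top.
printf '%s\n' . ./init ./bin ./bin/sh | has_top_level_init \
    && echo "init found at top level"

# Listing of an archive whose members carry a directory prefix.
printf '%s\n' initrd-2.6.10 initrd-2.6.10/init | has_top_level_init \
    || echo "no top-level init"
```

Against a real image the listing would come from something like `gunzip -c initrd-2.6.10.img | cpio -t | has_top_level_init`, using whatever image name applies.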
Thank you both for the info!
I'm not sure whether the kernel I'm using (2.6.18-128.el5PAE) can mount a cpio "ramdisk" as a file system (ramfs?). Maybe I'd have to rebuild the kernel with ramfs support?
I tried making the initrd as you suggested, Rod, but when I boot it I get:
...
md: ... autorun DONE.
No filesystem could mount root, tried ext2 iso9660
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(1,0)
I'm not making a temporary initrd, I'm booting, mounting this as my root filesystem, and running with it indefinitely, never switching to a hard disk filesystem.