new bedrock, boot-time errors
I decided to attempt a second bedrock install. I have been running a nice working bedrock for months: this new one is an experiment.
Anyway, the main differences between it and the old one are (a) it is not a hijack install and (b) it uses CRUX instead of Slackware as the first stratum. I decided to make global its own stratum, so global is not a symlink to some other stratum. The stratum for the rootfs is named "nyla". The partition for this second bedrock is on an LVM volume. Betterinitramfs is used to deal with the LUKS and LVM bits at boot time. The initrd part of the boot process works, indicating that the kernel and the initrd parameters passed to the kernel are correct. After the screen is cleared and the bedrock logo is displayed, I see the following lines (note: typed by hand from a phone picture, so there may be inaccuracies): Code:
ln: /etc/mtab: File exists Code:
[OKAY] pivot_root'ing to crux (init)
I'm lost at this point. I've gone through the directories and the various config files while booted in the working bedrock and can't find what is wrong. Can anyone give me a clue? Thank you.
p.s. Installing a second bedrock system from a first bedrock system can be a bit confusing: there is already a /bedrock directory, in-use. ;) |
CRUX's init does a fsck check at boot, which is apparently not happy. I don't know if it's unhappy because of something specifically Bedrock related, or if it just doesn't like how you formatted things when you set this up, or if your hard drive is failing and the Bedrock Linux specifics are a red herring.
The usual, proper fix for these errors is to boot off of some other partition/device and use fsck to fix the filesystem. There are quite a few guides out there on this. I'd recommend trying that.
Another, horrible option would be to comment out the fsck check. If you boot off something else and then mount this system, you should find a file at `<mount-point>/bedrock/strata/crux/etc/rc` which contains a line something like `/sbin/fsck $FORCEFSCK -A -T -C -a`. You could comment that line out, then under it put "true". I don't necessarily recommend this; better to make fsck happy than to bypass it.
You should also probably try the fallback init. This is what it's there for. If that works fine, we'll learn that this might be CRUX specific, but if that also fails, it could be unrelated to CRUX. Maybe install another distro as a stratum alongside CRUX and try that one, see if it works.
The `realpath` errors and `force-symlinks` warning are concerning. Could be the same underlying issue, or it could be another unrelated issue. I'm not sure how to investigate the `force-symlinks` issue - your system is just in a weird state, no idea how it got there - but the `realpath` errors may be investigable. If you boot off some other system and mount this system, is `<mount-point>/bedrock/strata/crux` a normal directory (versus, for example, a symlink)? Does `<mount-point>/bedrock/libexec/busybox realpath <mount-point>/bedrock/strata/crux` run as expected, or does it produce a no-such-file error? |
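The two checks asked about above (is the stratum path a real directory, and does busybox's `realpath` resolve it) could be run roughly like this from the other install. This is only a sketch: the mount point `/mnt/new_bedrock` is an assumption, and `classify_path` is a hypothetical helper, not something Bedrock Linux ships.

```shell
#!/bin/sh
# Hypothetical helper (not part of Bedrock Linux): report whether a path
# is a symlink, a real directory, something else, or missing.
# The symlink test comes first because -d follows symlinks.
classify_path() {
    if [ -L "$1" ]; then
        echo "symlink"
    elif [ -d "$1" ]; then
        echo "directory"
    elif [ -e "$1" ]; then
        echo "other"
    else
        echo "missing"
    fi
}

# Assumption: the broken install is mounted here; override MNT if not.
MNT="${MNT:-/mnt/new_bedrock}"

if [ -d "$MNT/bedrock/strata" ]; then
    # Expect "directory" for a normal stratum.
    classify_path "$MNT/bedrock/strata/crux"
    # The same realpath check as described above, using Bedrock's busybox.
    "$MNT/bedrock/libexec/busybox" realpath "$MNT/bedrock/strata/crux"
fi
```

If the first command prints anything other than "directory", or the second prints a no-such-file error, that is worth posting.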
Hi ParadigmComplex. I followed the non-hijack instructions, and I get the same realpath errors at boot. The difference here is that everything seems to work perfectly fine: I can select the init stratum and go happily on, so honestly I didn't bother to further inspect the issue and then forgot to report it.
|
I've not been able to thoroughly check this since, but I have done the following...
I checked the partition from another Linux on the same machine (my other bedrock, the one that works fine) and no problems were found.
Using the fallback init & strata was a good idea: that enabled the OS to finish booting with fewer errors, and I was able to get to a login without a pause. I was only able to log in as brroot because of errors about being unable to execute /bin/brsh. I know I did that setcap when I installed. This should not happen!
Once logged in, lsblk shows the partition mounted at /bedrock/strata/nyla. That should be /. Is it normal for the root partition to be mounted or remounted to odd places during bedrock boot?
edit: mounted the problem system on the /mnt/new_bedrock mount point from the working system, got this: Code:
~$ ls -l /mnt/new_bedrock/bedrock/strata/ |
EmaRsk:
Thanks for letting me know. I'll rule that out then as irrelevant to jr_bob_dobbs's issue, and investigate it separately. If things work with those errors, it's probably just a cosmetic thing and is likely harmless; if they were meaningful, it wouldn't progress without obvious follow-up issues. Still, I should figure that out and resolve it.
jr_bob_dobbs: I have no idea why CRUX's fsck would be upset if another Linux install's fsck is happy with it. I find that a bit concerning. Ignoring CRUX's fsck's warning about the filesystem being broken seems inadvisable, but I don't see a direct path to follow here to tackle it. Since it's apparent you've got *some* other issue based on the fact you had to log in as brroot, perhaps we should tackle those first and then revisit the fsck issue - maybe we'll get lucky and following the other paths will hit on the underlying issue here.
Regarding "unable to execute /bin/brsh" - did you mean "/bedrock/bin/brsh"? If not, then I suspect you've made a transcription mistake when following the instructions. Logging in with fallback/brroot or booting off the other Bedrock Linux install and mounting, then editing the broken install's /etc/passwd and changing the ":/bin/brsh" line(s) to ":/bedrock/bin/brsh" - that is, adding the "/bedrock" - should fix it.
If that was just a transcription mistake from the error message, and the error was about "/bedrock/bin/brsh" - then that's an issue we'll have to dig in deeper to figure out. Maybe log in as brroot then run /bedrock/bin/brsh - see if that works. If it gives you errors and fails to drop you into a new shell, you could run Code:
/bedrock/bin/brsh >/bedrock/brsh-errors 2>&1
Since you were able to log in with fallback/brroot, see if you can run Code:
/bedrock/bin/brr -f /bedrock/brr-log
It's normal for all the mount stuff to get rearranged at boot. It's key to how Bedrock Linux lets you pick whichever init you want (amongst other features). I expect that lsblk output is fine. You've had a working Bedrock Linux install before, if I understood your original post. Maybe this will make some intuitive sense from prior experience with how Bedrock Linux works: to get to the root of a given stratum's tree, you'd go to /bedrock/strata/stratum-name. So to get to the root of your partition's filesystem - ignoring Bedrock Linux's mechanisms to let you see different things in different contexts - you'd go to /bedrock/strata/rootfs-stratum, where rootfs-stratum is dependent on your specific setup. For your apparently broken install discussed here, that rootfs stratum is nyla (which is the convention for a 1.0beta2 Nyla manual install). In fact, "rootfs" is aliased to nyla such that /bedrock/strata/rootfs will take you to your original root (once things work, anyways). Having lsblk report /bedrock/strata/nyla as the root of your filesystem makes sense then, hopefully.
Your /bedrock/strata directory output looks good. Per EmaRsk's comment, seems it was a red herring. Looking into /bedrock/bin/brsh vs /bin/brsh and getting the /bedrock/bin/brr output seem like good next steps. |
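For the /etc/passwd change described above (turning ":/bin/brsh" into ":/bedrock/bin/brsh"), a cautious way to do it without hand-editing might look like the following. This is a sketch, not an official Bedrock Linux tool: `fix_brsh_shell` is a made-up name, and which passwd file to pass is whichever one you would otherwise edit by hand.

```shell
#!/bin/sh
# Hypothetical helper: rewrite login shells that are exactly /bin/brsh to
# /bedrock/bin/brsh in a passwd-format file. The shell is the last
# colon-separated field, so the match is anchored at the end of the line;
# entries already using /bedrock/bin/brsh are left untouched.
fix_brsh_shell() {
    passwd_file="$1"
    # Keep a backup, then write back into the original file so that its
    # ownership and permissions stay as they were.
    cp "$passwd_file" "$passwd_file.bak" || return 1
    sed 's|:/bin/brsh$|:/bedrock/bin/brsh|' "$passwd_file.bak" > "$passwd_file"
}

# Example (as root, from the fallback/brroot login):
#   fix_brsh_shell /etc/passwd
```

The backup at `<file>.bak` can be copied back if anything looks wrong afterwards.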
D'oh! I had put /bin/brsh in the .passwd and it was supposed to be /bedrock/bin/brsh. Now regular root and bob can log in, but still only from the fallback strata.
I ran Code:
/bedrock/bin/brr -f /bedrock/brr-log
Two other oddities, though they may be clues...
One. As bob I get this error on login: Code:
/bedrock/bin/brsh: line 63: can't create /dev/null: Permission denied Code:
whoami ; pwd ; ls -la .brsh.conf Code:
bob
Two: only the slash partition and the swap are mounted. I'd previously set the fstab to mount the partitions of Windows and the other distros, yet they are not there. I can manually mount them after logging in as root, so I'm not sure what the cause of the problem could be. I am using LVM over LUKS, but that worked with CRUX (non-bedrock) and with my other (working) bedrock. Maybe the only reason that the slash partition is mounted at all is because of the initrd? |
brr grabs a lot, in the hopes that it'll get whatever is needed for a wide variety of issues. If you can't make it an attachment of sorts here, can you upload it elsewhere and link? There's lots of websites that host snippets of text; take your pick.
You can't log in as bob because the permissions on /dev/null are wrong. The fallback init is fairly minimal and may not be smart enough to do things like fix /dev/null's permissions. Try Code:
chmod 666 /dev/null
The fallback init should try to mount the fstab contents (you should see something to the effect of "[OKAY] Preparing /etc/fstab filesystems" when booting with it). However, if some other setting is broken it's not unlikely that the /etc/fstab file isn't being distributed properly. Hopefully I'll be able to investigate this possibility after looking at your brr output. |
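Before and after the chmod above, it may be worth confirming what /dev/null actually looks like. A minimal sketch, assuming a `stat` that understands `-c` (GNU coreutils and busybox both do); `devnull_ok` is a hypothetical name:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given path is a character
# device whose mode is 666 (readable and writable by everyone).
devnull_ok() {
    [ -c "$1" ] || return 1
    # Assumption: stat supports -c '%a' (octal permission bits).
    [ "$(stat -c '%a' "$1")" = "666" ]
}

if devnull_ok /dev/null; then
    echo "/dev/null looks fine"
else
    echo "/dev/null is wrong; as root, try: chmod 666 /dev/null"
fi
```

If /dev/null turns out not to be a character device at all (for example, a regular file), chmod alone will not fix it, and that would be worth reporting.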
Problem. The file with the brr output is corrupt. Over 1000 control characters and over 1000 high-bit characters. I think I'm going to call this a disk error, and have a thorough badblocks check run overnight.
EDIT: False alarm. brr appends a copy of the time zone file, which is binary. D'oh! |
brr dumps your /bedrock/etc/localtime file, which often has the high bit set in many bytes, as it's not an ASCII file. That's not an indication that the log is corrupt or that you have disk errors (although the fsck errors earlier are most definitely an indication that you may have disk problems). If whatever paste service you're using doesn't like control characters, search the log for "--- /bedrock/etc/localtime ---" and remove everything from there up to the next "--- [text] ---" section (or to the end of the file if there's no following section). I'm doubtful your timezone configuration is relevant to the issue here; that's probably safe to remove without hampering my ability to investigate what's going on.
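The removal described above could be scripted along these lines. This is a sketch that assumes every section header in the log is a whole line of the form "--- [text] ---", as described; `strip_localtime` is a hypothetical helper name, not part of brr.

```shell
#!/bin/sh
# Hypothetical helper: print a brr log with the localtime section removed.
# Assumption: section headers are whole lines of the form "--- <text> ---".
strip_localtime() {
    # LC_ALL=C so awk treats the binary timezone data as plain bytes.
    LC_ALL=C awk '
        /^--- .* ---$/ {
            # At every header, decide whether the section it starts
            # should be skipped: only the localtime one is.
            skipping = ($0 == "--- /bedrock/etc/localtime ---")
        }
        !skipping
    ' "$1"
}

# Example:
#   strip_localtime /bedrock/brr-log > /tmp/brr-log.txt
```

Everything outside the localtime section, including the header of the section after it, passes through unchanged.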
I should probably remove /bedrock/etc/localtime from brr, or just grab the last line (which is normally ASCII, and is the only bit I usually care about when looking at that file for debug purposes). I don't think you're the first one to be tripped up by that. |
OK, got the file uploaded...
http://filebin.ca/2zWotz75pHAL/brr-log.txt.xz Since it counts as binary anyway, I xz'd it. |
I don't see anything unusual in there. Outside of a few red herrings we came across on the way here, your Bedrock Linux setup looks correct to me. Provided you did use fsck to check the filesystem from another distro correctly, and it was happy there, I have no idea why CRUX's fsck isn't happy. I'm inclined to think it isn't Bedrock Linux related/specific. Might be an issue with either CRUX's /sbin/fsck or the /sbin/fsck you used to check the filesystem outside of CRUX.
If you want to hack around it, you can edit /bedrock/strata/crux/etc/rc and comment out the "Check filesystems" block (looks like lines 29-49 to me, but I'm not sure my on-hand version of CRUX is up-to-date). That'll have CRUX boot past that check. However, if there is merit to the error CRUX is throwing there, it won't actually be fixed and might bite you later. Otherwise, you could boot with another distro's init - one with a /sbin/fsck that is happy about your filesystem. Just grab it as a stratum and use it instead. |
Quote:
Quote:
Quote:
From the menu I selected default (crux's) init. Just before enabling the strata, this (warning, typed from a phone screen-shot, so errors are possible) error message was displayed: Code:
[ -- ] Enabling crux
force-symlinks warning for crux: A non-symlink file or directory exists at both "/var/lib/dbus/machine-id" and "/etc/machine-id". Should exist at "/etc/machine-id" with symlink at "/var/lib/dbus/machine-id" pointing to it, instead. Code:
INIT: no more processes left in this runlevel Quote:
|
Quote:
Quote:
Quote:
The very top of the strata setup instructions page recommends going through the tweaks page after installation. In retrospect that's poor design, no one's going to read that part. In the future I may rework things to lessen the possibility of this kind of thing arising. The brg feature of the upcoming release should help a lot in this respect as well, as it'll automate the specifics here away for many use cases. Quote:
|
I feel like a ninny whenever I miss something in the documentation. D'oh!
Anyway, I made those changes to the bedrock-crux. Its shutdown (or reboot) is now fine. Whew! I took another look at the fsck call in the rc script. Code:
/sbin/fsck $FORCEFSCK -A -T -C -a
I looked up those options:
-A # check everything in fstab. Hmmm, what about things that don't exist yet?
-T # don't show the title on startup
-C # display (text-mode) progress bar
-a # automatically repair without pausing to prompt
After some thought, I changed the fsck call to: Code:
/sbin/fsck $FORCEFSCK -A -C -a -V -s -M
-V # verbose output
-s # run only one fsck at a time
Taking out -T and adding -V was done for diagnostic reasons. -s probably does not make a difference, but I knew it would make the text easier for me to read at boot time. I then booted into the bedrock-crux system. That ran fine! I suspect that what was making fsck complain before was the fact that the root file system was already mounted read-write. So now the system can run fsck on boot, as is proper.
While the system appears to be working, there are two things that give me an uneasy feeling:
One: A run of "lsblk": Code:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
Two: If I type "mount" with no parameters in the bedrock-crux, I get the following output: Code:
devtmpfs on /bedrock/strata/global/dev type devtmpfs (rw,nosuid,relatime,size=8192k,nr_inodes=16384,mode=755)
Compare that to what I get in my regular bedrock: Code:
/dev/mapper/cryptvg-dawn on / type ext4 (rw) |
Apologies for taking a bit to get back to you; I've been busy, and I didn't want to rush another response about "lsblk" only to leave you again concerned about it next time you take a look.
Quote:
Quote:
Quote:
Nonetheless, I should investigate the possibility of having it be read-only at that point in boot, either always (if we can expect all init systems to be happy with that) or conditionally per init option. Bedrock Linux needs the filesystem to be read-write just before handing things off to the specified init, but there's no reason it couldn't remount things back to read-only.
The mount and lsblk stuff look fine. I'll see if I can repeat/rephrase/expand on things from the doc to alleviate your concerns here:
Two pieces of software "conflict" if they both require different values for a given resource. Examples:
Bedrock Linux manages to get around this problem for some kinds of resources (namely files at a given file path) by allowing you to have multiple copies (that is, you can have multiple different files at the exact same file path) then ensuring the right copy shows up when a given piece of software tries to use it. If processes don't have mutually exclusive requirements for a given resource, they'll usually see the same thing (which is needed for everything to work together), but if they do have different requirements they'll each see what they need.
Sadly Bedrock Linux cannot do this for all resources; for many, you can only have one at a time even on a Bedrock Linux system. For example, you can only have one version of a kernel module loaded at a time - if you need two conflicting versions at the same time, Bedrock Linux won't solve that for you any better than any other distro. The Bedrock Linux lingo for these kinds of resources would be to describe them as "singletons". The main problem with singletons is that not only can you only have one instance at once, but it also restricts anything dependent on that singleton. While Bedrock Linux can't have multiple instances of a given singleton in use at a time (by definition of a singleton), it does try to let you pick which instance of a singleton is in use at a given time, and let you switch which you're using. So you can unload a given version of a kernel module and load another version.
What you're seeing with lsblk and mount is a side effect of the "different processes may see different things for certain resources where it's usually not necessary for them to see the same thing to work together" and "you can only have one instance of certain resources". Bedrock Linux can't have every process see the exact same mount table, as they may have mutually exclusive requirements there.
For example, if you had one of your strata not on a real/local disk but mounted via a network mount, and another stratum on a real disk - what would they each see as the root mount point? A local disk or a network mount? Simplest solution there is to have them see different things. Luckily, for the vast majority of things, seeing the same root filesystem mount point (or other mount points) is not required for interactivity - just the same files at a given file path.
The init system - PID1 and company - is responsible for unmounting the mount points on a typical system. It therefore needs to actually see all the mount points, as it can't unmount any that it can't see. Both PID1 and the entire mount tree are singletons; only one stratum can provide/have each at a given time. Since PID1 has to see the entire mount tree, Bedrock Linux ensures that's the stratum that sees the whole mount tree. Thus mount and lsblk output can be very different for different strata, and are dependent on things like which stratum is providing init or which was your root filesystem before Bedrock Linux did its rearranging. For me, for example: Code:
$ bri -a init # which stratum provides init/PID1?
lsblk is essentially picking one of the many places your root filesystem device happens to be mounted. In some sense there are multiple right answers. There's no reason to be concerned that it didn't pick the one you naturally lean towards, or that it'd differ with different strata or installs. That make sense? Maybe I should make a FAQ entry about weird mount tables on Bedrock Linux. |
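To see the "many places your root filesystem device happens to be mounted" directly, rather than the single one lsblk picks, one option is to read the mount table of the current process. The following is a sketch under assumptions: `mounts_of_device` is a hypothetical helper, and the device name in the example is the one from the mount output quoted earlier in the thread, so substitute your own.

```shell
#!/bin/sh
# Hypothetical helper: for a device name, print every mount point it has
# in the calling process's view of the mount table, plus whether that
# mount is read-only (ro) or read-write (rw). Input is a mounts-format
# file: device, mount point, fs type, options, dump, pass.
mounts_of_device() {
    awk -v dev="$1" '
        $1 == dev {
            n = split($4, opts, ",")
            mode = "rw"
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro")
                    mode = "ro"
            print $2, mode
        }
    ' "${2:-/proc/self/mounts}"
}

# Example (device name taken from the thread; adjust for your system):
#   mounts_of_device /dev/mapper/cryptvg-dawn
```

Running the same command from shells in different strata should be expected to give different lists, for the reasons described above. The rw/ro column may also help when checking whether the root filesystem is already read-write at the point fsck runs.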