Pitfalls of Slackware
I make this post with great regret. Slackware has always been my number one favorite distribution, hands down. All of the *nix servers in my home environment run Slack. I even run Slack on some systems at work (the ones where I have the liberty to choose my OS). I know a majority of the slackers here feel the same.
So I'm sure you guys will understand my astonishment and frustration when my beloved distro failed to handle a situation I ran into.
There is a server at my work (an Intel Xeon Supermicro) that has 64-bit CentOS 5.5 installed. There was some corruption in the rc.sysinit file that caused the system to boot in read-only mode. Although I knew it would probably be a good idea to boot into the CentOS recovery environment, I decided to pull out my Slack disk instead. Besides, all I had to do was activate the underlying VG, mount it, and edit the file. How hard could this be?
Apparently, much more difficult than it looked. First of all, Intel's Matrix Storage Manager ROM (IMSM) was loaded on the system. There was a RAID1 in the IMSM container, and an LVM2 VG on top of that. So, when I booted the system from Slack 13.37 with the huge kernel, I thought this would be easy as pie.
shows the appropriate raid1 between /dev/sda1 and /dev/sda2
An assemble scan wasn't necessary, although just for kicks I stopped the raid1 and restarted it:
mdadm --stop /dev/md126
mdadm --assemble --scan
which restarted the array with the appropriate imsm container. Mdadm could even detect the metadata correctly, when I ran:
I could clearly see I was on Intel Matrix Storage Manager ver 220.127.116.112.
So, after running:
vgchange VolGroup00 -a y
and activating VolGroup00, all I had left to do was mount the appropriate LV:
mount /dev/mapper/VolGroup00-LogVol00 /mnt/0
which, unfortunately, would fail miserably with a segmentation fault at md.c:6/
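For anyone retracing this, the sequence up to this point can be sketched as a small script. This is only a sketch under assumptions: the volume group, logical volume and mount point names are the ones from this post, while `DRY_RUN`, `run` and `rescue_steps` are hypothetical helpers added so the commands can be reviewed before they touch a real array. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch of the rescue sequence from this post.
# DRY_RUN, run() and rescue_steps() are illustrative helpers, not part of mdadm or LVM.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them (needs root)

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rescue_steps() {
    run mdadm --assemble --scan                        # assemble the IMSM container and its RAID1
    run vgchange -a y VolGroup00                       # activate the volume group on the array
    run mkdir -p /mnt/0                                # make sure the mount point exists
    run mount /dev/mapper/VolGroup00-LogVol00 /mnt/0   # the step that failed in this post
}

rescue_steps
```

Running it with `DRY_RUN=0` as root would execute the commands for real. The dry run is the default on purpose, since this is a production box in the story.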
You may ask why I didn't just attempt to mount the LV after stopping the raid array. Well, I thought the same thing. This is a raid1 after all, right? The data on /dev/sda1 is the same as on /dev/sdb1. Well, after stopping /dev/md126, and running:
I'm sure you guys would understand my surprise when both commands responded with no pvs / no vgs on the system. After poking around in the Intel Matrix Storage Manager ROM, I discovered that encryption was enabled on the R1 array. Stopping the RAID may break the filesystem, or the LV may only be visible when the array is intact.
I was able to fix my conundrum, although I had to swallow my Slack pride, load centos5.8_64, boot linux rescue, and allow rescue to auto-mount the LV.
My only assumption is that there is an incompatibility in mdadm between Slack 13.37 and CentOS 5.5.
Slack 13.37 ships with mdadm 3.1.5, which is pretty damn new (2011 release), considering that mdadm's currently on version 3.2. This version also supports metadata format IMSM.
It looks like CentOS 5.5 doesn't even attempt to assemble the IMSM RAID1 volume. CentOS 5.5 ships with mdadm v2.6.9, which I don't believe had support for the IMSM metadata format. Furthermore, after booting into the fixed CentOS 5.5 install on the Intel Supermicro, I discovered CentOS 5.5 doesn't even touch the IMSM raid1 volume! There is no raid volume according to /proc/mdstat!
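A quick way to compare what each rescue environment actually assembled is to pull the array names and states out of /proc/mdstat. A minimal sketch, assuming the usual `mdNNN : active ...` line layout of that file; `list_md_arrays` is a hypothetical helper name, not an mdadm command:

```shell
#!/bin/sh
# list_md_arrays [FILE]: print "name state" for every md device in a mdstat-format file.
# Defaults to /proc/mdstat; prints nothing if the file is missing or lists no arrays.
list_md_arrays() {
    mdstat="${1:-/proc/mdstat}"
    [ -r "$mdstat" ] || return 0          # md driver not loaded: nothing to list
    awk '$1 ~ /^md/ && $2 == ":" { print $1, $3 }' "$mdstat"
}

list_md_arrays
```

On a system that behaves like the CentOS 5.5 install described above, this should print nothing at all, because no md array was ever assembled.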
So, I can't fault Slack here. I know Patrick did the smart thing bundling mdadm 3.1.5 with 13.37. Also, there could be some kernel issues that I'm overlooking. I know Red Hat/CentOS follows the model of going with a long-term stable kernel (2.6.18.x) and backporting all the updates to the old kernels. I'd have to scour through the CentOS/Red Hat logs, but maybe they got an update from Intel which allows them to read IMSM encrypted filesystems. Or, it may just be that mdadm v2.6.9 can't see IMSM at all, which is what allows CentOS to see the LVM.
Anyways, this whole rant is from one sysadmin to another, as a forewarning: always make sure you have plenty of ISOs at your disposal. And don't be afraid to use another distro, even when it isn't your favorite. In the corporate world, I've yet to come across a client that purposely requests Slack in their production/dev environment. Although, you better believe the sysadmins who know their guns will be running it in the background, and you won't even know it :)
Just keep away from vendor fake RAIDs (like the Intel one) and have plenty of backups, which you can restore in case of file corruption.
Well, don't blame slack if you missed the fact the drives were encrypted. :)
Just a heads up, 13.37 is missing a couple of mdadm tools on the default install disks that it needs when working with Intel Matrix RAID. I found that out the hard way. It's an easy fix, and I believe it has been corrected in -current, but it slipped through before the original Slack distribution DVDs were made (if you have the commercial ones).
Filesystem data formats/structures can and do change between versions of tools and kernels.
As Mr. Mackey would say:
Unfortunately, it's going to be hard for me to retest using the same environment. My priority was restoring the system, and since that has been taken care of, the Supermicro is back in production. For the sake of determining the root issue, though, I would need to find a platform with native support for the IMSM ROM, which, off the top of my head, may include ICH7 platforms. I know for a fact that the Intel C600 platforms have support, but those are their newest platforms, and I don't know when I'll have the capital to go and purchase one.
After doing some more research, I discovered you need at least mdadm-3.2 to fully support IMSM. After poking around in the mdadm git logs, I see there's a bunch of fixes specifically for imsm that went into 3.2. (http://neil.brown.name/git?p=mdadm;a...tags/mdadm-3.2)
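Since the whole thing seems to hinge on the mdadm version, a rescue disk can be checked before it goes anywhere near an IMSM array. A minimal sketch: `version_ge` and `mdadm_version_from` are hypothetical helper names, the 3.2 threshold is the one from this post, the comparison relies on GNU `sort -V`, and the parser assumes the banner looks like `mdadm - v3.1.5 - ...`, which may not hold for every build.

```shell
#!/bin/sh
# version_ge A B: succeed if dotted version A is at least B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# mdadm_version_from BANNER: extract "3.1.5" from a banner such as "mdadm - v3.1.5 - ...".
# The banner layout is an assumption; adjust the pattern if your build prints something else.
mdadm_version_from() {
    printf '%s\n' "$1" | sed -n 's/.*v\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Only run the check when mdadm is actually installed.
if command -v mdadm >/dev/null 2>&1; then
    ver=$(mdadm_version_from "$(mdadm --version 2>&1)")
    if version_ge "$ver" 3.2; then
        echo "mdadm $ver: at or above the 3.2 threshold mentioned here"
    else
        echo "mdadm $ver: older than 3.2, be careful with IMSM arrays"
    fi
fi
```

With the versions from this thread, 3.1.5 (Slack 13.37) and 2.6.9 (CentOS 5.5) both fall below the threshold, while 3.2 and anything later pass.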
Still, Gazl, I'm inclined to disagree. These are all GNU tools we're using. I had an ext3 filesystem on top of an LVM2 volume that I needed to mount momentarily to change a file. The only thing that stood in the way was this IMSM container.
Recall, once I did boot the CentOS 5.5 Supermicro system, I discovered CentOS doesn't even touch the IMSM raid volume. It simply allows IMSM to do the RAID in the background, and boots right up. I don't think the IMSM "encrypted volumes" are a big issue either, because CentOS can see the volume just fine.
When I do have the opportunity to test a platform with an IMSM ROM, I'll test an IMSM raid1 LVM two ways:
- slack with mdadm v2.6.9 (same as centos5.5) (which I'll have to look at the changelogs for, I may have to go back to slack9 or 10, or do a LFS)
- slack with mdadm v3.2 (recommended for IMSM)
So NyteOwl, you're right. I can't blame Slack here. I just need the right tools for the job.
I use mdadm with RAID 0 and Intel Matrix Storage Manager. Until version 13.37 of Slackware, the IMSM metadata was not supported. Prior to that I was using dmraid and I could only get the 32-bit version of that to work with Slackware.
Keep in mind mdadm is a configuration front end for other parts of the OS, chiefly the kernel's md driver. That means in addition to having an mdadm version that understands the metadata, the other OS drivers and services must support the features used by the RAID array. Make sure the kernel has mirroring, striping, encryption or whatever else is required. Also, udev is necessary to create devices. If you want to boot from RAID you have to enable udev support in the initrd.
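One part of that check can be sketched in a few lines: the `Personalities` line of /proc/mdstat shows which RAID levels the running kernel has registered. This is only a rough sketch; `has_personality` is a hypothetical helper name, and a level reported as missing may still be available as a module that simply hasn't been loaded yet.

```shell
#!/bin/sh
# has_personality LEVEL [FILE]: succeed if LEVEL (e.g. raid1) appears on the
# Personalities line of a mdstat-format file (default: /proc/mdstat).
has_personality() {
    mdstat="${2:-/proc/mdstat}"
    [ -r "$mdstat" ] || return 1          # md driver not loaded at all
    grep '^Personalities' "$mdstat" | grep -q "\[$1\]"
}

for level in raid0 raid1; do
    if has_personality "$level"; then
        echo "$level: registered with the running kernel"
    else
        echo "$level: not registered (may still exist as an unloaded module)"
    fi
done
```

This says nothing about device-mapper targets or encryption support, so treat it as a first sanity check rather than a full inventory.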
I hope that some other metadata formats are added to mdadm, or at least there is some provision made for user-defined metadata formats. Although my computer has Intel RAID, I am also using computers with Promise RAID. So far as I can tell, dmraid is not being updated, except for some bug fixes in various distros.
Slackware has done a very good job of improving the RAID support, but unfortunately there hasn't been much software available to support fake hardware RAID in Linux. I was quite surprised when mdadm added support for Intel RAID. I wouldn't want to make any bets on whether newer Intel RAID formats will be supported in the future. It's probably safer to stick with software RAID in Linux and not use the fake hardware RAID. Certainly Intel RAID is a better bet than others, but remember it was no better supported in Linux than any others until very recently. It has pretty much been dmraid or hope for a Linux driver from the RAID vendor.
I don't see how this is Slackware's fault, it all depends on mdadm. You didn't realize at the time what the different version numbers would translate to in terms of how they behave. This is NOT a pitfall of Slackware.