LinuxQuestions.org


svar 06-19-2009 12:59 AM

Crash destroys bootability from any device
 
I was, perhaps unwisely, playing an AVI file from a USB device. It showed some distortion and then froze the system completely. The only thing I could do was unplug the power. (I then realized that the USB stick had gotten unusually hot.) Upon rebooting, sure enough, the filesystem (reiser 3.6.19) was corrupted. I thought I was safe anyway, not just because I could run reiserfsck, but also because I had 3 Linux systems on my machine (including another, older Slack and a SUSE), so I could choose any of them at the LILO prompt, then mount the other partitions and try to repair them. Well, that did not work.
It does not even recognize the main Slack partition, /dev/sda2 (the usual one, which I was working from).
reiserfsck did not fix the main partition, and I was unable to boot anything else either. I also changed the boot order and tried to boot from a Slax CD and a Knoppix CD. Both reported a kernel panic and got stuck.
Incidentally, I was able to copy some files to another USB disk.


Some of the messages:

reiserfsck --check
Replaying journal
No transactions found
Zero bit found inode in bitmap the last valid bit
checking in tree
Bad root block 0 (--rebuild-tree did not complete)
....

The item has the bad pointer which is in tree -already zeroed

Unable to handle kernel paging request ....
printing eip
...


It looks like a hardware problem to me, and the fact that I cannot boot from the live CDs either suggests that it's not the HDs. Any idea what it could be or what I should check?

ChrisAbela 06-19-2009 01:34 AM

Perhaps you can unplug the faulty HD and boot via a Live CD first. Then you might have to see if you can recover the HD later.

svar 06-19-2009 03:55 AM

Why is unplugging the HD necessary?
I changed the boot order to CD first, so it should not even see any HD!
Besides, there are 3 HDs and only one of them was mounted when the crash occurred.

GazL 06-19-2009 04:39 AM

reiserfs has a reputation for being fast, but it also has a bit of a reputation that when things go wrong, they can go wrong spectacularly. I think it was something to do with the way it chains metadata together, meaning that in the event of an unclean dismount, rather than lose an open file or two, you can lose an entire swathe of the disk structure. I've always avoided it for this reason.

Skaperen 06-19-2009 04:56 AM

I've never had corruption from reiserfs on account of an unclean shutdown. And this is over several machines using reiserfs for most things (/etc /home /var) over several years. I have had several cases of corruption due to I/O errors on the drives. It seems that if writes fail, reiserfs just keeps on going as if the write succeeded. Then there is corruption because what should have been written wasn't, and what was there before is still there and will now be interpreted as the data that should have been written. All but one of these cases were with USB devices. USB is a terrible design that leads to many failure modes. If you're lucky, a USB error will leave the device unavailable and the data damage will be limited. But it tends to reset the device and re-establish it, often at the same device node, and reiserfs doesn't notice that the device went away and came back and that the I/O attempts in between were lost.

svar 06-19-2009 04:57 AM

I still don't see what this has to do with not being able to boot ANY disk, or even the live CDs!

mlangdn 06-19-2009 05:08 AM

I would suspect the RAM going south. Or, it's possible that your power supply could be involved as well. I would unplug everything but the minimum needed for a boot, then restart, adding one thing at a time until I hit the culprit.

GazL 06-19-2009 05:39 AM

Have you tried booting from the slackware install cd that matches the version of your main slackware partition and trying to run your reiserfsck from that manually?

That would show whether you can at least boot a system, and whether it's the fsck that's triggering the panic.

AGer 06-19-2009 06:15 AM

Look into BIOS for inspiration. If it provides none, unplug whatever possible, connect CD to a previously unused controller, if any, and try to boot a live CD. You know, if the USB stick got hot, then some high current passed through it and, consequently, through the MB. The prognosis is bad.

svar 06-19-2009 06:37 AM

Thanks all. Incidentally, I checked the hot USB stick on a different system and it works fine now (it's no longer hot, of course). Even the AVIs that were playing at the time play fine now. I will try reiserfsck from the Slack install disk first. I'm unsure why this should work when the live CDs did not, but if it does, maybe something can be saved.

Franklin 06-19-2009 06:41 AM

It does sound a little like it could be a hardware problem, but there seems to be file corruption either way. Have you checked the BIOS to verify that your hard drives and other integrated or add-on cards are being recognized and identified correctly? There have been times (not many) when I needed to remove all the hardware (RAM, HDDs, video, sound, network card, etc.) and reinstall it one piece at a time to figure out which might be the offending piece. Start with the RAM and move to video, then hard disk. Having spent years cobbling together free but questionable hardware to allow my kids to have PCs, I've seen my share of weird behavior.

Quote:

I also changed the boot order and tried to boot with a Slax and Knoppix CD. Both reported kernel panic and got stuck.

Not sure what you mean here. Were you able to boot Knoppix itself? Did you get a panic when you tried to direct Knoppix or the Slackware CD1 to boot a known root partition, or did the live CD fail to boot itself? It must be the former, as I can't figure out how a live CD would panic.

Assuming you can boot to Knoppix (and there are no obvious BIOS or hardware issues from above), are you able to see/mount any partition on any of your hard drives? You might also want to boot the Slackware installer and run cfdisk to see what the partition tables look like for your disks, and verify that all your partitions and disks are seen by cfdisk and correctly listed with the expected filesystem.
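
As a quick first look before opening cfdisk, the kernel's own list of disks and partitions can be read from /proc/partitions. The sketch below is only an illustration: the function name list_block_devs and its optional file argument are made up here, and it assumes the usual four-column layout of /proc/partitions (major, minor, #blocks, name).

```shell
#!/bin/sh
# list_block_devs (hypothetical helper): print a /dev path for every disk
# and partition listed in a /proc/partitions-style table.
# $1 is optional and defaults to /proc/partitions.
list_block_devs() {
    table=${1:-/proc/partitions}
    # Skip the header line; data rows have exactly four fields,
    # and the blank line after the header has none.
    awk 'NR > 1 && NF == 4 { print "/dev/" $4 }' "$table"
}
```

Calling list_block_devs with no argument from the installer prompt should list whatever the kernel detected. A disk that is missing here would not show up in cfdisk either, which would point back at the BIOS or the hardware rather than at the filesystem.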

svar 06-19-2009 06:49 AM

No, I cannot boot to either Slax or Knoppix. Boot starts and goes along for some time, but I never get a functioning Slax or Knoppix. This is what seems very weird. The kernel panics occur while booting the live CDs.

GazL 06-19-2009 07:19 AM

Quote:

Originally Posted by svar (Post 3579508)
No, I cannot boot to either Slax or Knoppix. Boot starts and goes along for some time, but I never get a functioning Slax or Knoppix. This is what seems very weird. The kernel panics occur while booting the live CDs.

It's possible that the knoppix or slax boot scripts are detecting your damaged reiserfs partition and are hanging because of it. Maybe because of unrepairable corruption, or possibly even a version incompatibility between their reiser implementation and slackware's. That's why I suggested booting the appropriate version of the Slackware CD and trying a manual fsck and mount.

P.S. Some models of USB stick are known to get a little on the hot side during sustained access.

svar 06-19-2009 01:17 PM

I was able to get a prompt from the Slack install disk. I can mount the partitions, e.g. mount /dev/hdb2 /mnt/root or mount /dev/sda2 /mnt/root.
ls shows the directory structures, so I bet I can save the files. However, doing
reiserfsck --check /mnt/root fails:
bread: Cannot read the block(2) (is a directory)
reiserfs-open: bread failed reading block 2

Same for block 16

reiserfs-open: the reiserfs superblock cannot be found on /mnt/root
Failed to open the filesystem. If the partition table has not been changed, and the partition is valid and it really contains a reiserfs partition, the superblock is corrupted and you need to run this utility with --rebuild-sb

So when I do that, I get

bread: Cannot read the block(2) (is a directory)
reiserfs-open: bread failed reading block 2

Same for block 16

reiserfs-open: the reiserfs superblock cannot be found on /mnt/root
rebuild_sb: cannot open device /mnt/root


---------------------------

So I am not sure what is the best thing to do now:
Since I have some unused partitions (I like to try new distributions and kernel versions, and I was waiting for Slack 13), I can install Slack 12 on an unused partition, copy everything I need there, then
try to recover the old partitions. If that works, fine; if not,
a brand new installation over them would also be OK.
I don't think this will create any problems, since my partitions are
unusable as they are right now. But if there is a hardware problem,
I'd rather find out and replace the disks than install fresh on bad disks.

Any suggestions?

Skaperen 06-19-2009 02:10 PM

Quote:

Originally Posted by svar (Post 3579884)
I was able to get a prompt from the Slack install disk. I can mount the partitions, e.g. mount /dev/hdb2 /mnt/root or mount /dev/sda2 /mnt/root.
ls shows the directory structures, so I bet I can save the files. However, doing
reiserfsck --check /mnt/root fails:
bread: Cannot read the block(2) (is a directory)
reiserfs-open: bread failed reading block 2

You need to run reiserfsck on the device node of the filesystem (e.g. /dev/hdb2) instead of the directory of the mount point. This needs to be done when the filesystem is NOT mounted. So be sure to unmount it before running reiserfsck or just don't mount it after a reboot.

Code:

reiserfsck --check /dev/hdb2
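
To make the "not mounted" rule harder to forget, the check can be wrapped in a small guard. This is a minimal sketch, not part of reiserfsprogs: the function names is_mounted and safe_check are made up for illustration, it assumes the mount table is readable at /proc/mounts, and it only prints the command it would run instead of running it.

```shell
#!/bin/sh
# is_mounted (hypothetical helper): succeed if device $1 appears as a mount
# source in the mount table $2 (defaults to /proc/mounts).
is_mounted() {
    dev=$1
    table=${2:-/proc/mounts}
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$table"
}

# safe_check (hypothetical helper): refuse to proceed on a mounted device.
# Dry run only: it prints the reiserfsck command rather than executing it.
safe_check() {
    dev=$1
    table=${2:-/proc/mounts}
    if is_mounted "$dev" "$table"; then
        echo "refusing: $dev is mounted; unmount it first" >&2
        return 1
    fi
    echo "would run: reiserfsck --check $dev"
}
```

With /dev/hdb2 still mounted on /mnt/root, safe_check /dev/hdb2 would refuse; after umount /mnt/root it would print the command to run by hand.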


All times are GMT -5.