LinuxQuestions.org


tigerflag 01-31-2004 11:57 AM

Need to find source of hard crash that caused LILO damage
 
I built a computer for someone and have been running it for a week until she picks it up, just to be sure it all works OK. She's never used Linux so I gave her Mandrake 9.1, since I've known it to be very stable.

I just experienced a HARD system crash, and worse. Keyboard and mouse totally froze. I had to hit the reset button to turn it off and reboot. When it rebooted, LILO gave a second-stage error message "LI 99 99 99..." indicating the boot map was corrupted in some way. I could only boot up using an emergency boot floppy. Once up, I ran LILO from the command line, rebooted again, and everything "seems" fine again.

I used components identical or similar to my own box because they've always been reliable. Nothing cutting-edge, external hardware serial modem, PCI soundcard, PS2 mouse and keyboard, Antec True Power 400 Watt PS, running VERY cool...
LILO is configured with acpi=off.

At the time it crashed, I was in X (KDE) and online (KPPP), using Opera and playing .wav files from the hard drive using XMMS. That's ALL that was running. I have aRts set up to NOT run at startup, since that's the only way to get sound right in certain games (oddly enough, though, the login music always plays when I start KDE).
XMMS only works with the OSS sound driver, although I use my PCI soundcard with the default ALSA drivers.

I really want to know what happened so I can prevent it from happening again. Is it a problem with XMMS, or Supermount, or the setpci command (see below)? This woman has hardly used a computer and I want it to be as trouble-free as possible for her. Here's some info but please tell me if I need to post something else. Thanks for taking the time to read this. Siri Amrit

Mandrake 9.1 using kernel 2.4.21-0.13mdk and ReiserFS.
Duron 1.4 processor
PC-Chips M811LU MoBo with 256 MB Crucial PC-2100 RAM.

Here's the output from lspci:

[root@localhost norma]# lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8366/A/7 [Apollo KT266/A/333]
00:01.0 PCI bridge: VIA Technologies, Inc. VT8366/A/7 [Apollo KT266/A/333 AGP]
00:0c.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
00:10.0 USB Controller: VIA Technologies, Inc. USB (rev 80)
00:10.1 USB Controller: VIA Technologies, Inc. USB (rev 80)
00:10.2 USB Controller: VIA Technologies, Inc. USB (rev 80)
00:10.3 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 82)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8235 ISA Bridge
00:11.1 IDE interface: VIA Technologies, Inc. VT82C586/B/686A/B PIPC Bus Master IDE (rev 06)
01:00.0 VGA compatible controller: nVidia Corporation NV17 [GeForce4 MX 440] (rev a3)

Here's the /etc/fstab:

/dev/hda1 / reiserfs notail 1 1
none /dev/pts devpts mode=0620 0 0
/dev/hda5 /home reiserfs notail 1 2
none /mnt/cdrom supermount dev=/dev/scd0,fs=auto,ro,--,iocharset=iso8859-1,codepage=850,umask=0 0 0
none /mnt/floppy supermount dev=/dev/fd0,fs=auto,--,iocharset=iso8859-1,sync,codepage=850,umask=0 0 0
none /proc proc defaults 0 0
/dev/hda2 swap swap defaults 0 0
/dev/scsi/host0/bus0/target0/lun0/cd /mnt/cdrecorder auto ro,noauto,user,exec 0 0


FWIW, I put the following Athlon/Duron Power-Saving command in /etc/rc.d/rc.local:

#Athlon/Duron Power-Saving with KT266A Chipset:

setpci -v -H1 -s 0:0.0 92=$(printf %x $((0x$(setpci -H1 -s 0:0.0 92) | 0x80)))

setpci -v -H1 -s 0:0.0 95=$(printf %x $((0x$(setpci -H1 -s 0:0.0 95) | 0x02)))
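
For anyone reading along: each of those lines reads one byte from the config space of device 0:0.0 (the host bridge in the lspci output), ORs one bit into it, and writes the result back. The arithmetic on its own can be sketched as below. This is only an illustration: `or_hex` is a made-up helper name, the input values 1f and 10 are invented rather than read from this board, and nothing here touches the hardware.

```shell
#!/bin/sh
# or_hex VALUE MASK
#   VALUE: bare hex digits, the form setpci prints (e.g. 1f)
#   MASK:  a shell arithmetic literal (e.g. 0x80)
# Prints VALUE | MASK as bare hex, the form setpci accepts after "REG=".
or_hex() {
    printf '%x\n' $(( 0x$1 | $2 ))
}

# Hypothetical register contents, for illustration only.
or_hex 1f 0x80   # sets bit 7, the mask used for register 92; prints 9f
or_hex 10 0x02   # sets bit 1, the mask used for register 95; prints 12
```

Because the operation is an OR, running it twice gives the same value as running it once, so repeating rc.local should not change anything further.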

At the time it crashed, the CPU was running at 25 degrees C (about 77F).

Here are some things I found in /var/log/syslog just before the system crashed:

Jan 31 08:20:33 localhost kernel: DROPPED IN=ppp0 OUT= MAC= SRC=216.203.251.26 DST=216.203.252.210 LEN=48 TOS=0x00 PREC=0x00 TTL=125 ID=8867 DF PROTO=TCP SPT=3327 DPT=135 SEQ=1395750106 ACK=0 WINDOW=8160 RES=0x00 SYN URGP=0 OPT (0204055001010402)
Jan 31 08:20:33 localhost kernel: ABORTED IN=ppp0 OUT= MAC= SRC=64.73.96.21 DST=216.203.252.210 LEN=40 TOS=0x00 PREC=0x00 TTL=46 ID=2675 DF PROTO=TCP SPT=80 DPT=1321 SEQ=212465311 ACK=0 WINDOW=24840 RES=0x00 RST URGP=0
Jan 31 08:20:34 localhost kernel: sr0: CDROM not ready. Make sure there is a disc in the drive.
Jan 31 08:20:57 localhost last message repeated 23 times
Jan 31 08:20:57 localhost kernel: DROPPED IN=ppp0 OUT= MAC= SRC=216.200.6.3 DST=216.203.252.210 LEN=92 TOS=0x00 PREC=0x00 TTL=114 ID=28556 PROTO=ICMP TYPE=8 CODE=0 ID=512 SEQ=13008
Jan 31 08:20:58 localhost kernel: sr0: CDROM not ready. Make sure there is a disc in the drive.
Jan 31 08:21:29 localhost last message repeated 31 times
Jan 31 08:21:32 localhost last message repeated 3 times
Jan 31 08:21:32 localhost kernel: DROPPED IN=ppp0 OUT= MAC= SRC=216.205.78.234 DST=216.203.252.210 LEN=92 TOS=0x00 PREC=0x00 TTL=112 ID=13528 PROTO=ICMP TYPE=8 CODE=0 ID=512 SEQ=20968
Jan 31 08:21:33 localhost kernel: sr0: CDROM not ready. Make sure there is a disc in the drive.
Jan 31 08:22:04 localhost last message repeated 31 times
Jan 31 08:23:05 localhost last message repeated 61 times
Jan 31 08:23:37 localhost last message repeated 32 times
Jan 31 08:23:37 localhost kernel: ABORTED IN=ppp0 OUT= MAC= SRC=64.254.14.170 DST=216.203.252.210 LEN=40 TOS=0x00 PREC=0x00 TTL=41 ID=0 DF PROTO=TCP SPT=80 DPT=1340 SEQ=3123938492 ACK=0 WINDOW=0 RES=0x00 RST URGP=0
Jan 31 08:23:38 localhost kernel: sr0: CDROM not ready. Make sure there is a disc in the drive.
Jan 31 08:23:39 localhost kernel: sr0: CDROM not ready. Make sure there is a disc in the drive.
Jan 31 08:23:39 localhost kernel: ABORTED IN=ppp0 OUT= MAC= SRC=64.254.14.170 DST=216.203.252.210 LEN=40 TOS=0x00 PREC=0x00 TTL=41 ID=0 DF PROTO=TCP SPT=80 DPT=1340 SEQ=3123938493 ACK=0 WINDOW=0 RES=0x00 RST URGP=0
Jan 31 08:23:40 localhost kernel: sr0: CDROM not ready. Make sure there is a disc in the drive.
Jan 31 08:24:11 localhost last message repeated 31 times
Jan 31 08:25:12 localhost last message repeated 61 times
Jan 31 08:26:13 localhost last message repeated 61 times
Jan 31 08:27:14 localhost last message repeated 61 times
Jan 31 08:28:15 localhost last message repeated 61 times
Jan 31 08:28:26 localhost last message repeated 11 times
Jan 31 08:28:26 localhost kernel: DROPPED IN=ppp0 OUT= MAC= SRC=216.205.87.164 DST=216.203.252.210 LEN=92 TOS=0x00 PREC=0x00 TTL=112 ID=47018 PROTO=ICMP TYPE=8 CODE=0 ID=512 SEQ=44802
Jan 31 08:28:27 localhost kernel: sr0: CDROM not ready. Make sure there is a disc in the drive.
Jan 31 08:28:58 localhost last message repeated 31 times
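
Most of that excerpt is the "CDROM not ready" message arriving about once a second. A rough way to total those lines, including the "last message repeated N times" entries that syslog folds them into, is sketched below. `count_cdrom_not_ready` is a hypothetical helper name, and the guess that the supermount entry for /mnt/cdrom is what keeps polling the empty drive is an assumption; the log does not say who is asking.

```shell
#!/bin/sh
# count_cdrom_not_ready: read syslog text on stdin and print how many
# "sr0: CDROM not ready" messages it represents. A "last message repeated
# N times" line is counted only when the message before it was that one.
count_cdrom_not_ready() {
    awk '
        /last message repeated [0-9]+ times/ {
            if (prev_was_cdrom) {
                for (i = 1; i < NF; i++) {
                    if ($i == "repeated") total += $(i + 1)
                }
            }
            next
        }
        /sr0: CDROM not ready/ { total++; prev_was_cdrom = 1; next }
        { prev_was_cdrom = 0 }
        END { print total + 0 }
    '
}

# Usage, with the log path from the post:
#   count_cdrom_not_ready < /var/log/syslog
```

A count in the hundreds over a few minutes would at least confirm that something is hitting the drive constantly. It does not show that this caused the freeze.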

spuzzzzzzz 01-31-2004 07:25 PM

I'm afraid I can't help with the crash, but:
Do you mean the OSS drivers for XMMS (not OGG)?
If you do, there is an ALSA plugin for XMMS that gives you the option of using ALSA for output.

spuzzzzzzz 01-31-2004 10:18 PM

btw, if you have a corrupted boot map, that suggests that you're mounting your /boot directory at boot time. Don't. Put the /boot directory in a separate partition (if you haven't already done so) and add the option "noauto" in /etc/fstab. This will protect your boot directory from damage.

tigerflag 02-01-2004 12:28 AM

You're right, I meant OSS. I'll edit that, and get the ALSA plugin.

I don't have a separate /boot partition, only / and /home. /boot is inside /. Could you explain the steps for doing this a little more fully, please? I've never made new partitions and moved contents into them on a running system before; I've only done it when installing a new system. I have read the man pages for lilo and lilo.conf but it's still confusing.

Here's /etc/lilo.conf:

boot=/dev/hda
map=/boot/map
vga=normal
default="LinuxMandrake"
keytable=/boot/us.klt
prompt
nowarn
timeout=300
message=/boot/message
menu-scheme=wb:bw:wb:bw
image=/boot/vmlinuz

label="LinuxMandrake"
root=/dev/hda1
initrd=/boot/initrd.img
append="quiet devfs=mount hdc=ide-scsi acpi=off"
vga=788
read-only
image=/boot/vmlinuz

label="linux-nonfb"
root=/dev/hda1
initrd=/boot/initrd.img
append="devfs=mount hdc=ide-scsi acpi=off"
read-only
image=/boot/vmlinuz

label="failsafe"
root=/dev/hda1
initrd=/boot/initrd.img
append="failsafe devfs=nomount hdc=ide-scsi acpi=off"
read-only

other=/dev/fd0
label="floppy"
unsafe

Thanks!

spuzzzzzzz 02-01-2004 12:43 AM

Your lilo.conf looks fine. The only change you should make is for readability. Put a blank line before each instance of <label="xxx">. This will allow you to see the boot options more clearly if you ever need to edit them.
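
If you just want a quick overview of the boot entries without reformatting the file, something like this would list the labels. It's a minimal sketch: `lilo_labels` is a name made up for the example, and it only looks at lines beginning with label=, nothing else in the file.

```shell
#!/bin/sh
# lilo_labels [FILE...]: print the value of every label= line, one per
# line, with surrounding quotes and whitespace removed. Reads stdin when
# no file is given.
lilo_labels() {
    awk -F= '
        /^[ \t]*label[ \t]*=/ {
            v = $2
            gsub(/"/, "", v)                 # drop the quotes
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim stray whitespace
            print v
        }
    ' "$@"
}

# Usage: lilo_labels /etc/lilo.conf
```

The value on the default= line should match one of the printed labels.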

I have never repartitioned the hard drive in a running system, but I know it can be done without damaging the data. There was a thread recently about a program called "parted" which can apparently do useful things. Your hard drive should look something like this, once you've repartitioned:
/dev/hda1 mounted on /boot, size 15MB, filesystem ext2
/dev/hda2 mounted on /, size whatever you want, filesystem reiserfs (or ext3)
/dev/hda3 mounted on /home, size whatever you want, filesystem reiserfs (or ext3)
/dev/hda4 size twice your RAM, no filesystem, no mount point

(the order of the partitions doesn't have to be the same, but your /boot partition should be first)
You will then need to change /etc/fstab (although from memory, you might be able to use diskdrake in the Control Center) to reflect the changes. Make sure the /boot partition has the option "noauto". This will prevent it from being mounted automatically, which will make its data safe in the event of a system crash.

tigerflag 02-01-2004 09:37 AM

Thanks again!
So, once I make the /boot partition, I take the data from /boot that's now inside of / and just plop it in there? And, I make /boot bootable? I know this sounds dumb to ask, but I spent a long time tweaking this system, doing all the configurations for this woman, and I don't want to screw up and have to start all over.

****************************************************
Some scientists claim that hydrogen, because it is so plentiful, is the basic building block of the universe. I dispute that. I say there is more stupidity than hydrogen, and that is the basic building block of the universe.

Frank Zappa
****************************************************

spuzzzzzzz 02-01-2004 04:14 PM

Basically, yes. But if you install LILO in the MBR (which is the default, I think) you don't need to make /boot bootable. Just don't forget to:
1. Add "noauto" to the boot partition line in /etc/fstab.
2. Run "lilo" again once you've finished, to make sure it recognises the changes.
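
A quick way to double-check step 1 afterwards could look like the sketch below. `boot_is_noauto` is a hypothetical helper, and it assumes the usual fstab layout where the second field is the mount point and the fourth is the comma-separated option list.

```shell
#!/bin/sh
# boot_is_noauto [FSTAB]: exit 0 if FSTAB (default /etc/fstab) has a
# /boot entry whose options include "noauto", non-zero otherwise.
boot_is_noauto() {
    awk '
        $1 !~ /^#/ && $2 == "/boot" {
            found = 1
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) {
                if (opts[i] == "noauto") ok = 1
            }
        }
        END { exit !(found && ok) }
    ' "${1:-/etc/fstab}"
}

# Usage:
#   boot_is_noauto && echo "/boot will not be mounted automatically"
```

Keep in mind that with "noauto" the partition has to be mounted by hand (mount /boot) before running lilo, because lilo reads the kernel images from /boot and writes its map file there.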

