LinuxQuestions.org

jkassemi 05-19-2005 09:30 AM

Tracking down what caused the crash...
 
Hello,

I've been having a problem for a while, but decided it was finally worth looking into. My computer crashes every now and then, and I'd like to know what to do about it...

I run a Slackware 10.1 system with the latest Dropline GNOME installation. The programs I generally have running are:

Eclipse (latest version),
Firefox 1.0.4 / Firefox latest trunk build (whatever I feel like at different times)
XMMS 1.2.10
Azureus 2.1 - 2.3

I've had the crash occur at random times... I don't think it's Eclipse, because it happens even when Eclipse isn't running... With Windows I was told what the error was before/while the computer BSOD'd, but it currently just freezes and I have no way of accessing anything...

I could just run each one individually, check whether the crash occurs, and then narrow it down... If it doesn't happen, try running two at the same time to see if there's a resource conflict or something... But that just doesn't seem like a very effective method and there's got to be something better out there :).

I've tried Ctrl-Alt-Fx to switch to another tty, but the keyboard doesn't respond... the mouse doesn't move... Nothing... So then I have to turn off the computer and turn it back on again... I end up going through the disk check on my first partition, a restart, and a disk check on my second partition. It takes forever and I don't like it...

So... How do I figure out what the error was and where it originated?

Also, since I've been through the disk check so many times, I was wondering if somebody could explain to me what the following mean and how the system works:

partition : inode _ had zero dtime: deleted
and
partition: inode _ i_block is _, should be _. Fixed.

Should those be something I'm worried about? What the heck is an inode? What's zero dtime... and what's an i_block :)

I appreciate the help everyone,
James...

ps - dmesg doesn't work... I guess the log gets deleted on reboot?

freakyg 05-19-2005 11:52 AM

According to Webopedia:
Quote:

(ī´nōd) (n.) Data structures that contain information about files in Unix file systems that are created when a file system is created. Each file has an inode and is identified by an inode number (i-number) in the file system where it resides. inodes provide important information on files such as user and group ownership, access mode (read, write, execute permissions) and type.

There are a set number of inodes, which indicates the maximum number of files the system can hold.

A file's inode number can be found using the ls -i command, while the ls -l command will retrieve inode information.
http://www.webopedia.com <---Try it.
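
To make the quoted description a bit more concrete, here is a minimal Python sketch (not from the thread; the helper name `describe_inode` is made up for illustration). It reads roughly the same inode fields that `ls -i` and `ls -l` report, using the standard `os.stat` call.

```python
import os
import stat


def describe_inode(path):
    """Return a few inode fields for path (hypothetical helper, for illustration)."""
    info = os.stat(path)
    return {
        "inode": info.st_ino,                 # the i-number shown by `ls -i`
        "links": info.st_nlink,               # number of hard links to this inode
        "mode": stat.filemode(info.st_mode),  # type and permissions, as in `ls -l`
        "uid": info.st_uid,                   # owning user id
        "gid": info.st_gid,                   # owning group id
        "size": info.st_size,                 # size in bytes
    }


if __name__ == "__main__":
    # Describe the current directory; any existing path works.
    for key, value in describe_inode(".").items():
        print(f"{key}: {value}")
```

Two names that are hard links to the same file should report the same "inode" value, which is one way to see that the inode, not the filename, identifies the file.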

jkassemi 05-19-2005 12:33 PM

So they're like file ID numbers... Okay. I can understand that... I took a look for i_block and couldn't find anything though... I think I still like Wikipedia, which I forgot to look in before I posted... sorry...

Wikipedia brings up the stat program, which gives a little more information than ls -i...

Thanks for the info... But what about where that crash information is located :)

Take it easy,
James.

apolinsky 05-19-2005 12:36 PM

You could try looking at /var/log/messages for what was going on, but it does not sound like a software issue. It is more like hardware. Have you recently added some memory or something else to the computer?

jkassemi 05-19-2005 02:30 PM

Nope. No hardware changes... but I've been having the issue since I first installed Slackware on this system... So a hardware issue is entirely possible... Here are the /var/log/messages lines from today... I bolded the area that I believe the crash happened around...

Code:

May 19 00:07:33 localhost -- MARK --
May 19 00:27:33 localhost -- MARK --
May 19 00:47:33 localhost -- MARK --
May 19 01:07:33 localhost -- MARK --
May 19 01:27:33 localhost -- MARK --
May 19 01:47:33 localhost -- MARK --
May 19 02:07:34 localhost -- MARK --
May 19 02:27:34 localhost -- MARK --
May 19 02:47:34 localhost -- MARK --
May 19 03:07:34 localhost -- MARK --
May 19 03:27:34 localhost -- MARK --
May 19 03:47:34 localhost -- MARK --
May 19 04:07:34 localhost -- MARK --
May 19 04:27:34 localhost -- MARK --
May 19 04:47:34 localhost -- MARK --
May 19 05:07:34 localhost -- MARK --
May 19 05:27:34 localhost -- MARK --
May 19 05:47:34 localhost -- MARK --
May 19 06:07:34 localhost -- MARK --
May 19 06:27:34 localhost -- MARK --
May 19 06:47:34 localhost -- MARK --
May 19 06:47:34 localhost logger: ACPI action lid is not defined
May 19 07:05:55 localhost su(pam_unix)[1800]: session opened for user root by (uid=1000)
May 19 07:13:56 localhost su(pam_unix)[1800]: session closed for user root
May 19 07:22:34 localhost su(pam_unix)[2742]: session opened for user root by (uid=1000)
May 19 07:25:40 localhost su(pam_unix)[3832]: session opened for user root by (uid=1000)
May 19 07:30:06 localhost su(pam_unix)[3832]: session closed for user root
May 19 07:34:08 localhost su(pam_unix)[4292]: session opened for user root by (uid=1000)
May 19 07:34:22 localhost su(pam_unix)[4292]: session closed for user root
May 19 07:36:07 localhost su(pam_unix)[2742]: session closed for user root
May 19 07:47:34 localhost -- MARK --
May 19 08:13:23 localhost syslogd 1.4.1: restart.
May 19 08:13:24 localhost kernel: klogd 1.4.1, log source = /proc/kmsg started.
May 19 08:13:24 localhost kernel: BIOS-provided physical RAM map:
May 19 08:13:24 localhost kernel: 502MB LOWMEM available.
May 19 08:13:24 localhost kernel: DMI 2.3 present.
May 19 08:13:24 localhost kernel: Initializing CPU#0
May 19 08:13:24 localhost kernel: Using tsc for high-res timesource
May 19 08:13:24 localhost kernel: Memory: 505736k/515008k available (2871k kernel code, 8788k reserved, 1092k data, 168k init, 0k highmem)
May 19 08:13:24 localhost kernel: CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
May 19 08:13:24 localhost kernel: CPU: L2 Cache: 256K (64 bytes/line)
May 19 08:13:24 localhost kernel: Intel machine check architecture supported.
May 19 08:13:24 localhost kernel: Intel machine check reporting enabled on CPU#0.
May 19 08:13:24 localhost kernel: Enabling fast FPU save and restore... done.
May 19 08:13:24 localhost kernel: Enabling unmasked SIMD FPU exception support... done.
May 19 08:13:24 localhost kernel: Checking 'hlt' instruction... OK.
May 19 08:13:24 localhost kernel: NET: Registered protocol family 16
May 19 08:13:24 localhost kernel: PCI: PCI BIOS revision 2.10 entry at 0xfd87b, last bus=2
May 19 08:13:24 localhost kernel: PCI: Using configuration type 1
May 19 08:13:24 localhost kernel: mtrr: v2.0 (20020519)
May 19 08:13:24 localhost kernel: ACPI: Subsystem revision 20041105
May 19 08:13:24 localhost kernel: ACPI: Interpreter enabled
May 19 08:13:24 localhost kernel: ACPI: Using PIC for interrupt routing
May 19 08:13:24 localhost kernel: ACPI: PCI Root Bridge [PCI0] (00:00)
May 19 08:13:24 localhost kernel: ACPI: Embedded Controller [EC0] (gpe 24)
May 19 08:13:24 localhost kernel: Linux Plug and Play Support v0.97 (c) Adam Belay
May 19 08:13:24 localhost kernel: SCSI subsystem initialized
May 19 08:13:24 localhost kernel: usbcore: registered new driver hub
May 19 08:13:24 localhost kernel: PCI: Using ACPI for IRQ routing
May 19 08:13:24 localhost kernel: ** PCI interrupts are no longer routed automatically.  If this
May 19 08:13:24 localhost kernel: ** causes a device to stop working, it is probably because the
May 19 08:13:24 localhost kernel: ** driver failed to call pci_enable_device().  As a temporary
May 19 08:13:24 localhost kernel: ** workaround, the "pci=routeirq" argument restores the old
May 19 08:13:24 localhost kernel: ** behavior.  If this argument makes the device work again,
May 19 08:13:24 localhost kernel: ** please email the output of "lspci" to bjorn.helgaas@hp.com
May 19 08:13:24 localhost kernel: ** so I can fix the driver.
May 19 08:13:24 localhost kernel: Simple Boot Flag at 0x36 set to 0x1
May 19 08:13:24 localhost kernel: Machine check exception polling timer started.
May 19 08:13:24 localhost kernel: audit: initializing netlink socket (disabled)
May 19 08:13:24 localhost kernel: Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
May 19 08:13:24 localhost kernel: NTFS driver 2.1.22 [Flags: R/W].
May 19 08:13:24 localhost kernel: Initializing Cryptographic API
May 19 08:13:24 localhost kernel: ATI Northbridge, reserving I/O ports 0x3b0 to 0x3bb.
May 19 08:13:24 localhost kernel: Activating ISA DMA hang workarounds.
May 19 08:13:24 localhost kernel: vesafb: framebuffer at 0xec000000, mapped to 0xe0080000, using 3072k, total 8128k
May 19 08:13:24 localhost kernel: vesafb: mode is 1024x768x16, linelength=2048, pages=4
May 19 08:13:24 localhost kernel: vesafb: protected mode interface info at c000:51a9
May 19 08:13:24 localhost kernel: vesafb: scrolling: redraw
May 19 08:13:24 localhost kernel: vesafb: Truecolor: size=0:5:6:5, shift=0:11:5:0
May 19 08:13:24 localhost kernel: fb0: VESA VGA frame buffer device
May 19 08:13:24 localhost kernel: ACPI: AC Adapter [ACAD] (on-line)
May 19 08:13:24 localhost kernel: ACPI: Battery Slot [BAT1] (battery absent)
May 19 08:13:24 localhost kernel: ACPI: Power Button (FF) [PWRF]
May 19 08:13:24 localhost kernel: ACPI: Lid Switch [LID]
May 19 08:13:24 localhost kernel: ACPI: Processor [CPU0] (supports C1 C2)
May 19 08:13:24 localhost kernel: ACPI: Thermal Zone [THRM] (65 C)
May 19 08:13:24 localhost kernel: lp: driver loaded but no devices found
May 19 08:13:24 localhost kernel: serio: i8042 AUX port at 0x60,0x64 irq 12
May 19 08:13:24 localhost kernel: serio: i8042 KBD port at 0x60,0x64 irq 1
May 19 08:13:24 localhost kernel: Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
May 19 08:13:24 localhost kernel: ACPI: PCI interrupt 0000:00:08.0[A] -> GSI 3 (level, low) -> IRQ 3
May 19 08:13:24 localhost kernel: parport0: PC-style at 0x378 (0x778) [PCSPP(,...)]
May 19 08:13:24 localhost kernel: parport0: irq 7 detected
May 19 08:13:24 localhost kernel: lp0: using parport0 (polling).
May 19 08:13:24 localhost kernel: io scheduler noop registered
May 19 08:13:24 localhost kernel: io scheduler anticipatory registered
May 19 08:13:24 localhost kernel: io scheduler deadline registered
May 19 08:13:24 localhost kernel: io scheduler cfq registered
May 19 08:13:24 localhost kernel: elevator: using anticipatory as default io scheduler
May 19 08:13:24 localhost kernel: FDC 0 is a post-1991 82077
May 19 08:13:24 localhost kernel: natsemi dp8381x driver, version 1.07+LK1.0.17, Sep 27, 2002
May 19 08:13:24 localhost kernel:  originally by Donald Becker <becker@scyld.com>
May 19 08:13:24 localhost kernel:  http://www.scyld.com/network/natsemi.html
May 19 08:13:24 localhost kernel:  2.4.x kernel port by Jeff Garzik, Tjeerd Mulder
May 19 08:13:24 localhost kernel: ACPI: PCI interrupt 0000:00:12.0[A] -> GSI 11 (level, low) -> IRQ 11
May 19 08:13:24 localhost kernel: natsemi eth0: NatSemi DP8381[56] at 0xe4003000 (0000:00:12.0), 00:0b:cd:a5:26:3d, IRQ 11, port TP.
May 19 08:13:24 localhost kernel: Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
May 19 08:13:24 localhost kernel: ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
May 19 08:13:24 localhost kernel: hda: max request size: 128KiB
May 19 08:13:24 localhost kernel: hda: 78140160 sectors (40007 MB) w/1768KiB Cache, CHS=65535/16/63
May 19 08:13:24 localhost kernel:  hda: hda1 hda2 < hda5 hda6 hda7 >
May 19 08:13:24 localhost kernel: Uniform CD-ROM driver Revision: 3.20
May 19 08:13:24 localhost kernel: ieee1394: raw1394: /dev/raw1394 device initialized
May 19 08:13:24 localhost kernel: USB Universal Host Controller Interface driver v2.2
May 19 08:13:24 localhost kernel: usbcore: registered new driver usblp
May 19 08:13:24 localhost kernel: drivers/usb/class/usblp.c: v0.13: USB Printer Device Class driver
May 19 08:13:24 localhost kernel: Initializing USB Mass Storage driver...
May 19 08:13:24 localhost kernel: usbcore: registered new driver usb-storage
May 19 08:13:24 localhost kernel: USB Mass Storage support registered.
May 19 08:13:24 localhost kernel: mice: PS/2 mouse device common for all mice
May 19 08:13:24 localhost kernel: input: AT Translated Set 2 keyboard on isa0060/serio0
May 19 08:13:24 localhost kernel: input: ImPS/2 Generic Wheel Mouse on isa0060/serio1
May 19 08:13:24 localhost kernel: Advanced Linux Sound Architecture Driver Version 1.0.6 (Sun Aug 15 07:17:53 2004 UTC).
May 19 08:13:24 localhost kernel: ACPI: PCI interrupt 0000:00:06.0[A] -> GSI 5 (level, low) -> IRQ 5
May 19 08:13:24 localhost kernel: ALSA device list:
May 19 08:13:24 localhost kernel:  #0: ALI 5451 at 0x8400, irq 5
May 19 08:13:24 localhost kernel: NET: Registered protocol family 2
May 19 08:13:24 localhost kernel: IP: routing cache hash table of 4096 buckets, 32Kbytes
May 19 08:13:24 localhost kernel: TCP: Hash tables configured (established 32768 bind 65536)
May 19 08:13:24 localhost kernel: ipt_recent v0.3.1: Stephen Frost <sfrost@snowman.net>.  http://snowman.net/projects/ipt_recent/
May 19 08:13:24 localhost kernel: NET: Registered protocol family 1
May 19 08:13:24 localhost kernel: NET: Registered protocol family 17
May 19 08:13:24 localhost kernel: powernow: PowerNOW! Technology present. Can scale: frequency and voltage.
May 19 08:13:24 localhost kernel: powernow: SGTC: 13333
May 19 08:13:24 localhost kernel: powernow: Minimum speed 530 MHz. Maximum speed 1789 MHz.
May 19 08:13:24 localhost kernel: Freeing unused kernel memory: 168k freed
May 19 08:13:24 localhost kernel: Adding 441748k swap on /dev/hda6.  Priority:-1 extents:1
May 19 08:13:24 localhost kernel: NTFS volume version 3.1.
May 19 08:13:24 localhost kernel: Linux Kernel Card Services
May 19 08:13:24 localhost kernel:  options:  [pci] [cardbus] [pm]
May 19 08:13:25 localhost kernel: ACPI: PCI interrupt 0000:00:0a.0[A] -> GSI 11 (level, low) -> IRQ 11
May 19 08:13:25 localhost kernel: Yenta: CardBus bridge found at 0000:00:0a.0 [0000:0000]
May 19 08:13:25 localhost kernel: Yenta O2: res at 0x94/0xD4: 00/ea
May 19 08:13:25 localhost kernel: Yenta O2: enabling read prefetch/write burst
May 19 08:13:25 localhost kernel: Yenta: ISA IRQ mask 0x0498, PCI irq 11
May 19 08:13:25 localhost kernel: Socket status: 30000821
May 19 08:13:26 localhost cardmgr[2256]: watching 1 socket
May 19 08:13:26 localhost kernel: ACPI: PCI interrupt 0000:00:02.0[A] -> GSI 10 (level, low) -> IRQ 10
May 19 08:13:26 localhost kernel: ohci_hcd 0000:00:02.0: ALi Corporation USB 1.1 Controller
May 19 08:13:26 localhost kernel: ohci_hcd 0000:00:02.0: irq 10, pci mem 0xe4000000
May 19 08:13:26 localhost kernel: ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 1
May 19 08:13:26 localhost kernel: hub 1-0:1.0: USB hub found
May 19 08:13:26 localhost kernel: hub 1-0:1.0: 4 ports detected
May 19 08:13:27 localhost kernel: usb 1-2: new low speed USB device using ohci_hcd and address 2
May 19 08:13:28 localhost kernel: input: USB HID v1.10 Mouse [Microsoft Microsoft 3-Button Mouse with IntelliEye(TM)] on usb-0000:00:02.0-2
May 19 08:13:28 localhost kernel: usbcore: registered new driver usbhid
May 19 08:13:28 localhost kernel: drivers/usb/input/hid-core.c: v2.0:USB HID core driver
May 19 08:13:31 localhost logger: /etc/rc.d/rc.inet1:  /sbin/ifconfig lo 127.0.0.1
May 19 08:13:31 localhost logger: /etc/rc.d/rc.inet1:  /sbin/route add -net 127.0.0.0 netmask 255.0.0.0 lo
May 19 08:13:31 localhost logger: /etc/rc.d/rc.inet1:  /sbin/dhcpcd -d -t 10 eth0
May 19 08:13:31 localhost kernel: eth0: DSPCFG accepted after 0 usec.
May 19 08:13:41 localhost kernel: eth0: remaining active for wake-on-lan
May 19 08:13:41 localhost logger: /etc/rc.d/rc.inet1:  /sbin/route add default gw 192.168.0.1 metric 1
May 19 08:13:41 localhost logger: SIOCADDRT: Network is unreachable
May 19 08:13:41 localhost logger: /etc/rc.d/rc.hotplug start (entering script)
May 19 08:13:42 localhost logger: /etc/rc.d/rc.hotplug start (exiting script)
May 19 08:13:43 localhost sshd[3168]: Server listening on 0.0.0.0 port 22.
May 19 08:14:05 localhost kernel: ndiswrapper version 1.0 loaded (preempt=no,smp=no)
May 19 08:14:05 localhost /usr/sbin/gpm[3191]: imps2: Auto-detected intellimouse PS/2
May 19 08:14:06 localhost kernel: ndiswrapper: driver lstinds (Linksys,03/10/2004,6.0.0.18) added
May 19 08:14:06 localhost kernel: ACPI: PCI interrupt 0000:02:00.0[A] -> GSI 11 (level, low) -> IRQ 11
May 19 08:14:06 localhost kernel: ndiswrapper: using irq 11
May 19 08:14:07 localhost kernel: wlan0: ndiswrapper ethernet device 00:0f:66:96:63:a4 using driver lstinds
May 19 08:14:07 localhost kernel: wlan0: encryption modes supported: WEP, WPA with TKIP
May 19 08:14:07 localhost logger: /etc/rc.d/rc.inet1:  /sbin/dhcpcd -d -t 10 eth0
May 19 08:14:07 localhost kernel: eth0: DSPCFG accepted after 0 usec.
May 19 08:14:17 localhost kernel: eth0: remaining active for wake-on-lan
May 19 08:14:17 localhost logger: /etc/rc.d/rc.inet1:  /sbin/route add default gw 192.168.0.1 metric 1
May 19 08:14:17 localhost logger: SIOCADDRT: Network is unreachable
May 19 08:14:26 localhost named[3301]: starting BIND 9.3.1
May 19 08:14:27 localhost named[3301]: loading configuration from '/etc/named.conf'
May 19 08:14:27 localhost named[3301]: no IPv6 interfaces found
May 19 08:14:27 localhost named[3301]: listening on IPv4 interface lo, 127.0.0.1#53
May 19 08:14:27 localhost named[3301]: listening on IPv4 interface wlan0, 192.168.0.4#53
May 19 08:14:27 localhost named[3301]: command channel listening on 127.0.0.1#953
May 19 08:14:27 localhost named[3301]: zone 0.0.127.in-addr.arpa/IN: loaded serial 1997022700
May 19 08:14:27 localhost named[3301]: zone localhost/IN: loaded serial 42
May 19 08:14:27 localhost named[3301]: running
May 19 08:14:31 localhost fstab-sync[3454]: removed all generated mount points
May 19 08:14:36 localhost login(pam_unix)[3358]: session opened for user james by (uid=0)
May 19 08:14:56 localhost gconfd (james-3544): starting (version 2.10.0), pid 3544 user 'james'
May 19 08:14:57 localhost gconfd (james-3544): Resolved address "xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only configuration source at position 0
May 19 08:14:57 localhost gconfd (james-3544): Resolved address "xml:readwrite:/home/james/.gconf" to a writable configuration source at position 1
May 19 08:14:57 localhost gconfd (james-3544): Resolved address "xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only configuration source at position 2
May 19 08:15:14 localhost gconfd (james-3544): Resolved address "xml:readwrite:/home/james/.gconf" to a writable configuration source at position 0
May 19 08:17:55 localhost logger: ACPI group battery / action BAT1 is not defined
May 19 08:20:30 localhost logger: ACPI group battery / action BAT1 is not defined
May 19 08:33:23 localhost -- MARK --
May 19 08:53:23 localhost -- MARK --
May 19 09:13:23 localhost -- MARK --
May 19 09:33:23 localhost -- MARK --
May 19 09:53:23 localhost -- MARK --
May 19 10:13:23 localhost -- MARK --
May 19 10:15:15 localhost su(pam_unix)[4081]: session opened for user root by (uid=1000)
May 19 10:33:24 localhost -- MARK --
May 19 10:53:25 localhost -- MARK --
May 19 11:08:35 localhost logger: ACPI action lid is not defined
May 19 11:08:53 localhost su(pam_unix)[4081]: session closed for user root
May 19 11:09:05 localhost /usr/sbin/gpm[3191]: imps2: Auto-detected intellimouse PS/2
May 19 11:16:23 localhost su(pam_unix)[10453]: session opened for user root by (uid=1000)
May 19 11:18:33 localhost su(pam_unix)[10495]: session opened for user root by (uid=1000)
May 19 11:20:12 localhost su(pam_unix)[10495]: session closed for user root
May 19 11:33:25 localhost -- MARK --
May 19 11:45:03 localhost su(pam_unix)[11033]: session opened for user root by (uid=1000)
May 19 11:47:11 localhost su(pam_unix)[11033]: session closed for user root
May 19 11:52:34 localhost su(pam_unix)[11216]: session opened for user root by (uid=1000)
May 19 12:13:26 localhost -- MARK --
May 19 12:33:26 localhost -- MARK --
May 19 12:37:09 localhost su(pam_unix)[11216]: session closed for user root
May 19 12:53:26 localhost -- MARK --
May 19 13:03:20 localhost su(pam_unix)[10453]: session closed for user root
May 19 13:13:26 localhost -- MARK --
May 19 13:17:22 localhost su(pam_unix)[12982]: session opened for user root by (uid=1000)
May 19 13:22:29 localhost su(pam_unix)[13062]: session opened for user root by (uid=1000)
May 19 13:23:03 localhost su(pam_unix)[13062]: session closed for user root
May 19 13:23:11 localhost gconfd (root-13107): starting (version 2.10.0), pid 13107 user 'root'
May 19 13:23:11 localhost gconfd (root-13107): Resolved address "xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only configuration source at position 0
May 19 13:23:11 localhost gconfd (root-13107): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 1
May 19 13:23:11 localhost gconfd (root-13107): Resolved address "xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only configuration source at position 2

It's a laptop... and I haven't felt like messing around with it to fix the ACPI options for the lid closing and everything... It sits on my desk 24/7/365, so it doesn't really matter whether that stuff works... The monitor shuts down right when I close the lid... That's all that I care about there...

What's the -- MARK -- mean and how would I go about figuring out what hardware crashed the comp when there doesn't seem to be any correlation between any action of mine and the crash?

Thanks for the help,
James

apolinsky 05-19-2005 03:47 PM

The 'MARK' is just an indication that the logger is alive and had nothing else to log during that interval. It does not indicate a problem.
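
Since the MARK lines show when the logger was last alive, they can help bracket a hard freeze. The following is only a sketch, not something from the thread: it looks for each `syslogd ... restart` line and reports the timestamp of the entry just before it. The function names are made up, the sample lines are copied from the log posted above, and the year is an assumption because syslog timestamps do not include one. A clean reboot also restarts syslogd, so a reported window is a hint, not proof of a crash.

```python
from datetime import datetime

# Four lines copied from the /var/log/messages excerpt earlier in the thread.
SAMPLE_LOG = """\
May 19 07:36:07 localhost su(pam_unix)[2742]: session closed for user root
May 19 07:47:34 localhost -- MARK --
May 19 08:13:23 localhost syslogd 1.4.1: restart.
May 19 08:13:24 localhost kernel: klogd 1.4.1, log source = /proc/kmsg started.
"""


def parse_timestamp(line, year=2005):
    """Parse the leading 'May 19 07:47:34' stamp; the year is assumed, not logged."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def crash_windows(lines):
    """Yield (last_entry_time, restart_time) for every 'syslogd ... restart' line."""
    previous = None
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        stamp = parse_timestamp(line)
        if "syslogd" in line and "restart" in line and previous is not None:
            yield previous, stamp
        previous = stamp


if __name__ == "__main__":
    for last_seen, restarted in crash_windows(SAMPLE_LOG.splitlines()):
        print(f"last entry {last_seen}, logging restarted {restarted}")
```

On the sample, the window runs from the 07:47:34 MARK to the 08:13:23 restart. The marks in the posted log are 20 minutes apart, so the absence of one around 08:07:34 would suggest the freeze happened before then, though that is an inference rather than something the log states.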

I think ACPI may be the problem. Please look at this line: ACPI: Thermal Zone [THRM] (65 C). I notice you are using the machine without a battery. Keeping it plugged in all the time will make it run hotter. Perhaps ACPI is not responding properly to temperature. As a test, would it be possible to disable ACPI?
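
For comparing that reading across several boots, a small Python sketch like the one below could pull every thermal zone line out of the log. It is an illustration only: the function name is hypothetical, and the regular expression simply assumes the line format shown in the log posted above. Note that this is the temperature reported once at boot, not a live reading.

```python
import re

# Matches kernel lines such as: ACPI: Thermal Zone [THRM] (65 C)
THERMAL_RE = re.compile(r"ACPI: Thermal Zone \[(\w+)\] \((\d+) C\)")


def thermal_readings(lines):
    """Return (zone, degrees_celsius) pairs found in kernel log lines."""
    readings = []
    for line in lines:
        match = THERMAL_RE.search(line)
        if match:
            readings.append((match.group(1), int(match.group(2))))
    return readings


if __name__ == "__main__":
    sample = ["May 19 08:13:24 localhost kernel: ACPI: Thermal Zone [THRM] (65 C)"]
    print(thermal_readings(sample))
```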

jkassemi 05-20-2005 02:38 AM

Heh. I'll be honest... I don't understand all the ACPI stuff that well... and I probably should have spent a little more time working on getting the appropriate settings. It's highly unlikely (I hope) that the computer is running that hot...

I had ACPI enabled as well as the powersave governor with the Athlon PowerNow selected... I removed ACPI support, powersave governor support (kept the PowerNow), and recompiled the kernel... I'm running it now, and will post with an update if it happens again...

Thanks apolinsky,

James.

ps- What do you use to monitor your CPU temp? How do you know if it's accurate?

basileus 05-20-2005 02:55 AM

I know of only a couple of ways that a Linux box can crash (totally):

1) CPU overheating (dirty fans & heatsinks)
2) Filesystem errors (e.g. laptop switching itself on in the backpack :)
3) CPU frequency scaling / powersave features turned on in BIOS
4) Faulty software that accesses hardware directly (such as Nvidia GL drivers)

I doubt the problem is with applications... I've never managed to crash a Linux box (in 7 years' time) with any typical application.

Try turning off all powersaving features in BIOS, especially the CPU Frequency Scaling. I noticed on my laptop (Compaq Armada 100s) that if the "Clock run enable" was ticked, my 2.6 kernels would freeze in big file transfers. Powernowd and the BIOS "Clock run" were stepping on each other's toes.
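
Before changing anything, it may help to record what the kernel side of frequency scaling is currently set to. The Python sketch below is an assumption-laden illustration, not part of the original advice: it assumes the cpufreq sysfs interface is present under `/sys/devices/system/cpu/cpu0/cpufreq`, and the helper name `read_cpufreq` is made up. BIOS settings such as "Clock run" are not visible this way.

```python
from pathlib import Path

# Assumed sysfs location of the cpufreq interface; it is absent if cpufreq is not enabled.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")

FILES = ("scaling_governor", "scaling_cur_freq", "scaling_min_freq", "scaling_max_freq")


def read_cpufreq(base=CPUFREQ_DIR):
    """Return the governor and frequency files as a dict; unreadable files map to None."""
    settings = {}
    for name in FILES:
        try:
            settings[name] = (Path(base) / name).read_text().strip()
        except OSError:
            settings[name] = None  # file missing or not readable
    return settings


if __name__ == "__main__":
    for name, value in read_cpufreq().items():
        print(f"{name}: {value}")
```

If every value comes back as None, the kernel was probably built without cpufreq support or the driver is not loaded, which is itself useful to know when ruling out a conflict of the kind described above.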

jkassemi 05-21-2005 06:15 AM

Quote:

Originally posted by basileus
Try turning off all powersaving features in BIOS, especially the CPU Frequency Scaling. I noticed on my laptop (Compaq Armada 100s) that if the "Clock run enable" was ticked, my 2.6 kernels would freeze in big file transfers. Powernowd and the BIOS "Clock run" were stepping on each other's toes.

Okay... so my experiment with turning ACPI off was, predictably, short-lived. The fan ran all night, keeping me up... On top of that I could fry an egg on the keyboard (laptop)...

So, I turned ACPI on, but kept the CPU scaling and stuff off. I still have powernow running... I get a good deal of performance enhancement from it...

Everything's working fairly well so far... The computer's running nicely, and although the fan hasn't yet stopped, it isn't working full tilt anymore...

I cleaned the fans and all other moving parts today, but they weren't all that filthy to begin with... I'll keep running it this way and see what happens. Thanks everybody,

James.

zborgerd 06-12-2005 12:32 PM

This issue seems to be due to nVidia drivers with certain kernel combinations and specific nVidia cards. This is particularly common while running Mozilla-based programs (Firefox). I had this problem as well for a short time (as have a few others).

If your problems continue to occur, try switching to the open-source "nv" driver instead of the "nvidia" driver for a few days, and see if it resolves it. Alternatively, it may help to upgrade or downgrade the official nVidia driver.


All times are GMT -5.