RHEL 4 release 5 crashes randomly
Hi
My new RHEL 4 running on Dell PE T605 crashes randomly, sitting idle. Below is part of my messages file. Any help fixing the problem is appreciated.
____________________________________________________________
Jul 31 12:12:17 T605 syslogd 1.4.1: restart.
Jul 31 12:12:17 T605 syslog: syslogd startup succeeded
Jul 31 12:12:18 T605 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Jul 31 12:12:18 T605 kernel: Bootdata ok (command line is ro root=/dev/VolGroup00/LogVol00 rhgb quiet)
Jul 31 12:12:18 T605 kernel: Linux version 2.6.9-55.ELsmp (brewbuilder@hs20-bc2-4.build.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)) #1 SMP Fri Apr 20 16:36:54 EDT 2007
Jul 31 12:12:18 T605 kernel: BIOS-provided physical RAM map:
Jul 31 12:12:18 T605 kernel: BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
Jul 31 12:12:18 T605 kernel: BIOS-e820: 0000000000100000 - 00000000bfac0000 (usable)
Jul 31 12:12:18 T605 kernel: BIOS-e820: 00000000bfac0000 - 00000000bfad6000 (reserved)
Jul 31 12:12:18 T605 kernel: BIOS-e820: 00000000bfad6000 - 00000000bfaf5c00 (ACPI data)
Jul 31 12:12:18 T605 kernel: BIOS-e820: 00000000bfaf5c00 - 00000000c0000000 (reserved)
Jul 31 12:12:18 T605 kernel: BIOS-e820: 00000000f0000000 - 00000000f8000000 (reserved)
Jul 31 12:12:18 T605 kernel: BIOS-e820: 00000000fe000000 - 0000000100000000 (reserved)
Jul 31 12:12:18 T605 kernel: BIOS-e820: 0000000100000000 - 00000003ef600000 (usable)
Jul 31 12:12:18 T605 syslog: klogd startup succeeded
Jul 31 12:12:18 T605 kernel: Enabling SRAT NUMA discovery
Jul 31 12:12:18 T605 kernel: SRAT: PXM 0 -> APIC 0 -> Node 0
Jul 31 12:12:18 T605 kernel: SRAT: PXM 0 -> APIC 1 -> Node 0
Jul 31 12:12:18 T605 kernel: SRAT: PXM 0 -> APIC 2 -> Node 0
Jul 31 12:12:18 T605 kernel: SRAT: PXM 0 -> APIC 3 -> Node 0
Jul 31 12:12:18 T605 kernel: SRAT: PXM 1 -> APIC 4 -> Node 1
Jul 31 12:12:18 T605 kernel: SRAT: PXM 1 -> APIC 5 -> Node 1
Jul 31 12:12:18 T605 kernel: SRAT: PXM 1 -> APIC 6 -> Node 1
Jul 31 12:12:18 T605 kernel: SRAT: PXM 1 -> APIC 7 -> Node 1
Jul 31 12:12:18 T605 kernel: SRAT: Node 0 PXM 0 0-9ffff
Jul 31 12:12:18 T605 kernel: SRAT: Node 0 PXM 0 0-bfffffff
Jul 31 12:12:18 T605 kernel: SRAT: Node 0 PXM 0 0-23fffffff
Jul 31 12:12:18 T605 kernel: SRAT: Node 1 PXM 1 240000000-43fffffff
Jul 31 12:12:18 T605 kernel: Warning: acpi_table_parse(ACPI_SLIT) returned 0!
Jul 31 12:12:18 T605 kernel: Bootmem setup node 0 0000000000000000-000000023fffffff
Jul 31 12:12:18 T605 kernel: Bootmem setup node 1 0000000240000000-00000003ef5fffff
Jul 31 12:12:18 T605 irqbalance: irqbalance startup succeeded
Jul 31 12:12:18 T605 kernel: DMI 2.5 present.
Jul 31 12:12:18 T605 kernel: ServerWorks chipset detected. Disabling timer routi
-------------------------------------------------
[root@T605 log]# last
root     pts/3        cvs3             Thu Jul 31 13:33   still logged in
jj       pts/2        soso.c           Thu Jul 31 13:20   still logged in
jo       pts/1        soso1            Thu Jul 31 12:16 - 16:29  (04:13)
reboot   system boot  2.6.9-55.ELsmp   Thu Jul 31 12:12          (04:30)
root     :0                            Tue Jul 29 19:04 - crash (1+17:08)
jj       pts/2        soso.c           Tue Jul 29 16:22 - crash (1+19:49)
root     pts/1        cvs3             Tue Jul 29 14:51 - crash (1+21:20)
reboot   system boot  2.6.9-55.ELsmp   Tue Jul 29 14:41         (2+02:01)
jj       pts/3                         Tue Jul 29 10:49 - crash  (03:51)
jj       pts/3                         Tue Jul 29 09:49 - 10:49  (01:00)
jj       pts/2        10.50.1.20       Mon Jul 28 17:53 - crash  (20:47)
jj       pts/2        10.50.1.20       Mon Jul 28 17:51 - 17:53  (00:02)
jj       pts/2        10.50.1.20       Mon Jul 28 17:45 - 17:47  (00:01)
jm       pts/2        10.50.1.20       Mon Jul 28 16:48 - 16:49  (00:01)
jm       pts/2        10.50.1.20       Mon Jul 28 14:00 - 14:00  (00:00)
jm       pts/2        10.50.1.20       Mon Jul 28 13:55 - 14:00  (00:04)
jo       pts/1        10.50.1.8        Mon Jul 28 09:29 - crash (1+05:11)
root     pts/1        :0.0             Mon Jul 28 08:25 - 08:25  (00:00)
root     pts/1        :0.0             Mon Jul 28 07:44 - 08:00  (00:15)
root     :0                            Mon Jul 28 07:41 - 08:31  (00:49)
reboot   system boot  2.6.9-55.ELsmp   Mon Jul 28 07:39         (3+09:03)
I'd start the diagnostic process with a thorough memory test (e.g. from a live CD, or your distro might provide one in the boot menu).
After that I'd look at heat issues and potential disk problems.
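Before running the hardware tests it may also help to pin down when each crash happened, so the times can be compared against temperature or disk activity. The following is only a rough sketch, assuming `last` output in the same format as the listing in the original post; `summarize_last` is a hypothetical helper name, not part of any existing tool.

```python
import re

# Matches the login-time fields that `last` prints, e.g. "Tue Jul 29 19:04".
_TIME = re.compile(
    r"\b(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}\b"
)

def summarize_last(text):
    """Return (boots, crashed) extracted from `last` output.

    boots   -- boot timestamps taken from lines starting with "reboot"
    crashed -- (user, login_time) pairs for sessions that ended in "crash"
    """
    boots, crashed = [], []
    for line in text.splitlines():
        match = _TIME.search(line)
        if not match:
            continue  # skip blank lines and anything without a timestamp
        fields = line.split()
        if fields[0] == "reboot":
            boots.append(match.group(0))
        elif "- crash" in line:
            crashed.append((fields[0], match.group(0)))
    return boots, crashed

if __name__ == "__main__":
    import subprocess
    # Assumes the `last` command is available on the machine being checked.
    output = subprocess.run(["last"], capture_output=True, text=True).stdout
    boots, crashed = summarize_last(output)
    print("boots:", boots)
    print("sessions cut off by a crash:", crashed)
```

Every session marked "crash" was still open when the machine went down, so the crash itself happened some time between the latest such login and the next "reboot" line.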