LinuxQuestions.org

-   -   RHEL 4 release 5 crashes randomly (https://www.linuxquestions.org/questions/linux-software-2/rhel-4-release-5-crashes-randomly-659609/)

mehranalmasi 07-31-2008 06:49 PM

RHEL 4 release 5 crashes randomly
 
Hi,
My new RHEL 4 system, running on a Dell PE T605, crashes randomly, even while sitting idle. Below is part of my messages file.
Any help fixing the problem is appreciated.


____________________________________________________________
Jul 31 12:12:17 T605 syslogd 1.4.1: restart.
Jul 31 12:12:17 T605 syslog: syslogd startup succeeded
Jul 31 12:12:18 T605 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Jul 31 12:12:18 T605 kernel: Bootdata ok (command line is ro root=/dev/VolGroup00/LogVol00 rhgb quiet)
Jul 31 12:12:18 T605 kernel: Linux version 2.6.9-55.ELsmp (brewbuilder@hs20-bc2-4.build.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)) #1 SMP Fri Apr 20 16:36:54 EDT 2007
Jul 31 12:12:18 T605 kernel: BIOS-provided physical RAM map:
Jul 31 12:12:18 T605 kernel: BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
Jul 31 12:12:18 T605 kernel: BIOS-e820: 0000000000100000 - 00000000bfac0000 (usable)
Jul 31 12:12:18 T605 kernel: BIOS-e820: 00000000bfac0000 - 00000000bfad6000 (reserved)
Jul 31 12:12:18 T605 kernel: BIOS-e820: 00000000bfad6000 - 00000000bfaf5c00 (ACPI data)
Jul 31 12:12:18 T605 kernel: BIOS-e820: 00000000bfaf5c00 - 00000000c0000000 (reserved)
Jul 31 12:12:18 T605 kernel: BIOS-e820: 00000000f0000000 - 00000000f8000000 (reserved)
Jul 31 12:12:18 T605 kernel: BIOS-e820: 00000000fe000000 - 0000000100000000 (reserved)
Jul 31 12:12:18 T605 kernel: BIOS-e820: 0000000100000000 - 00000003ef600000 (usable)
Jul 31 12:12:18 T605 syslog: klogd startup succeeded
Jul 31 12:12:18 T605 kernel: Enabling SRAT NUMA discovery
Jul 31 12:12:18 T605 kernel: SRAT: PXM 0 -> APIC 0 -> Node 0
Jul 31 12:12:18 T605 kernel: SRAT: PXM 0 -> APIC 1 -> Node 0
Jul 31 12:12:18 T605 kernel: SRAT: PXM 0 -> APIC 2 -> Node 0
Jul 31 12:12:18 T605 kernel: SRAT: PXM 0 -> APIC 3 -> Node 0
Jul 31 12:12:18 T605 kernel: SRAT: PXM 1 -> APIC 4 -> Node 1
Jul 31 12:12:18 T605 kernel: SRAT: PXM 1 -> APIC 5 -> Node 1
Jul 31 12:12:18 T605 kernel: SRAT: PXM 1 -> APIC 6 -> Node 1
Jul 31 12:12:18 T605 kernel: SRAT: PXM 1 -> APIC 7 -> Node 1
Jul 31 12:12:18 T605 kernel: SRAT: Node 0 PXM 0 0-9ffff
Jul 31 12:12:18 T605 kernel: SRAT: Node 0 PXM 0 0-bfffffff
Jul 31 12:12:18 T605 kernel: SRAT: Node 0 PXM 0 0-23fffffff
Jul 31 12:12:18 T605 kernel: SRAT: Node 1 PXM 1 240000000-43fffffff
Jul 31 12:12:18 T605 kernel: Warning: acpi_table_parse(ACPI_SLIT) returned 0!
Jul 31 12:12:18 T605 kernel: Bootmem setup node 0 0000000000000000-000000023fffffff
Jul 31 12:12:18 T605 kernel: Bootmem setup node 1 0000000240000000-00000003ef5fffff
Jul 31 12:12:18 T605 irqbalance: irqbalance startup succeeded
Jul 31 12:12:18 T605 kernel: DMI 2.5 present.
Jul 31 12:12:18 T605 kernel: ServerWorks chipset detected. Disabling timer routi


-------------------------------------------------

[root@T605 log]# last
root pts/3 cvs3 Thu Jul 31 13:33 still logged in
jj pts/2 soso.c Thu Jul 31 13:20 still logged in
jo pts/1 soso1 Thu Jul 31 12:16 - 16:29 (04:13)
reboot system boot 2.6.9-55.ELsmp Thu Jul 31 12:12 (04:30)
root :0 Tue Jul 29 19:04 - crash (1+17:08)
jj pts/2 soso.c Tue Jul 29 16:22 - crash (1+19:49)
root pts/1 cvs3 Tue Jul 29 14:51 - crash (1+21:20)
reboot system boot 2.6.9-55.ELsmp Tue Jul 29 14:41 (2+02:01)
jj pts/3 Tue Jul 29 10:49 - crash (03:51)
jj pts/3 Tue Jul 29 09:49 - 10:49 (01:00)
jj pts/2 10.50.1.20 Mon Jul 28 17:53 - crash (20:47)
jj pts/2 10.50.1.20 Mon Jul 28 17:51 - 17:53 (00:02)
jj pts/2 10.50.1.20 Mon Jul 28 17:45 - 17:47 (00:01)
jm pts/2 10.50.1.20 Mon Jul 28 16:48 - 16:49 (00:01)
jm pts/2 10.50.1.20 Mon Jul 28 14:00 - 14:00 (00:00)
jm pts/2 10.50.1.20 Mon Jul 28 13:55 - 14:00 (00:04)
jo pts/1 10.50.1.8 Mon Jul 28 09:29 - crash (1+05:11)
root pts/1 :0.0 Mon Jul 28 08:25 - 08:25 (00:00)
root pts/1 :0.0 Mon Jul 28 07:44 - 08:00 (00:15)
root :0 Mon Jul 28 07:41 - 08:31 (00:49)
reboot system boot 2.6.9-55.ELsmp Mon Jul 28 07:39 (3+09:03)
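
For anyone reading along, the `last` output above can be boiled down to boot times and the sessions that ended in "crash", which makes the gap between crashes easier to see. This is only a rough sketch: the function name `summarize_last` is made up for illustration, and it assumes the single-space layout exactly as pasted above.

```python
import re

# Matches a timestamp such as "Thu Jul 31 12:12" as printed by `last`.
STAMP = re.compile(r"[A-Z][a-z]{2} [A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}")

def summarize_last(text):
    """Return (boot_times, crashed_sessions) parsed from `last` output.

    Hypothetical helper: assumes "reboot" lines carry the boot timestamp
    and that sessions cut short by an unclean shutdown contain "- crash".
    """
    boots, crashed = [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "reboot":
            match = STAMP.search(line)
            if match:
                boots.append(match.group(0))
        elif "- crash" in line and len(fields) > 1:
            # Record user and terminal, e.g. "root :0"
            crashed.append(fields[0] + " " + fields[1])
    return boots, crashed
```

Feeding it the text above gives three boot times (Jul 28, Jul 29, Jul 31), each preceded by sessions marked "crash", so the box went down uncleanly at least twice in about three days.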

matthewg42 08-01-2008 12:11 PM

I'd start the diagnostic process with a thorough memory test (e.g. from a live CD, or your distro might provide one in the boot menu).

After that I'd start looking at heat issues and potential disk problems.
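
While the memory test runs, it may also be worth scanning the messages file from just before each crash for hardware hints. Below is a minimal sketch of that idea. The keyword lists are my own guesses at what a kernel might log for memory, disk, or heat trouble, not an exhaustive or authoritative set, and `scan_log` is a made-up name.

```python
import re

# Assumed, non-exhaustive keywords; adjust to whatever your kernel logs.
HINTS = {
    "memory/CPU": re.compile(r"machine check|\bMCE\b|\bEDAC\b", re.IGNORECASE),
    "disk": re.compile(r"I/O error|medium error", re.IGNORECASE),
    "heat": re.compile(r"temperature|thermal|overheat", re.IGNORECASE),
}

def scan_log(lines):
    """Return (category, line) pairs for lines matching any hint pattern."""
    hits = []
    for line in lines:
        for category, pattern in HINTS.items():
            if pattern.search(line):
                hits.append((category, line.rstrip("\n")))
                break  # report each line once, under the first category hit
    return hits
```

To use it, open the log (probably /var/log/messages, though that path is an assumption here) and pass the file object to `scan_log`. Expect some false positives, since ordinary boot messages can mention thermal zones, and a clean result does not rule out bad hardware.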


All times are GMT -5.