RHEL 5.3 x86 server freezes.
Hi, I am facing a problem: my RHEL 5.3 x86 server freezes after some period of time. I found the error messages below in /var/log/messages; please have a look and suggest:
[root@ussdsmsc02 psp8_4]# tail -f /var/log/messages
Apr 23 10:40:14 ussdsmsc02 kernel: cpu 3 hot: high 186, batch 31 used:5
Apr 23 10:40:14 ussdsmsc02 avahi-daemon[5943]: Got SIGQUIT, quitting.
Apr 23 10:40:11 ussdsmsc02 hcid[3471]: Got disconnected from the system message bus
Apr 23 10:40:15 ussdsmsc02 snmpd[4143]: Connection from UDP: [127.0.0.1]:52594
Apr 23 10:40:15 ussdsmsc02 kernel: cpu 3 cold: high 62, batch 15 used:49
Apr 23 10:40:19 ussdsmsc02 kernel: cpu 4 hot: high 186, batch 31 used:27
Apr 23 10:40:19 ussdsmsc02 kernel: cpu 4 cold: high 62, batch 15 used:49
Apr 23 10:40:19 ussdsmsc02 kernel: cpu 5 hot: high 186, batch 31 used:8
Apr 23 10:40:19 ussdsmsc02 kernel: cpu 5 cold: high 62, batch 15 used:59
Apr 23 10:40:20 ussdsmsc02 kernel: cpu 6 hot: high 186, batch 31 used:14
Apr 23 10:40:20 ussdsmsc02 kernel: cpu 6 cold: high 62, batch 15 used:56
Apr 23 10:40:20 ussdsmsc02 kernel: cpu 7 hot: high 186, batch 31 used:9
Apr 23 10:40:20 ussdsmsc02 kernel: cpu 7 cold: high 62, batch 15 used:53
Apr 23 10:40:20 ussdsmsc02 kernel: HighMem per-cpu:
Apr 23 10:40:20 ussdsmsc02 kernel: cpu 0 hot: high 186, batch 31 used:175
Apr 23 10:40:20 ussdsmsc02 kernel: cpu 0 cold: high 62, batch 15 used:13
Apr 23 10:40:20 ussdsmsc02 kernel: cpu 1 hot: high 186, batch 31 used:142
Apr 23 10:40:20 ussdsmsc02 kernel: cpu 1 cold: high 62, batch 15 used:3
Apr 23 10:40:20 ussdsmsc02 kernel: cpu 2 hot: high 186, batch 31 used:162
Apr 23 10:40:36 ussdsmsc02 kernel: cpu 2 cold: high 62, batch 15 used:2
Apr 23 10:40:36 ussdsmsc02 hcid[3471]: Can't open system message bus connection: Failed to connect to socket /var/run/dbus/system_bus_socket: Connection refused
Apr 23 10:40:36 ussdsmsc02 kernel: cpu 3 hot: high 186, batch 31 used:96
Apr 23 10:40:37 ussdsmsc02 kernel: cpu 3 cold: high 62, batch 15 used:4
Apr 23 10:40:37 ussdsmsc02 kernel: cpu 4 hot: high 186, batch 31 used:67
Apr 23 10:40:37 ussdsmsc02 kernel: cpu 4 cold: high 62, batch 15 used:3
Apr 23 10:40:37 ussdsmsc02 kernel: cpu 5 hot: high 186, batch 31 used:101
Apr 23 10:40:37 ussdsmsc02 kernel: cpu 5 cold: high 62, batch 15 used:8
Apr 23 10:40:37 ussdsmsc02 kernel: cpu 6 hot: high 186, batch 31 used:177
Apr 23 10:40:37 ussdsmsc02 kernel: cpu 6 cold: high 62, batch 15 used:11
Apr 23 10:40:37 ussdsmsc02 kernel: cpu 7 hot: high 186, batch 31 used:74
Apr 23 10:40:37 ussdsmsc02 kernel: cpu 7 cold: high 62, batch 15 used:8
Apr 23 10:40:37 ussdsmsc02 kernel: Free pages: 11464976kB (11457272kB HighMem)
Apr 23 10:40:37 ussdsmsc02 kernel: Active:21985 inactive:22081 dirty:1 writeback:0 unstable:0 free:2866244 slab:198826 mapped-file:5630 mapped-anon:15362 pagetables:919
Apr 23 10:40:37 ussdsmsc02 kernel: DMA free:3588kB min:68kB low:84kB high:100kB active:0kB inactive:0kB present:16384kB pages_scanned:0 all_unreclaimable? yes
Apr 23 10:40:37 ussdsmsc02 kernel: lowmem_reserve[]: 0 0 880 12655
Apr 23 10:40:37 ussdsmsc02 kernel: DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Apr 23 10:40:37 ussdsmsc02 kernel: lowmem_reserve[]: 0 0 880 12655
Apr 23 10:40:37 ussdsmsc02 kernel: Normal free:4116kB min:3756kB low:4692kB high:5632kB active:200kB inactive:288kB present:901120kB pages_scanned:7985 all_unreclaimable? yes
Apr 23 10:40:37 ussdsmsc02 kernel: lowmem_reserve[]: 0 0 0 94207
Apr 23 10:40:37 ussdsmsc02 kernel: HighMem free:11457272kB min:512kB low:13088kB high:25664kB active:87740kB inactive:88036kB present:12058620kB pages_scanned:0 all_unreclaimable? no
Apr 23 10:40:37 ussdsmsc02 kernel: lowmem_reserve[]: 0 0 0 0
Apr 23 10:40:37 ussdsmsc02 kernel: DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3588kB
Apr 23 10:40:37 ussdsmsc02 kernel: DMA32: empty
Apr 23 10:40:37 ussdsmsc02 kernel: Normal: 559*4kB 1*8kB 1*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 4116kB
Apr 23 10:40:37 ussdsmsc02 kernel: HighMem: 2810*4kB 1684*8kB 1081*16kB 511*32kB 324*64kB 166*128kB 101*256kB 67*512kB 44*1024kB 22*2048kB 2736*4096kB = 11457272kB
Apr 23 10:40:37 ussdsmsc02 kernel: 28653 pagecache pages
Apr 23 10:40:46 ussdsmsc02 fenced[3401]: cluster is down, exiting
Apr 23 10:40:46 ussdsmsc02 gfs_controld[3413]: cluster is down, exiting
Apr 23 10:40:46 ussdsmsc02 dlm_controld[3407]: cluster is down, exiting
Apr 23 10:40:50 ussdsmsc02 kernel: Swap cache: add 0, delete 0, find 0/0, race 0+0
Apr 23 10:41:04 ussdsmsc02 kernel: Free swap = 12586888kB
Apr 23 10:41:04 ussdsmsc02 kernel: Total swap = 12586888kB
Apr 23 10:41:04 ussdsmsc02 kernel: Free swap: 12586888kB
Apr 23 10:41:04 ussdsmsc02 kernel: 3244031 pages of RAM
Apr 23 10:41:05 ussdsmsc02 kernel: 3014655 pages of HIGHMEM
Apr 23 10:41:05 ussdsmsc02 kernel: 127497 reserved pages
Apr 23 10:41:05 ussdsmsc02 kernel: 19665 pages shared
Apr 23 10:41:05 ussdsmsc02 kernel: 0 pages swap cached
Apr 23 10:41:05 ussdsmsc02 kernel: 1 pages dirty
Apr 23 10:41:05 ussdsmsc02 kernel: 0 pages writeback
Apr 23 10:41:05 ussdsmsc02 kernel: 5630 pages mapped
Apr 23 10:41:05 ussdsmsc02 kernel: 198826 pages slab
Apr 23 10:41:05 ussdsmsc02 kernel: 919 pages pagetables
Apr 23 10:41:05 ussdsmsc02 kernel: Out of memory: Killed process 4402 (xfs).
Apr 23 10:41:05 ussdsmsc02 kernel: mtrr: type mismatch for f0000000,4000000 old: uncachable new: write-combining
Apr 23 10:41:05 ussdsmsc02 kernel: dlm_send invoked oom-killer: gfp_mask=0xd0, order=1, oomkilladj=0
Apr 23 10:41:05 ussdsmsc02 kernel: [<c0457a41>] out_of_memory+0x72/0x1a5
Apr 23 10:41:05 ussdsmsc02 kernel: [<c0458f16>] __alloc_pages+0x216/0x297
Apr 23 10:41:05 ussdsmsc02 kernel: [<c046da90>] cache_alloc_refill+0x26d/0x450
Apr 23 10:41:05 ussdsmsc02 kernel: [<c046d819>] kmem_cache_alloc+0x41/0x4b
Apr 23 10:41:05 ussdsmsc02 kernel: [<c05ace4a>] sk_alloc+0x1e/0xdd
Apr 23 10:41:05 ussdsmsc02 kernel: [<c05eead4>] inet_create+0x10c/0x228
Apr 23 10:41:05 ussdsmsc02 kernel: [<c05ab122>] __sock_create+0x133/0x213
Apr 23 10:41:05 ussdsmsc02 kernel: [<c05ab20d>] sock_create_kern+0xb/0xe
Apr 23 10:41:05 ussdsmsc02 kernel: [<f959cd31>] tcp_connect_to_sock+0x73/0x1e9 [dlm]
Apr 23 10:41:05 ussdsmsc02 kernel: [<c041f0e0>] rebalance_tick+0x11f/0x2e4
Apr 23 10:41:05 ussdsmsc02 kernel: [<f959d20d>] process_send_sockets+0x1a/0x14b [dlm]
Apr 23 10:41:05 ussdsmsc02 kernel: [<c0431cae>] run_workqueue+0x78/0xb5
Apr 23 10:41:05 ussdsmsc02 kernel: [<f959d1f3>] process_send_sockets+0x0/0x14b [dlm]
Apr 23 10:41:05 ussdsmsc02 kernel: [<c0432562>] worker_thread+0xd9/0x10b
Apr 23 10:41:05 ussdsmsc02 kernel: [<c041e811>] default_wake_function+0x0/0xc
Apr 23 10:41:05 ussdsmsc02 kernel: [<c0432489>] worker_thread+0x0/0x10b
Apr 23 10:41:05 ussdsmsc02 kernel: [<c0434971>] kthread+0xc0/0xeb
Apr 23 10:41:05 ussdsmsc02 kernel: [<c04348b1>] kthread+0x0/0xeb
Apr 23 10:41:05 ussdsmsc02 kernel: [<c0405c53>] kernel_thread_helper+0x7/0x10
Apr 23 10:41:05 ussdsmsc02 kernel: =======================
Apr 23 10:41:05 ussdsmsc02 kernel: Mem-info:
Apr 23 10:41:05 ussdsmsc02 kernel: DMA per-cpu:
Apr 23 10:41:05 ussdsmsc02 kernel: cpu 0 hot: high 0, batch 1 used:0
Apr 23 10:41:05 ussdsmsc02 kernel: cpu 0 cold: high 0, batch 1 used:0
Apr 23 10:41:05 ussdsmsc02 kernel: cpu 1 hot: high 0, batch 1 used:0
Apr 23 10:41:05 ussdsmsc02 kernel: cpu 1 cold: high 0, batch 1 used:0
Apr 23 10:41:05 ussdsmsc02 kernel: cpu 2 hot: high 0, batch 1 used:0
Apr 23 10:41:05 ussdsmsc02 kernel: cpu 2 cold: high 0, batch 1 used:0
Apr 23 10:41:05 ussdsmsc02 kernel: cpu 3 hot: high 0, batch 1 used:0
Apr 23 10:41:05 ussdsmsc02 kernel: cpu 3 cold: high 0, batch 1 used:0
Apr 23 10:41:05 ussdsmsc02 kernel: cpu 4 hot: high 0, batch 1 used:0
Apr 23 10:41:05 ussdsmsc02 kernel: cpu 4 cold: high 0, batch 1 used:0
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 5 hot: high 0, batch 1 used:0
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 5 cold: high 0, batch 1 used:0
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 6 hot: high 0, batch 1 used:0
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 6 cold: high 0, batch 1 used:0
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 7 hot: high 0, batch 1 used:0
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 7 cold: high 0, batch 1 used:0
Apr 23 10:41:06 ussdsmsc02 kernel: DMA32 per-cpu: empty
Apr 23 10:41:06 ussdsmsc02 kernel: Normal per-cpu:
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 0 hot: high 186, batch 31 used:173
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 0 cold: high 62, batch 15 used:60
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 1 hot: high 186, batch 31 used:25
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 1 cold: high 62, batch 15 used:59
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 2 hot: high 186, batch 31 used:114
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 2 cold: high 62, batch 15 used:54
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 3 hot: high 186, batch 31 used:0
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 3 cold: high 62, batch 15 used:55
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 4 hot: high 186, batch 31 used:6
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 4 cold: high 62, batch 15 used:60
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 5 hot: high 186, batch 31 used:19
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 5 cold: high 62, batch 15 used:47
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 6 hot: high 186, batch 31 used:10
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 6 cold: high 62, batch 15 used:50
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 7 hot: high 186, batch 31 used:32
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 7 cold: high 62, batch 15 used:57
Apr 23 10:41:06 ussdsmsc02 kernel: HighMem per-cpu:
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 0 hot: high 186, batch 31 used:172
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 0 cold: high 62, batch 15 used:5
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 1 hot: high 186, batch 31 used:112
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 1 cold: high 62, batch 15 used:7
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 2 hot: high 186, batch 31 used:159
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 2 cold: high 62, batch 15 used:2
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 3 hot: high 186, batch 31 used:116
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 3 cold: high 62, batch 15 used:1
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 4 hot: high 186, batch 31 used:82
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 4 cold: high 62, batch 15 used:6
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 5 hot: high 186, batch 31 used:156
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 5 cold: high 62, batch 15 used:6
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 6 hot: high 186, batch 31 used:156
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 6 cold: high 62, batch 15 used:11
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 7 hot: high 186, batch 31 used:82
Apr 23 10:41:06 ussdsmsc02 kernel: cpu 7 cold: high 62, batch 15 used:0
Apr 23 10:41:06 ussdsmsc02 kernel: Free pages: 11461904kB (11454348kB HighMem)
Apr 23 10:41:06 ussdsmsc02 kernel: Active:22050 inactive:22666 dirty:55 writeback:0 unstable:0 free:2865476 slab:198857 mapped-file:5942 mapped-anon:15373 pagetables:946
Apr 23 10:41:06 ussdsmsc02 kernel: DMA free:3588kB min:68kB low:84kB high:100kB active:0kB inactive:0kB present:16384kB pages_scanned:0 all_unreclaimable? yes
Apr 23 10:41:06 ussdsmsc02 kernel: lowmem_reserve[]: 0 0 880 12655
Apr 23 10:41:06 ussdsmsc02 kernel: DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Apr 23 10:41:07 ussdsmsc02 kernel: lowmem_reserve[]: 0 0 880 12655
Apr 23 10:41:07 ussdsmsc02 kernel: Normal free:3968kB min:3756kB low:4692kB high:5632kB active:472kB inactive:4kB present:901120kB pages_scanned:846 all_unreclaimable? yes
Apr 23 10:41:07 ussdsmsc02 kernel: lowmem_reserve[]: 0 0 0 94207
Apr 23 10:41:07 ussdsmsc02 kernel: HighMem free:11454348kB min:512kB low:13088kB high:25664kB active:87728kB inactive:90660kB present:12058620kB pages_scanned:0 all_unreclaimable? no
Apr 23 10:41:07 ussdsmsc02 kernel: lowmem_reserve[]: 0 0 0 0
Apr 23 10:41:07 ussdsmsc02 kernel: DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3588kB
Apr 23 10:41:07 ussdsmsc02 kernel: DMA32: empty
Apr 23 10:41:07 ussdsmsc02 kernel: Normal: 522*4kB 1*8kB 1*16kB 0*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 3968kB
Apr 23 10:41:07 ussdsmsc02 kernel: HighMem: 1879*4kB 1708*8kB 1083*16kB 513*32kB 324*64kB 168*128kB 100*256kB 68*512kB 44*1024kB 22*2048kB 2736*4096kB = 11454348kB
Apr 23 10:41:07 ussdsmsc02 kernel: 29302 pagecache pages
Apr 23 10:41:07 ussdsmsc02 kernel: Swap cache: add 0, delete 0, find 0/0, race 0+0
Apr 23 10:41:07 ussdsmsc02 kernel: Free swap = 12586888kB
Apr 23 10:41:07 ussdsmsc02 kernel: Total swap = 12586888kB
Apr 23 10:41:07 ussdsmsc02 kernel: Free swap: 12586888kB
Broadcast message from root (Fri Apr 23 10:41:07 2010):
The system is going down for system halt NOW!
Apr 23 10:41:07 ussdsmsc02 kernel: 3244031 pages of RAM
Apr 23 10:41:07 ussdsmsc02 kernel: 3014655 pages of HIGHMEM
Apr 23 10:41:07 ussdsmsc02 kernel: 127497 reserved pages
Apr 23 10:41:07 ussdsmsc02 kernel: 20124 pages shared
Apr 23 10:41:07 ussdsmsc02 kernel: 0 pages swap cached
Apr 23 10:41:07 ussdsmsc02 kernel: 55 pages dirty
Apr 23 10:41:07 ussdsmsc02 kernel: 0 pages writeback
Apr 23 10:41:07 ussdsmsc02 kernel: 5942 pages mapped
Apr 23 10:41:07 ussdsmsc02 kernel: 198857 pages slab
Apr 23 10:41:07 ussdsmsc02 kernel: 946 pages pagetables
Apr 23 10:41:07 ussdsmsc02 shutdown[12299]: shutting down for system halt
Apr 23 10:41:07 ussdsmsc02 kernel: Out of memory: Killed process 12202 (Xorg).
Apr 23 10:41:07 ussdsmsc02 kernel: Xorg: page allocation failure. order:1, mode:0xd0
Apr 23 10:41:07 ussdsmsc02 kernel: [<c0458f83>] __alloc_pages+0x283/0x297
Apr 23 10:41:07 ussdsmsc02 kernel: [<c046da90>] cache_alloc_refill+0x26d/0x450
Apr 23 10:41:07 ussdsmsc02 kernel: [<c046d819>] kmem_cache_alloc+0x41/0x4b
Apr 23 10:41:07 ussdsmsc02 kernel: [<c0403d7e>] copy_thread+0x84/0x1e6
Apr 23 10:41:07 ussdsmsc02 kernel: [<c042312e>] copy_process+0xc37/0x1200
Apr 23 10:41:07 ussdsmsc02 kernel: [<c0423945>] do_fork+0x41/0x168
Apr 23 10:41:07 ussdsmsc02 kernel: [<c040318b>] sys_clone+0x28/0x2d
Apr 23 10:41:08 ussdsmsc02 kernel: [<c0404f17>] syscall_call+0x7/0xb
Apr 23 10:41:08 ussdsmsc02 kernel: =======================
Thanks in advance,
Rgds, CJ |
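For anyone reading a dump like the one above, the per-zone summary lines are usually the quickest thing to check. The following is a minimal sketch, not a diagnostic tool: it pulls the `free` and `low` values out of the zone lines and flags any zone whose free memory is below its low watermark. The function name `zone_report` and the constant `SAMPLE` are made up for illustration; the sample lines are copied from the second Mem-info dump in the post.

```python
import re

# Matches zone summary lines such as:
#   "... kernel: Normal free:3968kB min:3756kB low:4692kB high:5632kB ..."
# DMA32 is listed before DMA so the longer name is tried first.
ZONE_RE = re.compile(
    r"kernel: (?P<zone>DMA32|DMA|Normal|HighMem) "
    r"free:(?P<free>\d+)kB min:(?P<min>\d+)kB "
    r"low:(?P<low>\d+)kB high:(?P<high>\d+)kB"
)


def zone_report(log_text):
    """Return (zone, free_kB, low_kB, below_low) for every zone line found."""
    rows = []
    for m in ZONE_RE.finditer(log_text):
        free = int(m.group("free"))
        low = int(m.group("low"))
        rows.append((m.group("zone"), free, low, free < low))
    return rows


# Zone lines copied from the dump posted above (second Mem-info block).
SAMPLE = "\n".join([
    "Apr 23 10:41:06 ussdsmsc02 kernel: DMA free:3588kB min:68kB low:84kB high:100kB active:0kB inactive:0kB present:16384kB pages_scanned:0 all_unreclaimable? yes",
    "Apr 23 10:41:06 ussdsmsc02 kernel: DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no",
    "Apr 23 10:41:07 ussdsmsc02 kernel: Normal free:3968kB min:3756kB low:4692kB high:5632kB active:472kB inactive:4kB present:901120kB pages_scanned:846 all_unreclaimable? yes",
    "Apr 23 10:41:07 ussdsmsc02 kernel: HighMem free:11454348kB min:512kB low:13088kB high:25664kB active:87728kB inactive:90660kB present:12058620kB pages_scanned:0 all_unreclaimable? no",
])

if __name__ == "__main__":
    for zone, free, low, below in zone_report(SAMPLE):
        flag = "BELOW low watermark" if below else "ok"
        print(f"{zone:8s} free={free}kB low={low}kB {flag}")
```

On the lines from this post, only the Normal zone is flagged, while HighMem still has about 11 GB free. That would be consistent with a shortage of low memory rather than of total RAM, but that reading is an assumption from the log alone, not a confirmed diagnosis.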
:study: |
I've seen the OOM killer get called on Oracle servers before. But the best thing you can do is to CALL REDHAT: since you've got RHEL, you're paying for support and access to their knowledgebase. |
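To see at a glance which processes were killed and which task triggered the OOM killer, the relevant lines can be filtered out of the log. This is a rough sketch under the assumption that the messages have the same wording as in the post above (a RHEL 5 era kernel); the function name `oom_events` and the constant `SAMPLE_LOG` are hypothetical, and the sample lines are copied from the posted log.

```python
import re

# "Out of memory: Killed process 4402 (xfs)."
KILLED_RE = re.compile(r"Out of memory: Killed process (\d+) \(([^)]+)\)")
# "dlm_send invoked oom-killer: gfp_mask=0xd0, order=1, oomkilladj=0"
INVOKED_RE = re.compile(
    r"kernel: (\S+) invoked oom-killer: gfp_mask=(0x[0-9a-f]+), order=(\d+)"
)


def oom_events(log_text):
    """Return (killed, invoked): killed is a list of (pid, name),
    invoked is a list of (task, gfp_mask, order)."""
    killed = [(int(pid), name) for pid, name in KILLED_RE.findall(log_text)]
    invoked = [
        (task, mask, int(order))
        for task, mask, order in INVOKED_RE.findall(log_text)
    ]
    return killed, invoked


# Lines copied from the log in the first post.
SAMPLE_LOG = "\n".join([
    "Apr 23 10:41:05 ussdsmsc02 kernel: Out of memory: Killed process 4402 (xfs).",
    "Apr 23 10:41:05 ussdsmsc02 kernel: dlm_send invoked oom-killer: gfp_mask=0xd0, order=1, oomkilladj=0",
    "Apr 23 10:41:07 ussdsmsc02 kernel: Out of memory: Killed process 12202 (Xorg).",
])

if __name__ == "__main__":
    killed, invoked = oom_events(SAMPLE_LOG)
    for pid, name in killed:
        print(f"killed: pid={pid} name={name}")
    for task, mask, order in invoked:
        print(f"invoked by: {task} gfp_mask={mask} order={order}")
```

In practice the same function could be fed the full contents of /var/log/messages, which makes it easier to hand a short summary to support along with the raw log.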
Thanks rayfordj and TB0ne. TB0ne, I will also log a case with Red Hat on this issue. This server is not in production yet; it is still in the initial installation and configuration stage, so I posted my query here from a first-aid perspective. rayfordj, this server has 12 GB of memory and I have assigned 12 GB of swap as well. It has an Intel quad-core CPU. As I mentioned above, the server is in its initial stage: I have only configured the RHEL Cluster Suite and assigned only a VIP as a resource for testing purposes, with no mount point or application script. So why does this server hang when I have not put much load on it? Please advice...
Thanks, Rgds, CJ |
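The page counts in the posted dump can be turned into sizes to put the 12 GB figure in context. The sketch below is plain arithmetic on numbers copied from the log; it assumes 4 kB pages (which the `N*4kB` buddy lists in the dump suggest), and the helper names `pages_to_mib` and `lowmem_summary` are invented for this example.

```python
PAGE_KB = 4  # assumption: 4 kB pages, as the "N*4kB" lists in the dump suggest


def pages_to_mib(pages, page_kb=PAGE_KB):
    """Convert a page count to MiB."""
    return pages * page_kb / 1024


def lowmem_summary(ram_pages, highmem_pages, slab_pages, normal_present_kb):
    """Summarise low memory and slab usage from values printed in the dump."""
    lowmem_mib = pages_to_mib(ram_pages - highmem_pages)
    slab_mib = pages_to_mib(slab_pages)
    normal_mib = normal_present_kb / 1024
    return {
        "total_ram_mib": pages_to_mib(ram_pages),
        "highmem_mib": pages_to_mib(highmem_pages),
        "lowmem_mib": lowmem_mib,
        "slab_mib": slab_mib,
        "normal_zone_mib": normal_mib,
        "slab_share_of_normal": slab_mib / normal_mib,
    }


if __name__ == "__main__":
    # Values copied from the second dump in the log:
    # "3244031 pages of RAM", "3014655 pages of HIGHMEM",
    # "198857 pages slab", "Normal ... present:901120kB"
    summary = lowmem_summary(3244031, 3014655, 198857, 901120)
    for key, value in summary.items():
        print(f"{key}: {value:.2f}")
```

With these inputs, everything outside HighMem comes to 896 MiB, and the slab figure of roughly 777 MiB is close to the 880 MiB present in the Normal zone. On a 32-bit kernel slab memory is allocated from low memory, so this may explain why the box runs out of memory with swap untouched, but what is filling the slab cannot be told from these numbers; that would need something like /proc/slabinfo from the live system.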
You still don't say what this server is doing, what you are loading on it, or provide many details. If it was me, I'd reload the OS, and drop swap to be about 4GB...but again, we can "advice" you to call RedHat. THEY will be able to look through system dumps, logs, etc., and tell you what's going on. |
Thanks, I will do the same...