LinuxQuestions.org

nike.stars 03-24-2010 09:20 AM

Bad pte problem - causing server crash
 
I'm running a CentOS OpenVZ server that crashes after a few hours. I checked /var/log/messages and found some anomalies like this:

Quote:

================================================================
Mar 24 02:01:10 usanode1 kernel: swap_dup: Bad swap file entry b9190673
Mar 24 02:01:10 usanode1 kernel: swap_dup: Bad swap file entry b9190673
Mar 24 02:01:10 usanode1 kernel: Bad pte = 9dc7d00ceab7c25b, process = ???, vm_flags = 100173, vaddr = bfcd3000
Mar 24 02:01:10 usanode1 kernel: [] vm_normal_page+0x7d/0x96
Mar 24 02:01:10 usanode1 kernel: [] unmap_vmas+0x241/0x681
Mar 24 02:01:10 usanode1 kernel: [] exit_mmap+0x77/0xf1
Mar 24 02:01:10 usanode1 kernel: [] mmput+0x25/0x8e
Mar 24 02:01:10 usanode1 kernel: [] do_exit+0x573/0xb86
Mar 24 02:01:10 usanode1 kernel: [] printk+0x18/0x8e
Mar 24 02:01:10 usanode1 kernel: [] ub_slab_uncharge+0x82/0x8c
Mar 24 02:01:10 usanode1 kernel: [] __dequeue_signal+0x111/0x15f
Mar 24 02:01:10 usanode1 kernel: [] sys_exit_group+0x0/0xd
Mar 24 02:01:10 usanode1 kernel: [] do_page_fault+0x0/0x5e1
Mar 24 02:01:10 usanode1 kernel: [] get_signal_to_deliver+0x38d/0x3b4
Mar 24 02:01:10 usanode1 kernel: [] do_page_fault+0x0/0x5e1
Mar 24 02:01:10 usanode1 kernel: [] do_notify_resume+0xa9/0x6a5
Mar 24 02:01:10 usanode1 kernel: [] ub_siginfo_charge+0x8f/0xec
Mar 24 02:01:10 usanode1 kernel: [] signal_wake_up+0x1e/0x2c
Mar 24 02:01:10 usanode1 kernel: [] specific_send_sig_info+0x90/0x9a
Mar 24 02:01:10 usanode1 kernel: [] force_sig_info+0x7e/0x86
Mar 24 02:01:10 usanode1 kernel: [] do_page_fault+0x55d/0x5e1
Mar 24 02:01:10 usanode1 kernel: [] audit_syscall_entry+0x160/0x192
Mar 24 02:01:10 usanode1 kernel: [] do_page_fault+0x0/0x5e1
Mar 24 02:01:10 usanode1 kernel: [] work_notifysig+0x13/0x19
Mar 24 02:01:10 usanode1 kernel: [] __register_kprobe+0x260/0x270
Mar 24 02:01:10 usanode1 kernel: =======================

=======================================================================

Mar 24 04:35:01 usanode1 kernel: Bad pte = fe08bd805874c085, process = ???, vm_flags = 100173, vaddr = bfc7aacd
Mar 24 04:35:01 usanode1 kernel: [] vm_normal_page+0x7d/0x96
Mar 24 04:35:01 usanode1 kernel: [] follow_page+0x160/0x24b
Mar 24 04:35:01 usanode1 kernel: [] get_user_pages+0x2b2/0x344
Mar 24 04:35:01 usanode1 kernel: [] access_process_vm+0x65/0x119
Mar 24 04:35:01 usanode1 kernel: [] proc_pid_cmdline+0x4e/0xd6
Mar 24 04:35:01 usanode1 kernel: [] proc_info_read+0x49/0x96
Mar 24 04:35:01 usanode1 kernel: [] proc_info_read+0x0/0x96
Mar 24 04:35:01 usanode1 kernel: [] vfs_read+0x81/0x123
Mar 24 04:35:01 usanode1 kernel: [] sys_read+0x3c/0x9e
Mar 24 04:35:01 usanode1 kernel: [] syscall_call+0x7/0xb
Mar 24 04:35:01 usanode1 kernel: [] debug_esp_fix_insn+0x2/0xf
Mar 24 04:35:01 usanode1 kernel: =======================
Mar 24 04:35:04 usanode1 kernel: Bad pte = fe08bd805874c085, process = ???, vm_flags = 100173, vaddr = bfc7aacd
Mar 24 04:35:04 usanode1 kernel: [] vm_normal_page+0x7d/0x96
Mar 24 04:35:04 usanode1 kernel: [] follow_page+0x160/0x24b
Mar 24 04:35:04 usanode1 kernel: [] get_user_pages+0x2b2/0x344
Mar 24 04:35:04 usanode1 kernel: [] access_process_vm+0x65/0x119
Mar 24 04:35:04 usanode1 kernel: [] proc_pid_cmdline+0x4e/0xd6
Mar 24 04:35:04 usanode1 kernel: [] proc_info_read+0x49/0x96
Mar 24 04:35:04 usanode1 kernel: [] proc_info_read+0x0/0x96
Mar 24 04:35:04 usanode1 kernel: [] vfs_read+0x81/0x123
Mar 24 04:35:04 usanode1 kernel: [] sys_read+0x3c/0x9e
Mar 24 04:35:04 usanode1 kernel: [] syscall_call+0x7/0xb
Mar 24 04:35:04 usanode1 kernel: [] debug_esp_fix_insn+0x2/0xf
================================================================
I suspect this may be related to a memory problem, but I wanted to check whether anyone has had a similar experience.

my current kernel: 2.6.18-164.11.1.el5.028stab068.5PAE #1 SMP Mon Mar 15 19:32:34 MSK 2010 i686 i686 i386 GNU/Linux
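
For anyone who wants to pull the same kind of entries out of their own log, here is a rough sketch that counts how often each distinct "Bad pte" value was logged. The function name bad_pte_summary is made up for illustration, and it assumes the lines look like the ones quoted above ("Bad pte = <hex>, process = ...").

```shell
# Sketch only: summarise "Bad pte" kernel messages in a syslog-style file.
# Assumes the message format quoted in this thread; the function name is hypothetical.
bad_pte_summary() {
    # $1 is the log file to scan, e.g. /var/log/messages
    awk '/Bad pte = / {
             sub(/.*Bad pte = /, "")   # drop everything up to the hex value
             sub(/,.*/, "")            # drop everything after it
             count[$0]++
         }
         END { for (p in count) print count[p], p }' "$1" | sort
}
```

Calling bad_pte_summary /var/log/messages prints one line per distinct pte value with its count in front. The same value repeating suggests one bad mapping being hit over and over, as with the two 04:35 entries above, while many different values point at wider corruption.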

business_kid 03-25-2010 06:07 AM

http://lists.us.dell.com/pipermail/l...ne/014879.html
That guy seems pretty confident.
I would try a boot option of 'noapic' to see if it helps things. By all means run swapoff, mkswap, and swapon with your swap partition as an argument, and run memtest. Report back.
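
The swap part could be sketched roughly like this. The function name recreate_swap and the dry-run default are only for illustration, and /dev/sdXN in the usage note is a placeholder: check /proc/swaps or `swapon -s` for the real device. By default it only prints the commands, because mkswap wipes whatever is on the device it is given. Set RUN=1 and run as root to actually execute them.

```shell
# Sketch only: turn swap off, re-create the swap area, turn it back on.
# Prints the commands unless RUN=1 is set. The function name is hypothetical.
recreate_swap() {
    dev="$1"
    if [ -z "$dev" ]; then
        echo "usage: recreate_swap <swap-device>" >&2
        return 1
    fi
    for cmd in "swapoff $dev" "mkswap $dev" "swapon $dev"; do
        if [ "${RUN:-0}" = "1" ]; then
            $cmd || return 1      # stop at the first command that fails
        else
            echo "$cmd"           # dry run: just show what would be run
        fi
    done
}
```

For example, recreate_swap /dev/sdXN prints the three commands in order. Note that mkswap writes a new swap signature, so if /etc/fstab refers to the swap area by label or UUID, that entry will probably need updating afterwards.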


All times are GMT -5.