LinuxQuestions.org


st3venb 07-28-2010 04:39 PM

Kernel panic on rhel4 system.
 
I've been trying to track this kernel panic down for the last few days on this system... I believe it might be autofs/nfs related, but I'm not 100% sure.

Here is the dump in /var/log/messages:
Jul 28 13:25:13 server-a kernel: bad: scheduling while atomic!
Jul 28 13:25:13 server-a kernel:
Jul 28 13:25:13 server-a kernel: Call Trace:<ffffffff803095d1>{schedule+75} <ffffffffa01b7642>{:sunrpc:rpc_wake_up_next+350}
Jul 28 13:25:13 server-a kernel: <ffffffffa01b4a2a>{:sunrpc:__xprt_lock_write_next+70}
Jul 28 13:25:13 server-a kernel: <ffffffffa01b61ee>{:sunrpc:xprt_transmit+1107} <ffffffffa01b7da2>{:sunrpc:__rpc_execute+462}
Jul 28 13:25:13 server-a kernel: <ffffffff80135752>{autoremove_wake_function+0} <ffffffff80135752>{autoremove_wake_function+0}
Jul 28 13:25:13 server-a kernel: <ffffffffa01b379c>{:sunrpc:rpc_call_sync+114} <ffffffffa02033c2>{:nfs:nfs3_rpc_wrapper+38}
Jul 28 13:25:13 server-a kernel: <ffffffffa020361a>{:nfs:nfs3_proc_getattr+138} <ffffffffa01fb5bb>{:nfs:__nfs_revalidate_inode+320}
Jul 28 13:25:13 server-a kernel: <ffffffffa0200355>{:nfs:nfs_pagein_list+75} <ffffffff8030a0f5>{thread_return+0}
Jul 28 13:25:13 server-a kernel: <ffffffff8030a14d>{thread_return+88} <ffffffff801609f0>{read_pages+57}
Jul 28 13:25:13 server-a kernel: <ffffffffa01f75bf>{:nfs:nfs_lookup_revalidate+459}
Jul 28 13:25:13 server-a kernel: <ffffffff80132155>{recalc_task_prio+337} <ffffffff801321e3>{activate_task+124}
Jul 28 13:25:13 server-a kernel: <ffffffff8013271e>{try_to_wake_up+876} <ffffffffa01b8da3>{:sunrpc:rpcauth_lookup_credcache+566}
Jul 28 13:25:13 server-a kernel: <ffffffff80157db7>{audit_update_watch+85} <ffffffff8019012a>{__d_lookup+287}
Jul 28 13:25:13 server-a kernel: <ffffffff80185d2a>{do_lookup+388} <ffffffff801868a2>{__link_path_walk+2508}
Jul 28 13:25:13 server-a kernel: <ffffffff80186d62>{link_path_walk+82} <ffffffff801ece75>{strncpy_from_user+74}
Jul 28 13:25:13 server-a kernel: <ffffffff8015702f>{audit_getname+133} <ffffffff80186faf>{path_lookup+451}
Jul 28 13:25:13 server-a kernel: <ffffffff8018788f>{open_namei+172} <ffffffff80178b6e>{filp_open+80}
Jul 28 13:25:13 server-a kernel: <ffffffff801ece75>{strncpy_from_user+74} <ffffffff8015702f>{audit_getname+133}
Jul 28 13:25:13 server-a kernel: <ffffffff80178d77>{sys_open+57} <ffffffff801103ce>{tracesys+209}
Jul 28 13:25:13 server-a kernel:
Jul 28 13:25:13 server-a kernel: Unable to handle kernel paging request at 00000003a0b8cd68 RIP:
Jul 28 13:25:14 server-a kernel: <ffffffff80309e01>{schedule+2171}
Jul 28 13:25:14 server-a kernel: PML4 4018ef067 PGD 0
Jul 28 13:25:14 server-a kernel: Oops: 0002 [1] SMP

I've been Googling the hell out of the log and just can't turn up anything of use... It seems to happen only when we're using the box to rsync data from one NFS mount point to another, and even then it's very random.

Wondering if anyone can point me in a direction that might resolve this issue before I take this server out to the desert. :(

AlucardZero 07-28-2010 05:23 PM

What kernel are you running? Update it to the latest RHEL 4 kernel. Also update any nfs and autofs (automount) packages you have installed. Oh and rsync too for the heck of it.

st3venb 07-28-2010 06:36 PM

Unfortunately I don't have that path to travel on... I either have to fix the existing system or sit and wait for a new system to be built to replace it. (That's currently in the works, but it's probably a few weeks to a month away.)

The problem is, I'll have to wait a while to get that done... and in the meantime I need to use some tools that rely heavily on NFS working on this server.

Quote:

Originally Posted by AlucardZero (Post 4048400)
What kernel are you running? Update it to the latest RHEL 4 kernel. Also update any nfs and autofs (automount) packages you have installed. Oh and rsync too for the heck of it.


AlucardZero 07-28-2010 07:20 PM

How is updating the kernel not a fix to the existing system?

Here: Update the kernel. Also, install, configure, and enable diskdump so you can get more information out of crashes. Don't reboot; wait for it to crash. It will boot into the new kernel. If it crashes again you'll have more debugging information.
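
While waiting for the next crash, one way to check the autofs/NFS suspicion is to tally which modules the call trace frames belong to. The following is only a minimal Python sketch, not something posted in the thread: it assumes the RHEL 4 style frame format seen in the dump above (`<address>{:module:symbol+offset}`, with the module part missing for core kernel symbols), and the names `FRAME_RE`, `parse_frames`, `frames_per_module` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# One frame looks like <ffffffffa01b7642>{:sunrpc:rpc_wake_up_next+350}
# or <ffffffff803095d1>{schedule+75}; the ":module:" prefix is optional.
# This pattern is an assumption based on the log lines in this thread.
FRAME_RE = re.compile(r"<([0-9a-f]{16})>\{(?::(\w+):)?(\w+)\+(\d+)\}")


def parse_frames(lines):
    """Return (module, symbol) pairs in the order they appear in the log."""
    frames = []
    for line in lines:
        for _addr, module, symbol, _offset in FRAME_RE.findall(line):
            # Frames without a module prefix are labelled "kernel" here.
            frames.append((module or "kernel", symbol))
    return frames


def frames_per_module(lines):
    """Count how many trace frames fall in each module."""
    return Counter(module for module, _symbol in parse_frames(lines))


def summarize(path="/var/log/messages"):
    """Print a per-module frame count for every trace frame in the file."""
    with open(path) as log:
        for module, count in frames_per_module(log).most_common():
            print(f"{module:10s} {count}")
```

Calling `summarize()` on the box (or on a copy of the log) gives a rough picture of how much of the trace sits in `sunrpc` and `nfs` versus the core kernel. It counts every frame in the file, so for a single crash it is better to feed it only the lines from that one trace.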

