LinuxQuestions.org


JBJ1962 12-27-2012 08:51 PM

CENTOS Crashes after around ten minutes everytime
 
I administer a workstation with 99 GB of RAM, and it crashes every time I turn it on, even if I don't do anything. I log in to CentOS and work normally for around ten minutes, and then the screen freezes exactly as it is. It just hangs, and I have to hold the power button for 10 seconds to force a shutdown. Each time it happens, nothing in top points to a shortage of memory or anything similar.

Here are my messages log and my error log:

Messages LOG
Dec 28 08:31:10 Workstation-B rtkit-daemon[2978]: Sucessfully made thread 2984 of process 2976 (/usr/bin/pulseaudio) owned by '42' RT at priority 5.
Dec 28 08:31:10 Workstation-B gdm-simple-greeter[2964]: Gtk-WARNING: gtkwidget.c:5460: widget not within a GtkWindow
Dec 28 08:31:10 Workstation-B kernel: hda-intel: IRQ timing workaround is activated for card #0. Suggest a bigger bdl_pos_adj.

(Here I turned on the machine and logged in to CentOS.)
Dec 28 09:37:19 Workstation-B kernel: type=1400 audit(1356658639.713:4): avc: denied { read } for pid=2991 comm="gdm-session-wor" name="root" dev=dm-0 ino=131073 scontext=system_u:system_r:xdm_t:s0-s0:c0.c1023 tcontext=system_u:object_r:admin_home_t:s0 tclass=dir
Dec 28 09:37:20 Workstation-B seahorse-daemon[3088]: DNS-SD initialization failed: Daemon not running
Dec 28 09:37:20 Workstation-B kernel: fuse init (API version 7.13)
Dec 28 09:37:20 Workstation-B seahorse-daemon[3088]: init gpgme version 1.1.8
Dec 28 09:37:21 Workstation-B pulseaudio[3150]: pid.c: Stale PID file, overwriting.
Dec 28 09:47:04 Workstation-B kernel: radeon 0000:03:00.0: IH ring buffer overflow (0x00000031, 0, 65600)
(**** The lock happens somewhere around here! ****)

ERROR LOG

Dec 27 16:40:03 Workstation-B kernel: ctxfi: PLL initialization failed!!!
Dec 27 16:40:03 Workstation-B kernel: ctxfi: Preparing pcm playback failed!!!
Dec 27 16:40:03 Workstation-B kernel: ctxfi: PLL initialization failed!!!
Dec 27 16:40:03 Workstation-B kernel: ctxfi: Preparing pcm playback failed!!!

Dec 27 18:31:17 Workstation-B abrtd: Init complete, entering main loop
Dec 27 18:31:24 Workstation-B libvirtd: Could not find keytab file: /etc/libvirt/krb5.tab: No such file or directory
Dec 27 17:20:13 Workstation-B rpcbind: rpcbind terminating on signal. Restart with "rpcbind -w"
Dec 28 09:36:52 Workstation-B automount[2351]: lookup_read_master: lookup(nisplus): couldn't locate nis+ table auto.master
Dec 28 09:36:52 Workstation-B automount[2351]: lookup_read_master: lookup(nisplus): couldn't locate nis+ table auto.master


:banghead:
Please help me solve this problem. I have already searched for it, but I couldn't find a solution! :(

I personally suspect this error: kernel: radeon 0000:03:00.0: IH ring buffer overflow (0x00000031, 0, 65600)
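
To check whether the suspected message shows up every time before a hang, the log can be filtered for it. Below is a rough Python sketch, not a definitive tool: the pattern list is only an assumption based on the excerpts above, and the names `SUSPECT_PATTERNS`, `LINE_RE` and `find_suspect_lines` are made up for illustration. It parses syslog-style lines and keeps the ones whose message matches a suspect pattern.

```python
import re

# Patterns taken from the log excerpts in this post. Which messages are worth
# flagging is an assumption, not an official list of fatal kernel errors.
SUSPECT_PATTERNS = [
    re.compile(r"IH ring buffer overflow"),
    re.compile(r"ctxfi: .* failed"),
]

# Matches syslog lines of the form "Dec 28 09:47:04 host source: message".
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<source>[^:]+):\s*(?P<message>.*)$"
)

def find_suspect_lines(lines):
    """Return (timestamp, source, message) for lines matching a suspect pattern."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a syslog-style line
        message = match.group("message")
        if any(p.search(message) for p in SUSPECT_PATTERNS):
            hits.append((match.group("stamp"), match.group("source"), message))
    return hits

sample = [
    "Dec 28 09:37:20 Workstation-B kernel: fuse init (API version 7.13)",
    "Dec 28 09:47:04 Workstation-B kernel: radeon 0000:03:00.0: "
    "IH ring buffer overflow (0x00000031, 0, 65600)",
    "Dec 27 16:40:03 Workstation-B kernel: ctxfi: PLL initialization failed!!!",
]

for stamp, source, message in find_suspect_lines(sample):
    print(stamp, source, message)
```

Run against the full messages file, the timestamps of the hits can be compared with the times of the hangs.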


!!!! ADDITION !!!!

Message from syslogd@Workstation-B at Dec 28 09:48:27 ...
kernel:BUG: soft lockup - CPU#8 stuck for 67s! [pulseaudio:3338]
Dec 28 09:48:27 Workstation-B kernel: Process pulseaudio (pid: 3338, threadinfo ffff8817cfc48000, task ffff8817e05c6ae0)
Dec 28 09:48:27 Workstation-B kernel: Stack:
Dec 28 09:48:27 Workstation-B kernel: 01ff8817cdae1400 0000000000000000 0000000000000000 000000000004406f
Dec 28 09:48:27 Workstation-B kernel: <d> 0000000000000000 00000000ffffffff 0000000000000000 0000000000000000
Dec 28 09:48:27 Workstation-B kernel: <d> ffff8817cfc49be8 ffffffff810a73c2 ffff8817cfc49bf8 ffffffff813f665c
Dec 28 09:48:27 Workstation-B kernel: Call Trace:
Dec 28 09:48:27 Workstation-B kernel: [<ffffffff810a73c2>] ? smp_call_function+0x22/0x30
Dec 28 09:48:27 Workstation-B kernel: [<ffffffff813f665c>] ? cpuidle_latency_notify+0x1c/0x30
Dec 28 09:48:27 Workstation-B kernel: [<ffffffff814ef745>] ? notifier_call_chain+0x55/0x80
Dec 28 09:48:27 Workstation-B kernel: [<ffffffff81096c4a>] ? __blocking_notifier_call_chain+0x5a/0x80
Dec 28 09:48:27 Workstation-B kernel: [<ffffffff81096c86>] ? blocking_notifier_call_chain+0x16/0x20
Dec 28 09:48:27 Workstation-B kernel: [<ffffffff810973cd>] ? update_target+0x9d/0x110
Dec 28 09:48:27 Workstation-B kernel: [<ffffffff81097757>] ? pm_qos_add_requirement+0xa7/0xf0
Dec 28 09:48:27 Workstation-B kernel: [<ffffffffa033d871>] ? snd_pcm_hw_params+0x2f1/0x3b0 [snd_pcm]
Dec 28 09:48:27 Workstation-B kernel: [<ffffffffa033ecc8>] ? snd_pcm_common_ioctl1+0xa8/0xb40 [snd_pcm]
Dec 28 09:48:27 Workstation-B kernel: [<ffffffffa033fa62>] ? snd_pcm_playback_ioctl1+0x42/0x270 [snd_pcm]
Dec 28 09:48:27 Workstation-B kernel: [<ffffffffa034037d>] ? snd_pcm_playback_ioctl+0x3d/0x50 [snd_pcm]
Dec 28 09:48:27 Workstation-B kernel: [<ffffffff81189012>] ? vfs_ioctl+0x22/0xa0
Dec 28 09:48:27 Workstation-B kernel: [<ffffffff811891b4>] ? do_vfs_ioctl+0x84/0x580
Dec 28 09:48:27 Workstation-B kernel: [<ffffffff81189731>] ? sys_ioctl+0x81/0xa0
Dec 28 09:48:27 Workstation-B kernel: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
Dec 28 09:48:27 Workstation-B kernel: Code: e8 6e 52 44 00 0f ae f0 48 8b 7b 30 ff 15 99 bc 9e 00 80 7d c7 00 0f 84 9f fe ff ff f6 43 20 01 0f 84 95 fe ff ff 0f 1f 44 00 00 <f3> 90 f6 43 20 01 75 f8 e9 83 fe ff ff 0f 1f 00 4c 89 ea 4c 89
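
Read from the bottom up, the call trace above shows pulseaudio entering the kernel through an ioctl, passing through the snd_pcm sound module, and ending in smp_call_function. A small Python sketch can pull the function and module names out of such a trace so the path is easier to see. It is only a sketch: `FRAME_RE` and `parse_frames` are hypothetical names, and the "vmlinux" label for frames without a bracketed module name is my own choice of label.

```python
import re

# Matches frames such as
# "[<ffffffffa033d871>] ? snd_pcm_hw_params+0x2f1/0x3b0 [snd_pcm]".
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?:\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

def parse_frames(lines):
    """Return (function, module) pairs in the order they appear in the trace."""
    frames = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            # Frames without "[module]" are labelled "vmlinux" here (assumed label).
            frames.append((match.group("func"), match.group("module") or "vmlinux"))
    return frames

sample = [
    "Dec 28 09:48:27 Workstation-B kernel: Call Trace:",
    "Dec 28 09:48:27 Workstation-B kernel: [<ffffffff810a73c2>] ? smp_call_function+0x22/0x30",
    "Dec 28 09:48:27 Workstation-B kernel: [<ffffffff81097757>] ? pm_qos_add_requirement+0xa7/0xf0",
    "Dec 28 09:48:27 Workstation-B kernel: [<ffffffffa033d871>] ? snd_pcm_hw_params+0x2f1/0x3b0 [snd_pcm]",
    "Dec 28 09:48:27 Workstation-B kernel: [<ffffffff81189731>] ? sys_ioctl+0x81/0xa0",
]

# Print innermost frame first, as in the original trace.
for func, module in parse_frames(sample):
    print(f"{module:10s} {func}")
```

The header line "Call Trace:" is skipped because it contains no frame address.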

Ztcoracat 12-27-2012 10:13 PM

Hi:

I looked up this WARNING that you are having:

A member in the thread linked below says that this message (Gtk-WARNING: gtkwidget.c:5628: widget not within a GtkWindow) is a warning and not an error; it points to a programming fault rather than a configuration fault.
It is mentioned on page 2 of the thread.

http://forums.freebsd.org/showthread.php?p=100006

Re: Problem with XDMCP at Centos 6.x x64 (6.3 included)
http://www.centos.org/modules/newbb/...38212&forum=55

Did you just recently perform a fresh installation of your distribution?
If so, it is possible that the ISO image could have been corrupted.


Quote:

Workstation-B kernel: radeon 0000:03:00.0: IH ring buffer overflow (0x00000031, 0, 65600)
You said that the lock happens somewhere around there. I wonder if it may be a graphics issue?
Or a hardware failure? I'm not an expert, but I'm giving you a few things to consider.

Maybe another member will be more knowledgeable and be of more assistance to you but I'm trying to help.

How old is your computer?

These links may be of help:

http://www.centos.org/docs/5/html/5....oot-tools.html
http://www.openlogic.com/wazi/bid/18...S-Linux-Server

Hope this helps

JBJ1962 12-27-2012 11:29 PM

The crash on centos
 
Thanks a lot for your quick response :D

The Workstation is 8 months old, and the centos installation as well.

This crash happens even if I just leave everything alone: the machine works fine for a period of time and then crashes, with the error logs above.

After I saw this, I guessed that updating my installed packages might help, so I ran a yum update, but the problem was still there.

I also got these errors:

IRQ timing workaround is activated for card #1. Suggest a bigger bdl_pos_adj

rpcbind: rpcbind terminating on signal. Restart with "rpcbind -w"

I also ran this command on my workstation, which might be useful:

lsmod, which lists the kernel modules that are currently loaded:

fuse 66443 2
ebtable_nat 2009 0
ebtables 18135 1 ebtable_nat
ipt_MASQUERADE 2466 3
iptable_nat 6158 1
nf_nat 22759 2 ipt_MASQUERADE,iptable_nat
xt_CHECKSUM 1303 1
iptable_mangle 3349 1
bridge 79078 0
stp 2173 1 bridge
llc 5546 2 bridge,stp
autofs4 26705 3
sunrpc 261704 1
ipt_REJECT 2351 5
nf_conntrack_ipv4 9506 7 iptable_nat,nf_nat
nf_defrag_ipv4 1483 1 nf_conntrack_ipv4
iptable_filter 2793 1
ip_tables 17831 3 iptable_nat,iptable_mangle,iptable_filter
ip6t_REJECT 4628 2
nf_conntrack_ipv6 8748 2
nf_defrag_ipv6 11981 1 nf_conntrack_ipv6
xt_state 1492 6
nf_conntrack 79357 6 ipt_MASQUERADE,iptable_nat,nf_nat,nf_conntrack_ipv4,nf_conntrack_ipv6,xt_state
ip6table_filter 2889 1
ip6_tables 19458 1 ip6table_filter
ipv6 320841 157 ip6t_REJECT,nf_conntrack_ipv6,nf_defrag_ipv6
vhost_net 30424 0
macvtap 9948 1 vhost_net
macvlan 10084 1 macvtap
tun 17031 2 vhost_net
kvm_intel 52762 0
kvm 312245 1 kvm_intel
uinput 7992 0
ppdev 8537 0
parport_pc 22690 0
parport 36209 2 ppdev,parport_pc
snd_ctxfi 95426 2
snd_hda_codec_hdmi 25616 1
snd_hda_intel 26685 0
snd_hda_codec 111376 2 snd_hda_codec_hdmi,snd_hda_intel
snd_hwdep 6652 1 snd_hda_codec
snd_seq 55759 0
snd_seq_device 6500 1 snd_seq
snd_pcm 84894 4 snd_ctxfi,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec
snd_timer 22411 2 snd_seq,snd_pcm
snd 68869 13 snd_ctxfi,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_seq,snd_seq_device,snd_pcm,snd_timer
soundcore 7958 1 snd
snd_page_alloc 8470 3 snd_ctxfi,snd_hda_intel,snd_pcm
sg 29350 0
microcode 112653 0
dcdbas 9219 0
serio_raw 4594 0
i2c_i801 11135 0
iTCO_wdt 13934 0
iTCO_vendor_support 3088 1 iTCO_wdt
tg3 144713 0
i7core_edac 18024 0
edac_core 46581 1 i7core_edac
shpchp 32778 0
ext4 370397 3
mbcache 8144 1 ext4
jbd2 92864 1 ext4
sr_mod 15140 0
cdrom 39085 1 sr_mod
firewire_ohci 24695 0
firewire_core 50109 1 firewire_ohci
crc_itu_t 1717 1 firewire_core
sd_mod 38912 3
crc_t10dif 1541 1 sd_mod
ahci 40871 0
mptsas 52594 2
mptscsih 36700 1 mptsas
mptbase 93647 2 mptsas,mptscsih
scsi_transport_sas 35620 1 mptsas
wmi 6287 0
radeon 840112 2
ttm 80957 1 radeon
drm_kms_helper 33820 1 radeon
drm 246709 4 radeon,ttm,drm_kms_helper
i2c_algo_bit 5762 1 radeon
i2c_core 31084 5 i2c_i801,radeon,drm_kms_helper,drm,i2c_algo_bit
dm_mirror 14101 0
dm_region_hash 12042 1 dm_mirror
dm_log 9930 2 dm_mirror,dm_region_hash
dm_mod 80796 11 dm_mirror,dm_log
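
The lsmod listing above has three or four columns per line: module name, size, use count, and an optional comma-separated list of the modules that use it. A short Python sketch can turn that into a lookup table, for example to see which modules the radeon or snd_ctxfi drivers pull in. This is only an illustration; `parse_lsmod` and `used_by` are hypothetical helper names, and the sample is a handful of lines copied from the listing.

```python
def parse_lsmod(text):
    """Map module name -> (size, use_count, [modules that use it])."""
    modules = {}
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) < 3 or not parts[1].isdigit():
            continue  # skips a "Module Size Used by" header or malformed lines
        name, size, used = parts[0], int(parts[1]), int(parts[2])
        users = [u for u in parts[3].split(",") if u] if len(parts) > 3 else []
        modules[name] = (size, used, users)
    return modules

def used_by(modules, driver):
    """Return the modules whose 'used by' column names the given driver."""
    return sorted(name for name, (_, _, users) in modules.items() if driver in users)

sample = """\
radeon 840112 2
ttm 80957 1 radeon
drm_kms_helper 33820 1 radeon
drm 246709 4 radeon,ttm,drm_kms_helper
snd_ctxfi 95426 2
snd_pcm 84894 4 snd_ctxfi,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec
"""

modules = parse_lsmod(sample)
print("radeon depends on:", used_by(modules, "radeon"))
print("snd_ctxfi depends on:", used_by(modules, "snd_ctxfi"))
```

Note that the last column reads "used by": the line `ttm 80957 1 radeon` means radeon uses ttm, not the other way around.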

Ztcoracat 01-02-2013 09:45 PM

I have not been trained to read all of the kernel modules or to know what they should contain.

I did, however, find that ip6tables is used to set up, maintain and inspect the tables of IPv6 packet filter rules in the Linux kernel. Each table contains a number of built-in chains.

The 'ext4 370397 3' line in your lsmod output is the kernel module for the ext4 filesystem, which the partitions of your operating system use.

I think that 'firewire_ohci 24695 0' is one of your kernel modules (the FireWire controller driver).
However I'm still learning myself. Sorry I can't tell you more.

Do you remember; what were you doing just before your system started crashing?

Also; if you are not the only user on that workstation is there a chance that another employee could have accidentally deleted a file that is essential to the OS running?

If I were you, I would consider running Memtest86; it is a good tool for detecting faulty RAM.
That is a possibility as well.
http://www.memtest86.com/download.html


All times are GMT -5.