LinuxQuestions.org
Old 02-16-2014, 09:54 PM   #1
lpallard
Senior Member
 
Registered: Nov 2008
Posts: 1,050

Rep: Reputation: Disabled
Slack64-14.0 server crash entries in dmesg after a while running


Hi,

My home server running Slackware64-14.0 appears to function normally, but after a few hours or days (I am not sure exactly when), something in the running kernel seems to crash (a module?) and the following entries show up in dmesg (log trimmed down to the point where the problem appears):

Code:
[   31.117103] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[   40.440181] NET: Registered protocol family 10
[   40.640740] svc: failed to register lockdv1 RPC service (errno 97).
[   40.640907] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[   40.640915] NFSD: unable to find recovery directory /var/lib/nfs/v4recovery
[   40.640918] NFSD: starting 90-second grace period
[   51.210086] eth0: no IPv6 routers present
[   59.804882] xfsettingsd[4055]: segfault at 1 ip 000000000040c261 sp 00007fff28cd9210 error 4 in xfsettingsd[400000+14000]
[   62.337378] ata1.00: configured for UDMA/133
[   62.337381] ata1: EH complete
[   62.390605] ata2.00: configured for UDMA/133
[   62.390608] ata2: EH complete
[   62.418051] ata3.00: configured for UDMA/133
[   62.418055] ata3: EH complete
[   67.222901] EXT4-fs (md2): re-mounted. Opts: commit=0
[   67.225728] EXT4-fs (md0): re-mounted. Opts: commit=0
[   67.227650] EXT4-fs (md3): re-mounted. Opts: data=writeback,stripe=48,barrier=0,errors=remount-ro,commit=0
[264481.220106] INFO: task syslogd:2517 blocked for more than 120 seconds.
[264481.220112] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[264481.220117] syslogd         D ffff88101ec31c40     0  2517      1 0x00000000
[264481.220126]  ffff880fb42cbde8 0000000000000082 ffff880fb42cbd88 ffffffff00000000
[264481.220134]  ffff880fb8226720 ffff880fb42cbfd8 ffff880fb42cbfd8 ffff880fb42cbfd8
[264481.220141]  ffff880fb81044c0 ffff880fb8226720 0000000000000001 0000000100000246
[264481.220148] Call Trace:
[264481.220166]  [<ffffffff81b2fcff>] schedule+0x3f/0x60
[264481.220175]  [<ffffffff8126ae05>] jbd2_log_wait_commit+0xb5/0x130
[264481.220185]  [<ffffffff81074c90>] ? finish_wait+0x80/0x80
[264481.220192]  [<ffffffff8126cc61>] jbd2_complete_transaction+0x51/0xa0
[264481.220200]  [<ffffffff81217548>] ext4_sync_file+0x198/0x3a0
[264481.220210]  [<ffffffff81161795>] do_fsync+0x55/0x80
[264481.220217]  [<ffffffff81161ac0>] sys_fsync+0x10/0x20
[264481.220223]  [<ffffffff81b3246b>] system_call_fastpath+0x16/0x1b
[264481.220244] INFO: task mysqld:3971 blocked for more than 120 seconds.
[264481.220248] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[264481.220253] mysqld          D ffff88101ec11c40     0  3971   3057 0x00000000
[264481.220260]  ffff880ef3c17de8 0000000000000082 ffff880ef3c17d88 ffffffff810f2935
[264481.220267]  ffff880ef3e8d280 ffff880ef3c17fd8 ffff880ef3c17fd8 ffff880ef3c17fd8
[264481.220273]  ffff8804c6afd280 ffff880ef3e8d280 0000000000000001 0000000000000246
[264481.220286] Call Trace:
[264481.220289]  [<ffffffff810f2935>] ? pagevec_lookup_tag+0x25/0x40
[264481.220292]  [<ffffffff81b2fcff>] schedule+0x3f/0x60
[264481.220295]  [<ffffffff8126ae05>] jbd2_log_wait_commit+0xb5/0x130
[264481.220298]  [<ffffffff81074c90>] ? finish_wait+0x80/0x80
[264481.220300]  [<ffffffff8126cc61>] jbd2_complete_transaction+0x51/0xa0
[264481.220303]  [<ffffffff81217548>] ext4_sync_file+0x198/0x3a0
[264481.220307]  [<ffffffff81089cbd>] ? sys_futex+0x8d/0x190
[264481.220310]  [<ffffffff81161795>] do_fsync+0x55/0x80
[264481.220312]  [<ffffffff81161ac0>] sys_fsync+0x10/0x20
[264481.220314]  [<ffffffff81b3246b>] system_call_fastpath+0x16/0x1b
[265561.220103] INFO: task syslogd:2517 blocked for more than 120 seconds.
[265561.220109] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[265561.220115] syslogd         D ffff88101ecd1c40     0  2517      1 0x00000000
[265561.220124]  ffff880fb42cbde8 0000000000000082 ffff880fb42cbd88 ffffffff00000000
[265561.220132]  ffff880fb8226720 ffff880fb42cbfd8 ffff880fb42cbfd8 ffff880fb42cbfd8
[265561.220139]  ffff880fb8185280 ffff880fb8226720 0000000000000001 0000000100000246
[265561.220146] Call Trace:
[265561.220163]  [<ffffffff81b2fcff>] schedule+0x3f/0x60
[265561.220173]  [<ffffffff8126ae05>] jbd2_log_wait_commit+0xb5/0x130
[265561.220182]  [<ffffffff81074c90>] ? finish_wait+0x80/0x80
[265561.220189]  [<ffffffff8126cc61>] jbd2_complete_transaction+0x51/0xa0
[265561.220197]  [<ffffffff81217548>] ext4_sync_file+0x198/0x3a0
[265561.220207]  [<ffffffff81161795>] do_fsync+0x55/0x80
[265561.220214]  [<ffffffff81161ac0>] sys_fsync+0x10/0x20
[265561.220220]  [<ffffffff81b3246b>] system_call_fastpath+0x16/0x1b
I am not sure what exactly this is, but it wasn't there before. It started to appear in dmesg about 2 weeks ago and comes back after a server reboot, which suggests that something runs and then crashes. These error lines do not appear right after rebooting, but within hours or days of a system restart.

The following elements make me think of filesystem corruption: jbd2_log_wait_commit and ext4_sync_file.

What are they?
How can I troubleshoot this?
Why are mysqld and syslogd mentioned in these errors?

The server appears to be running normally. mdadm doesn't show any degraded RAID arrays, and SMART doesn't report anything unusual.

The server is running stock Slackware64 14.0 with this kernel:

Code:
Linux server 3.2.45 #2 SMP Fri May 31 20:14:55 CDT 2013 x86_64 AMD Opteron(tm) Processor 4334 AuthenticAMD GNU/Linux
Other than that:

Code:
root@lhost2:~# free -m
             total       used       free     shared    buffers     cached
Mem:         64315      63764        551          0       1684      47426
-/+ buffers/cache:      14653      49662
Swap:         7998          6       7992
Can anybody help with what seems to be a kernel problem?
Cheers!
Thanks!
 
Old 02-16-2014, 09:59 PM   #2
Emerson
LQ Sage
 
Registered: Nov 2004
Location: Saint Amant, Acadiana
Distribution: Gentoo ~amd64
Posts: 7,675

Rep: Reputation: Disabled
My first guess would be the hard drive is dying.
 
Old 02-16-2014, 10:42 PM   #3
lpallard
Senior Member
 
Registered: Nov 2008
Posts: 1,050

Original Poster
Rep: Reputation: Disabled
Quote:
My first guess would be the hard drive is dying.
OK, but which one? There are 11 drives in this server: 5 use ext4 (2 as RAID1, 3 as RAID0) and 6 use XFS (all as RAID5). So if a drive is dying, I suspect one of the 5 drives that use ext4, in either the RAID1 or the RAID0 array.

The 3 drives in the RAID0 array are quite old (7+ years), so I wouldn't be surprised if one of them is dying. The two drives in the RAID1 array are brand-new 2TB Seagates and are the system drives, so they are critical.

Is there any way to find out which one may be dying so I can plan accordingly, or do I have to wait for it to die before I can do anything?

I already have backups. I would like to know if (and if so, when) I need to buy a replacement drive to resilver ASAP.

Last edited by lpallard; 02-20-2014 at 12:27 PM.
 
Old 02-17-2014, 12:05 AM   #4
Richard Cranium
Senior Member
 
Registered: Apr 2009
Location: McKinney, Texas
Distribution: Slackware64 15.0
Posts: 3,860

Rep: Reputation: 2230
Well, that's one reason to run smartd. However,
Code:
man smartctl
should tell you what you can run to check the individual drives.
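Before checking drives one by one, it helps to know which drives sit behind which md array. A rough Python sketch of reading that out of /proc/mdstat follows; it assumes the usual `mdN : active raidX dev[n] ...` layout, and the sample text (array names, device names, block counts) is made up for illustration, not taken from this server. `mdadm --detail /dev/mdN` gives the same membership information.

```python
import re

# Hypothetical /proc/mdstat contents; array and device names are made up.
SAMPLE_MDSTAT = """\
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sdy2[0] sdz2[1](F)
      1953381376 blocks super 1.2 [2/1] [U_]

md1 : active raid0 sdv1[0] sdw1[1] sdx1[2]
      732563712 blocks super 1.2 512k chunks

unused devices: <none>
"""

# A member looks like "sdy2[0]" or "sdz2[1](F)"; the letter is a status flag.
MEMBER = re.compile(r"([\w-]+)\[\d+\](?:\((\w)\))?")


def parse_mdstat(text):
    """Return {array: {"level": str or None, "members": [(device, flag)]}}."""
    arrays = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(" : ")
        if not sep or not re.fullmatch(r"md\w+", name):
            continue  # not an array header line
        level = None
        members = []
        for token in rest.split():
            match = MEMBER.fullmatch(token)
            if match:
                members.append((match.group(1), match.group(2) or ""))
            elif level is None and token.startswith("raid"):
                level = token
        arrays[name] = {"level": level, "members": members}
    return arrays


for array, info in parse_mdstat(SAMPLE_MDSTAT).items():
    devices = ", ".join(dev + (f" ({flag})" if flag else "") for dev, flag in info["members"])
    print(f"{array} [{info['level']}]: {devices}")
```

On a live system the text would come from `open("/proc/mdstat").read()`, and each member's parent drive could then be checked with smartctl.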
 
Old 02-17-2014, 02:58 AM   #5
mlslk31
Member
 
Registered: Mar 2013
Location: Florida, USA
Distribution: Slackware, FreeBSD
Posts: 210

Rep: Reputation: 77
syslogd and mysqld seem to be waiting on a fsync that never happens, waiting on a journal commit that may not happen. ext4 is the filesystem (of course), and jbd2 is the journal system for ext4. The source of the problem could be anywhere, so try the easy stuff first.

Does a no-modify fsck of the problem ext4 partition report anything? syslogd would be writing to the partition where /var is located, unless you have it writing somewhere else. Also, when this happens, can you run the `sync` command or umount the problem partitions?

Is there any big task running just before fsync begins to degrade or hang?

I'll have to leave the hard stuff to the ext4 pros here...
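To see at a glance which tasks are stuck and where, the hung-task reports can be boiled down to one line per task. Below is a small Python sketch under some assumptions: the function names are mine, the regexes only target the 3.2-style trace format shown in this thread, and frames marked with `?` (which the kernel considers unreliable) are skipped. The sample is an excerpt of the trace from the first post.

```python
import re

# Excerpt of the dmesg output from the first post (register dump omitted).
SAMPLE_DMESG = """\
[264481.220106] INFO: task syslogd:2517 blocked for more than 120 seconds.
[264481.220117] syslogd         D ffff88101ec31c40     0  2517      1 0x00000000
[264481.220148] Call Trace:
[264481.220166]  [<ffffffff81b2fcff>] schedule+0x3f/0x60
[264481.220175]  [<ffffffff8126ae05>] jbd2_log_wait_commit+0xb5/0x130
[264481.220185]  [<ffffffff81074c90>] ? finish_wait+0x80/0x80
[264481.220192]  [<ffffffff8126cc61>] jbd2_complete_transaction+0x51/0xa0
[264481.220200]  [<ffffffff81217548>] ext4_sync_file+0x198/0x3a0
[264481.220210]  [<ffffffff81161795>] do_fsync+0x55/0x80
[264481.220217]  [<ffffffff81161ac0>] sys_fsync+0x10/0x20
[264481.220223]  [<ffffffff81b3246b>] system_call_fastpath+0x16/0x1b
"""

HUNG = re.compile(r"INFO: task (.+?):(\d+) blocked for more than (\d+) seconds")
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x")


def parse_hung_tasks(text):
    """Return one dict per hung-task report: task, pid, reliable frames."""
    reports = []
    for line in text.splitlines():
        hung = HUNG.search(line)
        if hung:
            reports.append({"task": hung.group(1), "pid": int(hung.group(2)), "frames": []})
            continue
        frame = FRAME.search(line)
        # Simplification: any frame seen after a report is attributed to it.
        if frame and reports and frame.group(1) is None:
            reports[-1]["frames"].append(frame.group(2))
    return reports


def blocked_in(report):
    """First frame below schedule(), i.e. the function that went to sleep."""
    for name in report["frames"]:
        if name != "schedule":
            return name
    return None


for report in parse_hung_tasks(SAMPLE_DMESG):
    print(f"{report['task']}[{report['pid']}] waiting in {blocked_in(report)}")
```

Read bottom to top, the frames show the path described above: the fsync system call, then ext4_sync_file, then the wait for the jbd2 journal commit.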
 
Old 02-17-2014, 07:18 PM   #6
lpallard
Senior Member
 
Registered: Nov 2008
Posts: 1,050

Original Poster
Rep: Reputation: Disabled
Quote:
Does a no-modify fsck of the problem ext4 partition report anything?
I will log in locally on the server, switch to runlevel 1 (init 1) and run the fsck commands. I don't want to run fsck on mounted filesystems unless there's a way to run it in a "read-only" mode where fsck only checks and changes nothing.

/var is located under /, which is mounted from /dev/md2, a RAID1 array of two identical Seagate 2TB drives. I imagine either of the two drives could be the one with a problem.

No, I cannot unmount the partition because, as I said, /var is under /.

Quote:
Is there any big task running just before fsync begins to degrade or hang?
It's difficult to say. There are no "big" tasks running at any time; this server operates in a more or less steady mode (all services are started at boot time and run alongside each other). No bigger tasks suddenly start. This server used to run BOINC jobs but hasn't run any in months.

I will back up once more. I also had an idea. On top of keeping the OS and critical data on a RAID1 array, doing incremental backups (on the same array but in a different folder, as protection against data corruption) and doing monthly backups onto a 3TB hot-swappable drive, I would like to burn the most critical data onto DVDs and use PAR2 to guard against corruption. That would be my last line of defense.

Can anybody point out a nice app or script that would take a folder, tar it, split it into 4.7GB chunks and compute parity so it can be burned onto DVDs? I'll do a search myself, but if someone knows a good solution, please share!
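For what it's worth, the tar-and-split half of that idea is simple enough to sketch in Python. This is only an illustration under assumptions: the function names and the output naming scheme (`<prefix>.tar.000`, `.001`, ...) are my own, and the parity step is left to an external PAR2 tool run over the resulting parts (for par2cmdline, something along the lines of `par2 create` with a redundancy option; check its man page for the exact flags).

```python
import os
import tarfile

BUFFER_BYTES = 1024 * 1024  # copy in 1 MiB pieces to keep memory use low


def split_file(path, chunk_bytes):
    """Split `path` into numbered parts of at most `chunk_bytes` each."""
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            data = src.read(min(BUFFER_BYTES, chunk_bytes))
            if not data:
                break  # nothing left to copy
            part = f"{path}.{index:03d}"
            written = 0
            with open(part, "wb") as dst:
                while data:
                    dst.write(data)
                    written += len(data)
                    if written >= chunk_bytes:
                        break  # this part is full
                    data = src.read(min(BUFFER_BYTES, chunk_bytes - written))
            parts.append(part)
            index += 1
    return parts


def tar_and_split(folder, out_prefix, chunk_bytes):
    """Tar `folder` (uncompressed), then split the archive into parts."""
    archive = out_prefix + ".tar"
    with tarfile.open(archive, "w") as tar:
        tar.add(folder, arcname=os.path.basename(os.path.normpath(folder)))
    return archive, split_file(archive, chunk_bytes)
```

A call such as `tar_and_split("/path/to/critical", "/scratch/backup", 4_000_000_000)` would leave some room on each 4.7GB disc for parity files; both paths are placeholders. The parts can be rejoined later with `cat backup.tar.* > backup.tar`.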

Thanks!
 
Old 02-18-2014, 12:08 AM   #7
mlslk31
Member
 
Registered: Mar 2013
Location: Florida, USA
Distribution: Slackware, FreeBSD
Posts: 210

Rep: Reputation: 77
It looks like `e2fsck -fnv <partition_to_check>` is what I'm hoping you'll run. It looks like it works perfectly on an unmounted partition but may throw a false error or two on a read-only partition. Should you need to fix the file system, you'll want a way to do so, be it a utility partition or the Slackware installer disc.

smartctl can be run while the system is up. It seems like `smartctl --test=long <drive_to_check>` can pick up defective sectors better than the shorter tests, but you might wait until you know that a drive is not in an imminent-failure state. `smartctl -a <drive_to_check>` and `smartctl --test=short <drive_to_check>` are safer options.
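When reading the `smartctl -a` attribute table, a few raw counters are the ones people usually look at first for media or cabling trouble (reallocated, pending and offline-uncorrectable sectors, reported uncorrectable errors, command timeouts, CRC errors). Here is a hedged Python sketch that parses the table and flags those; the choice of watched IDs is my own shortlist rather than an official rule, and the sample rows are copied from the sdj output posted further down in this thread.

```python
# Rows copied from the smartctl output for /dev/sdj later in this thread.
SAMPLE_TABLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   074   064   045    Old_age   Always       -       26 (Min/Max 22/28)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

# Assumption: attribute IDs whose non-zero raw value deserves a closer look.
WATCHED_RAW = {5, 187, 188, 197, 198, 199}


def parse_attributes(text):
    """Return {id: {"name", "value", "worst", "thresh", "raw"}} from the table."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split(None, 9)
        if len(parts) < 10 or not parts[0].isdigit():
            continue  # header or unrelated line
        raw_token = parts[9].split()[0]
        attrs[int(parts[0])] = {
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "raw": int(raw_token) if raw_token.isdigit() else None,
        }
    return attrs


def suspicious(attrs):
    """Names of attributes at/below threshold or with a non-zero watched counter."""
    flagged = []
    for attr_id, row in sorted(attrs.items()):
        below_threshold = row["thresh"] > 0 and row["value"] <= row["thresh"]
        nonzero_counter = attr_id in WATCHED_RAW and (row["raw"] or 0) > 0
        if below_threshold or nonzero_counter:
            flagged.append(row["name"])
    return flagged


print(suspicious(parse_attributes(SAMPLE_TABLE)) or "nothing flagged")
```

An empty result only means none of these particular counters moved; it does not prove the drive is healthy.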

I'm not suspecting drive failure yet, just trying to get the easy stuff out of the way. It could be that a kernel upgrade of some sort could fix the problem, too, but maybe not. Playing with the I/O scheduler settings might ease the pain a little bit, or not. That's the way it goes.
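On the I/O scheduler point: the active scheduler for a disk is exposed in /sys/block/<device>/queue/scheduler, with the current choice shown in square brackets. A minimal Python sketch for reading it follows; the helper names are mine, and the sample string is just a typical example for kernels of this era, not output from this server.

```python
def parse_scheduler(text):
    """Split e.g. 'noop deadline [cfq]' into (active, [all available])."""
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_scheduler(device):
    """Read the scheduler line for a block device such as 'sda'."""
    with open(f"/sys/block/{device}/queue/scheduler") as handle:
        return parse_scheduler(handle.read())


# Example string only; real values depend on the kernel configuration.
print(parse_scheduler("noop deadline [cfq]\n"))
```

Switching is done as root by writing one of the listed names into the same file, for example `echo deadline > /sys/block/sdX/queue/scheduler`; the change does not persist across reboots.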
 
Old 02-18-2014, 01:28 AM   #8
Richard Cranium
Senior Member
 
Registered: Apr 2009
Location: McKinney, Texas
Distribution: Slackware64 15.0
Posts: 3,860

Rep: Reputation: 2230
Note that
Code:
touch /etc/forcefsck
will force fsck on all filesystems on reboot.
 
Old 02-18-2014, 07:12 PM   #9
lpallard
Senior Member
 
Registered: Nov 2008
Posts: 1,050

Original Poster
Rep: Reputation: Disabled
It looks like my root FS is corrupted... I ran

e2fsck -fnv /dev/md2

and got:

Code:
e2fsck 1.42.6 (21-Sep-2012)
Warning!  /dev/md2 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
Pass 1: Checking inodes, blocks, and sizes
Deleted inode 16252950 has zero dtime.  Fix? no

Inodes that were part of a corrupted orphan linked list found.  Fix? no

Inode 16252957 was part of the orphaned inode list.  IGNORED.
Inode 16253042 was part of the orphaned inode list.  IGNORED.
Inode 16253185 was part of the orphaned inode list.  IGNORED.
Inode 16253189 was part of the orphaned inode list.  IGNORED.
Inode 16262088 was part of the orphaned inode list.  IGNORED.
Inode 16262094 was part of the orphaned inode list.  IGNORED.
Inode 18617680 was part of the orphaned inode list.  IGNORED.
Inode 18617683 was part of the orphaned inode list.  IGNORED.
Pass 2: Checking directory structure
Entry 'sess_v3mg0nkl81o8nlv6l1qgvmmio1' in /var/lib/php (18089831) has deleted/unused inode 18104057.  Clear? no

Entry 'sess_n9qf29n09p30afjt3skg95s0n6' in /var/lib/php (18089831) has deleted/unused inode 18104046.  Clear? no

Entry 'sess_gvgtimi5i9fbt78pt9ks827lv0' in /var/lib/php (18089831) has deleted/unused inode 18104050.  Clear? no

Entry 'sess_h27rmdr1890ssfltpulad5dli6' in /var/lib/php (18089831) has deleted/unused inode 18104037.  Clear? no

Entry 'sess_jjv77jafbi73tonllerhj16df7' in /var/lib/php (18089831) has deleted/unused inode 18104070.  Clear? no

Entry 'sess_gq51tddkhsklvr765pmjjcjq02' in /var/lib/php (18089831) has deleted/unused inode 18104042.  Clear? no

Entry 'sess_o0dntc5rqnor9jlp4pgjk06834' in /var/lib/php (18089831) has deleted/unused inode 18104064.  Clear? no

Entry 'sess_c2mp3un0kc7fpn6da199v70f40' in /var/lib/php (18089831) has deleted/unused inode 18104054.  Clear? no

Entry 'sess_hii6lfskdf06bsgkjnanr22r94' in /var/lib/php (18089831) has deleted/unused inode 18104058.  Clear? no

Entry 'sess_fh5qrsffvcknjlh320re9ns9e0' in /var/lib/php (18089831) has deleted/unused inode 18104068.  Clear? no

Entry 'sess_s5dlhbubp2h4svoui5en8qk623' in /var/lib/php (18089831) has deleted/unused inode 18104040.  Clear? no

Entry 'sess_j0h33pk971btg5pspf3gvpkmv3' in /var/lib/php (18089831) has deleted/unused inode 18104071.  Clear? no

Entry 'sess_v4iegh0nhl8pc4g0g6qg2akbh2' in /var/lib/php (18089831) has deleted/unused inode 18104062.  Clear? no

Entry 'sess_fmfdfhcck87immgvi41j3ql0d3' in /var/lib/php (18089831) has deleted/unused inode 18104059.  Clear? no

Entry 'sess_ch6nr1nmhm9373i95v905r2tl3' in /var/lib/php (18089831) has deleted/unused inode 18104048.  Clear? no

Entry 'sess_rigfpl1pe2kptc75enbt70b916' in /var/lib/php (18089831) has deleted/unused inode 18104061.  Clear? no

Entry 'sess_keo7rdodle9epf6dku4ede3qd6' in /var/lib/php (18089831) has deleted/unused inode 18104055.  Clear? no

Entry 'sess_asm2p562h019au3us897cchih7' in /var/lib/php (18089831) has deleted/unused inode 18104069.  Clear? no

Entry 'sess_hu3oeto2hlm09lrhtui203s4k0' in /var/lib/php (18089831) has deleted/unused inode 18104067.  Clear? no

Entry 'sess_gm0d62ui9trikqhfgkca1fbfg2' in /var/lib/php (18089831) has deleted/unused inode 18104066.  Clear? no

Entry 'sess_vsk1at2ij3ishk0c2qp6v89ne5' in /var/lib/php (18089831) has deleted/unused inode 18104052.  Clear? no

Entry 'sess_1hhpb36bn9hg4nu9399oqvfal3' in /var/lib/php (18089831) has deleted/unused inode 18104060.  Clear? no

Entry 'sess_oa8c31e1c2upfrmi951q4lqqa4' in /var/lib/php (18089831) has deleted/unused inode 18104038.  Clear? no

Entry 'sess_msrp3vr9pdq2qom2v19sbp5vu2' in /var/lib/php (18089831) has deleted/unused inode 18104072.  Clear? no

Entry 'sess_st0jcbhpnrlhg3vhdpot1d9q33' in /var/lib/php (18089831) has deleted/unused inode 18104047.  Clear? no

Entry 'sess_8dag6p6fhktrgsb69hljjjuip4' in /var/lib/php (18089831) has deleted/unused inode 18104043.  Clear? no

Entry 'sess_ih4l8od33laokmqbr8kfkka317' in /var/lib/php (18089831) has deleted/unused inode 18104036.  Clear? no

Entry 'sess_i0br5ihejf73r5kcdj0m63iga1' in /var/lib/php (18089831) has deleted/unused inode 18104041.  Clear? no

Entry 'sess_q2jj8jgr06p88s0iupagi8pj95' in /var/lib/php (18089831) has deleted/unused inode 18104056.  Clear? no

Entry 'sess_g284i0krkgal5nd93qq1nspai3' in /var/lib/php (18089831) has deleted/unused inode 18104049.  Clear? no

Entry 'sess_2pt6mqf0mphi88dc941t7rkd30' in /var/lib/php (18089831) has deleted/unused inode 18104063.  Clear? no

Entry 'sess_glh6hda12njl030m2ln3bi8mj3' in /var/lib/php (18089831) has deleted/unused inode 18104044.  Clear? no

Entry 'sess_b6sbseg2vspi7nlnbp72l76rp2' in /var/lib/php (18089831) has deleted/unused inode 18104039.  Clear? no

Entry 'sess_n2t2k91k3qbp30811bmd8m3ma3' in /var/lib/php (18089831) has deleted/unused inode 18104051.  Clear? no

Entry 'sess_486ln3khla3q0cdp8pil4hb8j2' in /var/lib/php (18089831) has deleted/unused inode 18104045.  Clear? no

Entry 'sess_fe4od6mf4o9pvr8ftt86gbj7h3' in /var/lib/php (18089831) has deleted/unused inode 18104065.  Clear? no

Entry 'sess_tadsau4ptmtl3cj3hg3r8l9fa7' in /var/lib/php (18089831) has deleted/unused inode 18104053.  Clear? no

^C/dev/md2: e2fsck canceled.

/dev/md2: ********** WARNING: Filesystem still has errors **********
I let it run until that point, then stopped it so I could force the filesystem check on reboot as Richard Cranium suggested.

The server is now running well. It was running well before too, but I will keep an eye on the console to see whether dmesg has anything more for me.
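A long e2fsck listing like the one above is easier to judge when it is summarized. This is a rough Python sketch that counts orphaned inodes and groups the "deleted/unused inode" entries by directory; the regexes only cover the two message types seen in this output, and the sample is a short excerpt of it.

```python
import re
from collections import Counter

# Short excerpt of the e2fsck output above.
SAMPLE_FSCK = """\
Inode 16252957 was part of the orphaned inode list.  IGNORED.
Inode 16253042 was part of the orphaned inode list.  IGNORED.
Entry 'sess_v3mg0nkl81o8nlv6l1qgvmmio1' in /var/lib/php (18089831) has deleted/unused inode 18104057.  Clear? no
Entry 'sess_n9qf29n09p30afjt3skg95s0n6' in /var/lib/php (18089831) has deleted/unused inode 18104046.  Clear? no
"""

ENTRY = re.compile(r"^Entry '(.+)' in (.+?) \((\d+)\) has deleted/unused inode (\d+)\.")
ORPHAN = re.compile(r"^Inode (\d+) was part of the orphaned inode list")


def summarize_e2fsck(text):
    """Count orphaned inodes and stale directory entries per directory."""
    stale_by_dir = Counter()
    orphans = 0
    for line in text.splitlines():
        entry = ENTRY.match(line)
        if entry:
            stale_by_dir[entry.group(2)] += 1
        elif ORPHAN.match(line):
            orphans += 1
    return {"orphaned_inodes": orphans, "stale_entries_by_dir": dict(stale_by_dir)}


print(summarize_e2fsck(SAMPLE_FSCK))
```

In the part of the run shown here, every stale entry is a PHP session file under /var/lib/php, which may simply reflect files being created and deleted while the check ran on a mounted filesystem.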
 
Old 02-18-2014, 08:56 PM   #10
mlslk31
Member
 
Registered: Mar 2013
Location: Florida, USA
Distribution: Slackware, FreeBSD
Posts: 210

Rep: Reputation: 77
That didn't look too bad, actually. Should you still have an issue, feel free to report back.
 
Old 02-19-2014, 07:32 PM   #11
lpallard
Senior Member
 
Registered: Nov 2008
Posts: 1,050

Original Poster
Rep: Reputation: Disabled
It didn't take too long for me to post back here. Once again today, I went to log in to my OpenKM setup, and it took longer than usual. I suspected the problem had come back, and I was right:

Code:
[74161.224095] INFO: task syslogd:2542 blocked for more than 120 seconds.
[74161.224101] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[74161.224107] syslogd         D ffff88101ec91c40     0  2542      1 0x00000000
[74161.224116]  ffff880d1b6afde8 0000000000000086 ffff880d1b6afd88 ffffffff00000000
[74161.224123]  ffff880fb65c0000 ffff880d1b6affd8 ffff880d1b6affd8 ffff880d1b6affd8
[74161.224130]  ffff880fb8181b80 ffff880fb65c0000 0000000000000001 0000000100000246
[74161.224138] Call Trace:
[74161.224155]  [<ffffffff81b2fcff>] schedule+0x3f/0x60
[74161.224164]  [<ffffffff8126ae05>] jbd2_log_wait_commit+0xb5/0x130
[74161.224173]  [<ffffffff81074c90>] ? finish_wait+0x80/0x80
[74161.224181]  [<ffffffff8126cc61>] jbd2_complete_transaction+0x51/0xa0
[74161.224188]  [<ffffffff81217548>] ext4_sync_file+0x198/0x3a0
[74161.224198]  [<ffffffff81161795>] do_fsync+0x55/0x80
[74161.224204]  [<ffffffff81161ac0>] sys_fsync+0x10/0x20
[74161.224210]  [<ffffffff81b3246b>] system_call_fastpath+0x16/0x1b
[74161.224235] INFO: task mysqld:4003 blocked for more than 120 seconds.
[74161.224238] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[74161.224243] mysqld          D ffff88101ec11c40     0  4003   3089 0x00000000
[74161.224250]  ffff880d2af33de8 0000000000000086 ffff880d2af33d88 ffffffff00000000
[74161.224256]  ffff880d2ca53700 ffff880d2af33fd8 ffff880d2af33fd8 ffff880d2af33fd8
[74161.224270]  ffffffff8200d020 ffff880d2ca53700 0000000000000001 0000000100000246
[74161.224272] Call Trace:
[74161.224275]  [<ffffffff81b2fcff>] schedule+0x3f/0x60
[74161.224278]  [<ffffffff8126ae05>] jbd2_log_wait_commit+0xb5/0x130
[74161.224280]  [<ffffffff81074c90>] ? finish_wait+0x80/0x80
[74161.224283]  [<ffffffff8126cc61>] jbd2_complete_transaction+0x51/0xa0
[74161.224286]  [<ffffffff81217548>] ext4_sync_file+0x198/0x3a0
[74161.224290]  [<ffffffff81089cbd>] ? sys_futex+0x8d/0x190
[74161.224293]  [<ffffffff81161795>] do_fsync+0x55/0x80
[74161.224296]  [<ffffffff81161ac0>] sys_fsync+0x10/0x20
[74161.224298]  [<ffffffff81b3246b>] system_call_fastpath+0x16/0x1b
[74161.224314] INFO: task python:12848 blocked for more than 120 seconds.
[74161.224315] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[74161.224317] python          D ffff88101ec11c40     0 12848      1 0x00000000
[74161.224319]  ffff880b0c1a1de8 0000000000000082 ffff880b0c1a1d88 ffffffff00000000
[74161.224322]  ffff880fb5c4bde0 ffff880b0c1a1fd8 ffff880b0c1a1fd8 ffff880b0c1a1fd8
[74161.224325]  ffffffff8200d020 ffff880fb5c4bde0 0000000000000001 0000000100000246
[74161.224327] Call Trace:
[74161.224330]  [<ffffffff81b2fcff>] schedule+0x3f/0x60
[74161.224333]  [<ffffffff8126ae05>] jbd2_log_wait_commit+0xb5/0x130
[74161.224335]  [<ffffffff81074c90>] ? finish_wait+0x80/0x80
[74161.224338]  [<ffffffff8126cc61>] jbd2_complete_transaction+0x51/0xa0
[74161.224341]  [<ffffffff81217548>] ext4_sync_file+0x198/0x3a0
[74161.224344]  [<ffffffff811348a1>] ? rw_verify_area+0x61/0xf0
[74161.224348]  [<ffffffff81161795>] do_fsync+0x55/0x80
[74161.224350]  [<ffffffff81161ac0>] sys_fsync+0x10/0x20
[74161.224352]  [<ffffffff81b3246b>] system_call_fastpath+0x16/0x1b
[75121.224107] INFO: task syslogd:2542 blocked for more than 120 seconds.
[75121.224113] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75121.224119] syslogd         D ffff88101ecd1c40     0  2542      1 0x00000000
[75121.224128]  ffff880d1b6afde8 0000000000000086 ffff880d1b6afd88 ffffffff00000000
[75121.224135]  ffff880fb65c0000 ffff880d1b6affd8 ffff880d1b6affd8 ffff880d1b6affd8
[75121.224142]  ffff880fb8185280 ffff880fb65c0000 0000000000000001 0000000100000246
[75121.224150] Call Trace:
[75121.224167]  [<ffffffff81b2fcff>] schedule+0x3f/0x60
[75121.224176]  [<ffffffff8126ae05>] jbd2_log_wait_commit+0xb5/0x130
[75121.224186]  [<ffffffff81074c90>] ? finish_wait+0x80/0x80
[75121.224193]  [<ffffffff8126cc61>] jbd2_complete_transaction+0x51/0xa0
[75121.224201]  [<ffffffff81217548>] ext4_sync_file+0x198/0x3a0
[75121.224210]  [<ffffffff81161795>] do_fsync+0x55/0x80
[75121.224217]  [<ffffffff81161ac0>] sys_fsync+0x10/0x20
[75121.224223]  [<ffffffff81b3246b>] system_call_fastpath+0x16/0x1b
[75121.224238] INFO: task crond:3068 blocked for more than 120 seconds.
[75121.224242] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75121.224246] crond           D ffff88101ed71c40     0  3068      1 0x00000000
[75121.224253]  ffff880d1b653a58 0000000000000086 ffff880d1b6539f8 ffffffff00000000
[75121.224260]  ffff880fb67414a0 ffff880d1b653fd8 ffff880d1b653fd8 ffff880d1b653fd8
[75121.224266]  ffff880fb81de040 ffff880fb67414a0 ffff880d1b653a38 0000000100000246
[75121.224273] Call Trace:
[75121.224286]  [<ffffffff81b2fcff>] schedule+0x3f/0x60
[75121.224288]  [<ffffffff8126595d>] do_get_write_access+0x2cd/0x4c0
[75121.224291]  [<ffffffff81164139>] ? __find_get_block+0xa9/0x1f0
[75121.224294]  [<ffffffff81074cd0>] ? autoremove_wake_function+0x40/0x40
[75121.224297]  [<ffffffff811646cd>] ? __getblk+0x2d/0x2e0
[75121.224300]  [<ffffffff8114d20a>] ? inode_init_always+0x10a/0x1c0
[75121.224303]  [<ffffffff81265ca0>] jbd2_journal_get_write_access+0x30/0x50
[75121.224306]  [<ffffffff8124892d>] __ext4_journal_get_write_access+0x3d/0x80
[75121.224309]  [<ffffffff81218729>] ext4_new_inode+0x1b9/0x10f0
[75121.224313]  [<ffffffff8123b463>] ? ext4_journal_start_sb+0x73/0x1b0
[75121.224316]  [<ffffffff81226062>] ext4_create+0x92/0x160
[75121.224320]  [<ffffffff81140734>] vfs_create+0xb4/0x120
[75121.224323]  [<ffffffff81143d74>] do_last+0x624/0x900
[75121.224325]  [<ffffffff8114415e>] path_openat+0xce/0x3e0
[75121.224328]  [<ffffffff811444f8>] ? user_path_at_empty+0x68/0xa0
[75121.224331]  [<ffffffff81144591>] do_filp_open+0x41/0xa0
[75121.224334]  [<ffffffff811504ad>] ? alloc_fd+0x4d/0x140
[75121.224338]  [<ffffffff81133f44>] do_sys_open+0xf4/0x1e0
[75121.224340]  [<ffffffff81134050>] sys_open+0x20/0x30
[75121.224342]  [<ffffffff81b3246b>] system_call_fastpath+0x16/0x1b
[75121.224350] INFO: task mysqld:4001 blocked for more than 120 seconds.
[75121.224352] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75121.224353] mysqld          D ffff88101ec11c40     0  4001   3089 0x00000000
[75121.224356]  ffff880d2af47988 0000000000000086 0000000000000000 0000000000000000
[75121.224359]  ffff880d2ca52940 ffff880d2af47fd8 ffff880d2af47fd8 ffff880d2af47fd8
[75121.224361]  ffffffff8200d020 ffff880d2ca52940 ffff880d2af47968 0000000100000246
[75121.224364] Call Trace:
[75121.224367]  [<ffffffff81b2fcff>] schedule+0x3f/0x60
[75121.224369]  [<ffffffff8126595d>] do_get_write_access+0x2cd/0x4c0
[75121.224372]  [<ffffffff81074cd0>] ? autoremove_wake_function+0x40/0x40
[75121.224375]  [<ffffffff81265ca0>] jbd2_journal_get_write_access+0x30/0x50
[75121.224377]  [<ffffffff8124892d>] __ext4_journal_get_write_access+0x3d/0x80
[75121.224380]  [<ffffffff8121f068>] ext4_reserve_inode_write+0x78/0xa0
[75121.224383]  [<ffffffff8122160c>] ? ext4_dirty_inode+0x3c/0x60
[75121.224385]  [<ffffffff8121f0e8>] ext4_mark_inode_dirty+0x58/0x1f0
[75121.224388]  [<ffffffff8122160c>] ext4_dirty_inode+0x3c/0x60
[75121.224391]  [<ffffffff8115cc1f>] __mark_inode_dirty+0x3f/0x220
[75121.224394]  [<ffffffff8114ddf2>] file_update_time+0xd2/0x140
[75121.224398]  [<ffffffff810e9068>] __generic_file_aio_write+0x1f8/0x440
[75121.224402]  [<ffffffff81078a12>] ? hrtimer_cancel+0x22/0x30
[75121.224405]  [<ffffffff810e9321>] generic_file_aio_write+0x71/0xe0
[75121.224407]  [<ffffffff81216e8f>] ext4_file_write+0xaf/0x250
[75121.224411]  [<ffffffff810874d5>] ? futex_wake+0x105/0x130
[75121.224413]  [<ffffffff811342e6>] do_sync_write+0xe6/0x120
[75121.224416]  [<ffffffff810892a0>] ? do_futex+0x100/0xa90
[75121.224420]  [<ffffffff81533a6c>] ? security_file_permission+0x2c/0xb0
[75121.224423]  [<ffffffff811348a1>] ? rw_verify_area+0x61/0xf0
[75121.224426]  [<ffffffff81134bbc>] vfs_write+0xac/0x180
[75121.224428]  [<ffffffff81134eea>] sys_write+0x4a/0x90
[75121.224431]  [<ffffffff81b3246b>] system_call_fastpath+0x16/0x1b
[75121.224433] INFO: task mysqld:4003 blocked for more than 120 seconds.
[75121.224434] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75121.224436] mysqld          D ffff88101ecd1c40     0  4003   3089 0x00000000
[75121.224439]  ffff880d2af33de8 0000000000000086 ffff880d2af33d88 ffffffff00000000
[75121.224441]  ffff880d2ca53700 ffff880d2af33fd8 ffff880d2af33fd8 ffff880d2af33fd8
[75121.224444]  ffff880fb8185280 ffff880d2ca53700 0000000000000001 0000000100000246
[75121.224447] Call Trace:
[75121.224449]  [<ffffffff81b2fcff>] schedule+0x3f/0x60
[75121.224452]  [<ffffffff8126ae05>] jbd2_log_wait_commit+0xb5/0x130
[75121.224454]  [<ffffffff81074c90>] ? finish_wait+0x80/0x80
[75121.224457]  [<ffffffff8126cc61>] jbd2_complete_transaction+0x51/0xa0
[75121.224460]  [<ffffffff81217548>] ext4_sync_file+0x198/0x3a0
[75121.224462]  [<ffffffff81089cbd>] ? sys_futex+0x8d/0x190
[75121.224465]  [<ffffffff81161795>] do_fsync+0x55/0x80
[75121.224468]  [<ffffffff81161ac0>] sys_fsync+0x10/0x20
[75121.224470]  [<ffffffff81b3246b>] system_call_fastpath+0x16/0x1b
[75121.224472] INFO: task mysqld:4004 blocked for more than 120 seconds.
[75121.224473] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75121.224475] mysqld          D ffff88101ecd1c40     0  4004   3089 0x00000000
[75121.224478]  ffff880d2af35de8 0000000000000086 ffff880d2af35d88 ffffffff00000000
[75121.224480]  ffff880d2ca53de0 ffff880d2af35fd8 ffff880d2af35fd8 ffff880d2af35fd8
[75121.224483]  ffff880fb8185280 ffff880d2ca53de0 0000000000000001 0000000100000246
[75121.224485] Call Trace:
[75121.224488]  [<ffffffff81b2fcff>] schedule+0x3f/0x60
[75121.224491]  [<ffffffff8126ae05>] jbd2_log_wait_commit+0xb5/0x130
[75121.224493]  [<ffffffff81074c90>] ? finish_wait+0x80/0x80
[75121.224496]  [<ffffffff8126cc61>] jbd2_complete_transaction+0x51/0xa0
[75121.224498]  [<ffffffff81217548>] ext4_sync_file+0x198/0x3a0
[75121.224502]  [<ffffffff81161795>] do_fsync+0x55/0x80
[75121.224504]  [<ffffffff81161ac0>] sys_fsync+0x10/0x20
[75121.224506]  [<ffffffff81b3246b>] system_call_fastpath+0x16/0x1b
[75121.224521] INFO: task python:12848 blocked for more than 120 seconds.
[75121.224523] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[75121.224524] python          D ffff88101ed11c40     0 12848      1 0x00000000
[75121.224527]  ffff880b0c1a1de8 0000000000000082 ffff880b0c1a1d88 ffffffff00000000
[75121.224530]  ffff880fb5c4bde0 ffff880b0c1a1fd8 ffff880b0c1a1fd8 ffff880b0c1a1fd8
[75121.224532]  ffff880fb81d8dc0 ffff880fb5c4bde0 0000000000000001 0000000100000246
[75121.224535] Call Trace:
[75121.224537]  [<ffffffff81b2fcff>] schedule+0x3f/0x60
[75121.224540]  [<ffffffff8126ae05>] jbd2_log_wait_commit+0xb5/0x130
[75121.224543]  [<ffffffff81074c90>] ? finish_wait+0x80/0x80
[75121.224545]  [<ffffffff8126cc61>] jbd2_complete_transaction+0x51/0xa0
[75121.224548]  [<ffffffff81217548>] ext4_sync_file+0x198/0x3a0
[75121.224551]  [<ffffffff81161795>] do_fsync+0x55/0x80
[75121.224553]  [<ffffffff81161ac0>] sys_fsync+0x10/0x20
[75121.224556]  [<ffffffff81b3246b>] system_call_fastpath+0x16/0x1b
The machine is sluggish, and opening files or browsing the filesystem is slow as well. What next?
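While the machine is in this state, one thing that can be checked without waiting for the next 120-second report is which processes are currently in uninterruptible sleep (the `D` shown next to each task name in the traces). The sketch below reads /proc/<pid>/stat with Python; the function names are mine, and it only looks at each process's main thread, not the per-thread entries under /proc/<pid>/task.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')'.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left])
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for processes whose main thread is in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the line was not in the expected form
        if state == "D":
            found.append((pid, comm))
    return sorted(found)
```

Calling `uninterruptible_tasks()` a few times, some seconds apart, shows whether the same processes stay stuck or whether they are just briefly waiting on disk.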
 
Old 02-19-2014, 07:44 PM   #12
lpallard
Senior Member
 
Registered: Nov 2008
Posts: 1,050

Original Poster
Rep: Reputation: Disabled
I ran

smartctl --test=short /dev/sdj

(sdj being one of the drives participating in the "faulty" RAID array), and then I ran

smartctl -a /dev/sdj

to retrieve the test results:

Code:
bash-4.2# smartctl -a /dev/sdj
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.2.45] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda (SATA 3Gb/s, 4K Sectors)
Device Model:     ST2000DM001-1CH164
Serial Number:    S1E1REY8
LU WWN Device Id: 5 000c50 060fb47fd
Firmware Version: CC24
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Wed Feb 19 19:42:42 2014 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(  575) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 ( 217) minutes.
Conveyance self-test routine
recommended polling time: 	 (   2) minutes.
SCT capabilities: 	       (0x3085)	SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   105   099   006    Pre-fail  Always       -       7776808
  3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       39
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   080   060   030    Pre-fail  Always       -       101348117
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       3869
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       39
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   096   096   000    Old_age   Always       -       4
190 Airflow_Temperature_Cel 0x0022   074   064   045    Old_age   Always       -       26 (Min/Max 22/28)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       12
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       90
194 Temperature_Celsius     0x0022   026   040   000    Old_age   Always       -       26 (0 18 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       121229746900763
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       33625870858
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       10563155383

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      3869         -
# 2  Short offline       Completed without error       00%      3858         -
# 3  Short offline       Completed without error       00%      3834         -
# 4  Short offline       Completed without error       00%      3810         -
# 5  Short offline       Completed without error       00%      3786         -
# 6  Short offline       Completed without error       00%      3762         -
# 7  Short offline       Completed without error       00%      3738         -
# 8  Short offline       Completed without error       00%      3714         -
# 9  Short offline       Completed without error       00%      3690         -
#10  Short offline       Completed without error       00%      3666         -
#11  Short offline       Completed without error       00%      3642         -
#12  Short offline       Completed without error       00%      3618         -
#13  Short offline       Completed without error       00%      3594         -
#14  Short offline       Completed without error       00%      3570         -
#15  Short offline       Completed without error       00%      3546         -
#16  Short offline       Completed without error       00%      3522         -
#17  Short offline       Completed without error       00%      3498         -
#18  Short offline       Completed without error       00%      3474         -
#19  Short offline       Completed without error       00%      3450         -
#20  Short offline       Completed without error       00%      3426         -
#21  Short offline       Completed without error       00%      3402         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Then for the other drive (sdk):

Code:
bash-4.2# smartctl -a /dev/sdk
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.2.45] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda (SATA 3Gb/s, 4K Sectors)
Device Model:     ST2000DM001-1CH164
Serial Number:    S1E1RH1L
LU WWN Device Id: 5 000c50 060fae855
Firmware Version: CC24
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Wed Feb 19 19:46:27 2014 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(  575) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 ( 210) minutes.
Conveyance self-test routine
recommended polling time: 	 (   2) minutes.
SCT capabilities: 	       (0x3085)	SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   118   099   006    Pre-fail  Always       -       189150976
  3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       42
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   080   060   030    Pre-fail  Always       -       4395936207
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       3873
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       42
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   089   089   000    Old_age   Always       -       11
190 Airflow_Temperature_Cel 0x0022   073   064   045    Old_age   Always       -       27 (Min/Max 22/29)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       13
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       108
194 Temperature_Celsius     0x0022   027   040   000    Old_age   Always       -       27 (0 18 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       158832185577246
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       36909133199
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       7977923618

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      3873         -
# 2  Short offline       Completed without error       00%      3862         -
# 3  Short offline       Completed without error       00%      3838         -
# 4  Short offline       Completed without error       00%      3814         -
# 5  Short offline       Completed without error       00%      3790         -
# 6  Short offline       Completed without error       00%      3766         -
# 7  Short offline       Completed without error       00%      3742         -
# 8  Short offline       Completed without error       00%      3718         -
# 9  Short offline       Completed without error       00%      3694         -
#10  Short offline       Completed without error       00%      3670         -
#11  Short offline       Completed without error       00%      3646         -
#12  Short offline       Completed without error       00%      3622         -
#13  Short offline       Completed without error       00%      3598         -
#14  Short offline       Completed without error       00%      3574         -
#15  Short offline       Completed without error       00%      3550         -
#16  Short offline       Completed without error       00%      3526         -
#17  Short offline       Completed without error       00%      3502         -
#18  Short offline       Completed without error       00%      3478         -
#19  Short offline       Completed without error       00%      3454         -
#20  Short offline       Completed without error       00%      3430         -
#21  Short offline       Completed without error       00%      3406         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
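
For a quick look at the counters that usually move first when a disk starts to go bad, a small filter over the attribute table can help. This is only a sketch: `smart_nonzero` is a made-up name, and picking IDs 5, 187, 197 and 198 is my assumption about which attributes matter most, not something smartctl itself defines. On both drives above all four raw values are 0, so it would print nothing for them.

```shell
# Hypothetical helper: reads a smartctl attribute table on stdin and prints
# ID, name and raw value for the watched attributes whose raw value is non-zero.
# Watched IDs (an assumption): 5 Reallocated_Sector_Ct, 187 Reported_Uncorrect,
# 197 Current_Pending_Sector, 198 Offline_Uncorrectable.
smart_nonzero() {
    # Field 1 is the attribute ID, field 2 the name, field 10 the RAW_VALUE.
    awk '$1 ~ /^(5|187|197|198)$/ && $10 + 0 != 0 { print $1, $2, $10 }'
}
```

It could be fed with something like `smartctl -A /dev/sdk | smart_nonzero`; empty output means none of the watched counters has moved.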
Code:
root@lhost2:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] 
md2 : active raid1 sdj5[0] sdk5[1]
      1945058304 blocks [2/2] [UU]
      
md1 : active raid1 sdj2[2] sdk2[3]
      8190976 blocks super 1.2 [2/2] [UU]
      
md0 : active raid1 sdj1[0] sdk1[1]
      262080 blocks [2/2] [UU]
      
md3 : active raid0 sdd1[0] sdi1[2] sde1[1]
      869337856 blocks super 1.2 64k chunks
      
md4 : active raid5 sda1[7] sdh1[6] sdg1[4] sdf1[3] sdc1[2] sdb1[1]
      9766909440 blocks super 1.2 level 5, 512k chunk, algorithm 2 [6/6] [UUUUUU]
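
All of the redundant arrays above show every member up ([UU] or [UUUUUU]). To have a script flag an array that has lost a member, something like this could work. It is a sketch only: `degraded_arrays` is a hypothetical name, and it assumes the usual /proc/mdstat layout where a missing member shows as an underscore in the bracketed map.

```shell
# Hypothetical helper: print the name of every md array whose member map
# (e.g. [U_]) contains an underscore, meaning a device is missing.
# $1 is the file to read; it defaults to /proc/mdstat.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/          { name = $1 }     # remember the current array
        /\[[U_]*_[U_]*\]/      { print name }    # map with at least one "_"
    ' "${1:-/proc/mdstat}"
}
```

With the output above it prints nothing. raid0 arrays such as md3 have no member map, so they are never reported.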
I'm not so worried about an actual drive failing as I am about data corruption. What if data has already begun to corrupt and I back it up? Then my backup will be corrupted as well... I do use rsync with monthly incremental backups, but even with that I am not fully protected.

If I am losing a drive AGAIN, I am spending the money to go down the SAS enterprise road. I'm tired of wasting drives that are obviously not designed to run 24/7.
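
On the corruption worry: one way to avoid copying a silently damaged file into the backup would be to keep a checksum manifest of the data and verify it before each rsync run. The following is a rough sketch, not something I have in place. The function names are hypothetical, and it assumes `sha256sum`, `find -print0`, `sort -z` and `xargs -0 -r` behave as in GNU coreutils/findutils.

```shell
# Hypothetical helpers; the names are illustrative only.

# Write a sorted SHA-256 manifest of every regular file under $1 to $2.
# Keep the manifest outside $1, otherwise it would be hashed as well.
make_manifest() {
    ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$2"
}

# Re-hash the files under $1 and compare them against manifest $2.
# Returns non-zero if any listed file is missing or its content changed.
verify_manifest() {
    ( cd "$1" && sha256sum -c - ) < "$2"
}
```

The idea would be to run `make_manifest` once the data is known to be good, then run `verify_manifest` before each backup and only start rsync if it succeeds. Files that were edited on purpose will also be reported as FAILED, so this only suits data that is supposed to stay unchanged between runs.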

Last edited by lpallard; 02-19-2014 at 07:54 PM.
 
Old 02-19-2014, 07:55 PM   #13
lpallard
MySQL seems not to like this at all:

Code:
140216 23:45:03 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140216 23:45:05 [Warning] Using unique option prefix myisam_recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
140216 23:45:05 [Note] Plugin 'FEDERATED' is disabled.
140216 23:45:05 InnoDB: The InnoDB memory heap is disabled
140216 23:45:05 InnoDB: Mutexes and rw_locks use GCC atomic builtins
140216 23:45:05 InnoDB: Compressed tables use zlib 1.2.6
140216 23:45:05 InnoDB: Using Linux native AIO
140216 23:45:05 InnoDB: Initializing buffer pool, size = 48.0G
140216 23:47:22 InnoDB: Completed initialization of buffer pool
140216 23:47:34 InnoDB: highest supported file format is Barracuda.
140216 23:49:55  InnoDB: Waiting for the background threads to start
140216 23:49:56 Percona XtraDB (http://www.percona.com) 5.5.33-29.3 started; log sequence number 2199512596935
140216 23:49:58 [Note] Event Scheduler: Loaded 0 events
140216 23:49:58 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.5.33-log'  socket: '/var/run/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)
140217 18:02:23 [Note] /usr/libexec/mysqld: Normal shutdown

140217 18:02:23 [Note] Event Scheduler: Purging the queue. 0 events
140217 18:02:24  InnoDB: Starting shutdown...
140217 18:02:29  InnoDB: Shutdown completed; log sequence number 2199526199421
140217 18:02:29 [Note] /usr/libexec/mysqld: Shutdown complete

140217 18:02:29 mysqld_safe mysqld from pid file /var/run/mysql/mysql.pid ended
140217 18:04:52 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140217 18:04:52 [Warning] Using unique option prefix myisam_recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
140217 18:04:52 [Note] Plugin 'FEDERATED' is disabled.
140217 18:04:52 InnoDB: The InnoDB memory heap is disabled
140217 18:04:52 InnoDB: Mutexes and rw_locks use GCC atomic builtins
140217 18:04:52 InnoDB: Compressed tables use zlib 1.2.6
140217 18:04:52 InnoDB: Using Linux native AIO
140217 18:04:52 InnoDB: Initializing buffer pool, size = 48.0G
140217 18:04:55 InnoDB: Completed initialization of buffer pool
140217 18:04:55 InnoDB: highest supported file format is Barracuda.
140217 18:04:58  InnoDB: Waiting for the background threads to start
140217 18:04:59 Percona XtraDB (http://www.percona.com) 5.5.33-29.3 started; log sequence number 2199526199421
140217 18:05:00 [Note] Event Scheduler: Loaded 0 events
140217 18:05:00 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.5.33-log'  socket: '/var/run/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

140218 18:53:26 [Note] /usr/libexec/mysqld: Normal shutdown

140218 18:53:26 [Note] Event Scheduler: Purging the queue. 0 events
140218 18:53:28 [Warning] /usr/libexec/mysqld: Forcing close of thread 456  user: 'root'

140218 18:53:28 [Warning] /usr/libexec/mysqld: Forcing close of thread 426  user: 'root'

140218 18:53:28 [Warning] /usr/libexec/mysqld: Forcing close of thread 404  user: 'openkm'

140218 18:53:28 [Warning] /usr/libexec/mysqld: Forcing close of thread 399  user: 'openkm'

140218 18:53:28 [Warning] /usr/libexec/mysqld: Forcing close of thread 397  user: 'openkm'

140218 18:53:28 [Warning] /usr/libexec/mysqld: Forcing close of thread 396  user: 'openkm'

140218 18:53:28 [Warning] /usr/libexec/mysqld: Forcing close of thread 278  user: 'root'

140218 18:53:28 [Warning] /usr/libexec/mysqld: Forcing close of thread 212  user: 'root'

140218 18:53:28 [Warning] /usr/libexec/mysqld: Forcing close of thread 62  user: 'root'

140218 18:53:28 [Warning] /usr/libexec/mysqld: Forcing close of thread 58  user: 'root'

140218 18:53:28 [Warning] /usr/libexec/mysqld: Forcing close of thread 46  user: 'root'

140218 18:53:28 [Warning] /usr/libexec/mysqld: Forcing close of thread 45  user: 'root'

140218 18:53:28  InnoDB: Starting shutdown...
140218 18:53:33  InnoDB: Shutdown completed; log sequence number 2212452527337
140218 18:53:33 [Note] /usr/libexec/mysqld: Shutdown complete

140218 18:53:33 mysqld_safe mysqld from pid file /var/run/mysql/mysql.pid ended
140218 19:00:50 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140218 19:00:50 [Warning] Using unique option prefix myisam_recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
140218 19:00:50 [Note] Plugin 'FEDERATED' is disabled.
140218 19:00:50 InnoDB: The InnoDB memory heap is disabled
140218 19:00:50 InnoDB: Mutexes and rw_locks use GCC atomic builtins
140218 19:00:50 InnoDB: Compressed tables use zlib 1.2.6
140218 19:00:50 InnoDB: Using Linux native AIO
140218 19:00:51 InnoDB: Initializing buffer pool, size = 48.0G
140218 19:00:54 InnoDB: Completed initialization of buffer pool
140218 19:00:54 InnoDB: highest supported file format is Barracuda.
140218 19:00:57  InnoDB: Waiting for the background threads to start
140218 19:00:58 Percona XtraDB (http://www.percona.com) 5.5.33-29.3 started; log sequence number 2212452527337
140218 19:00:58 [Note] Event Scheduler: Loaded 0 events
140218 19:00:58 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.5.33-log'  socket: '/var/run/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)
InnoDB: Warning: a long semaphore wait:
--Thread 139829732579072 has waited at row0row.c line 797 for 241.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7f2052a69550 '&block->lock'
a writer (thread id 139829732579072) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 0, lock_word: ffffffffffffffff
Last time read locked in file buf0flu.c line 1340
Last time write locked in file /tmp/Percona-Server-5.5.33/storage/innobase/row/row0row.c line 797
InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
InnoDB: Pending preads 0, pwrites 0

=====================================
140219 15:42:19 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 51 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 60463 1_second, 60231 sleeps, 5727 10_second, 10889 background, 10889 flush
srv_master_thread log flush and writes: 73393
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 76432, signal count 214396
--Thread 139829732579072 has waited at row0row.c line 797 for 247.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7f2052a69550 '&block->lock'
a writer (thread id 139829732579072) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 0, lock_word: ffffffffffffffff
Last time read locked in file buf0flu.c line 1340
Last time write locked in file /tmp/Percona-Server-5.5.33/storage/innobase/row/row0row.c line 797
--Thread 139774700103424 has waited at btr0cur.c line 568 for 117.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7f1fcc13d438 '&new_index->lock'
a writer (thread id 139774700103424) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 0, lock_word: ffffffffffffffff
Last time read locked in file btr0cur.c line 575
Last time write locked in file /tmp/Percona-Server-5.5.33/storage/innobase/btr/btr0cur.c line 568
Mutex spin waits 197900, rounds 663542, OS waits 14888
RW-shared spins 87134, rounds 1651290, OS waits 36323
RW-excl spins 50233, rounds 1225817, OS waits 22675
Spin rounds per wait: 3.35 mutex, 18.95 RW-shared, 24.40 RW-excl
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
[...]
I/O thread 63 state: waiting for completed aio requests (write thread)
I/O thread 64 state: waiting for completed aio requests (write thread)
I/O thread 65 state: waiting for completed aio requests (write thread)
Pending normal aio reads: 0 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] ,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 1
695551 OS file reads, 3125390 OS file writes, 129948 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 71463, seg size 71465, 3932 merges
merged operations:
 insert 36760, delete mark 205706, delete 16130
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 101998957, node heap has 301772 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 2248255209488
Log flushed up to   2248254962791
Last checkpoint at  2247911015923
Max checkpoint age    1303175763
Checkpoint age target 1262451521
Modified age          342439302
Checkpoint age        344193565
0 pending log writes, 0 pending chkp writes
55148 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 52923727872; in additional pool allocated 0
Total memory allocated by read views 2056
Internal hash tables (constant factor + variable factor)
    Adaptive hash index 5760228320 	(815991656 + 4944236664)
    Page hash           51000296 (buffer pool 0 only)
    Dictionary cache    205373173 	(204000176 + 1372997)
    File system         83536 	(82672 + 864)
    Lock system         127508752 	(127499608 + 9144)
    Recovery system     0 	(0 + 0)
Dictionary memory allocated 1372997
Buffer pool size        3145727
Buffer pool size, bytes 51539591168
Free buffers            298653
Database pages          2545302
Old database pages      939554
Modified db pages       31852
Pending reads 0
Pending writes: LRU 0, flush list 76, single page 0
Pages made young 7196, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 695540, created 1849762, written 3005035
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 2545302, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
1 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
1 transactions active inside InnoDB
1 out of 1000 descriptors used
---OLDEST VIEW---
Normal read view
Read view low limit trx n:o 554309B
Read view up limit trx id 5542F95
Read view low limit trx id 554309B
Read view individually stored trx ids:
Read view trx id 5542F95
-----------------
Main thread process no. 3838, id 139774708496128, state: flushing log
Number of rows inserted 140144519, updated 3488615, deleted 4454, read 611833844
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
------------
TRANSACTIONS
------------
Trx id counter 55431F1
Purge done for trx's n:o < 55427F8 undo n:o < 2
History list length 1498
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 5542EAB, not started
MySQL thread id 328, OS thread handle 0x7f2ca580e700, query id 165744796 localhost root
---TRANSACTION 55428DB, not started
MySQL thread id 333, OS thread handle 0x7f2ca20a4700, query id 165742661 localhost root
---TRANSACTION 55428BE, not started
MySQL thread id 332, OS thread handle 0x7f2ca2106700, query id 165742374 localhost root
---TRANSACTION 55431C0, not started
MySQL thread id 331, OS thread handle 0x7f2ca2199700, query id 165747248 localhost root
---TRANSACTION 552DFCA, not started
[...]
---TRANSACTION 5539CB4, not started
MySQL thread id 66, OS thread handle 0x7f2ca276e700, query id 165694346 localhost root
---TRANSACTION 5536D4B, not started
MySQL thread id 57, OS thread handle 0x7f2ca57ac700, query id 165742934 localhost root
---TRANSACTION 554309B, not started
MySQL thread id 50, OS thread handle 0x7f2ca5719700, query id 165746331 localhost root
---TRANSACTION 553D229, not started
MySQL thread id 51, OS thread handle 0x7f2ca279f700, query id 165709897 localhost root
----------------------------
END OF INNODB MONITOR OUTPUT
============================

=====================================
140219 15:44:42 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 16 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 60463 1_second, 60231 sleeps, 5727 10_second, 10889 background, 10889 flush
srv_master_thread log flush and writes: 73393
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 76434, signal count 214398
--Thread 139829732579072 has waited at row0row.c line 797 for 390.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7f2052a69550 '&block->lock'
a writer (thread id 139829732579072) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 0, lock_word: ffffffffffffffff
Last time read locked in file buf0flu.c line 1340
Last time write locked in file /tmp/Percona-Server-5.5.33/storage/innobase/row/row0row.c line 797
--Thread 139774700103424 has waited at btr0cur.c line 568 for 260.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7f1fcc13d438 '&new_index->lock'
InnoDB: ###### Diagnostic info printed to the standard error stream
a writer (thread id 139774700103424) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 0, lock_word: ffffffffffffffff
Last time read locked in file btr0cur.c line 575
Last time write locked in file /tmp/Percona-Server-5.5.33/storage/innobase/btr/btr0cur.c line 568
Mutex spin waits 197901, rounds 663602, OS waits 14890
RW-shared spins 87134, rounds 1651290, OS waits 36323
RW-excl spins 50233, rounds 1225817, OS waits 22675
Spin rounds per wait: 3.35 mutex, 18.95 RW-shared, 24.40 RW-excl
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
[...]
I/O thread 63 state: waiting for completed aio requests (write thread)
I/O thread 64 state: waiting for completed aio requests (write thread)
I/O thread 65 state: waiting for completed aio requests (write thread)
Pending normal aio reads: 0 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] ,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 1
695551 OS file reads, 3125390 OS file writes, 129948 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 71463, seg size 71465, 3932 merges
merged operations:
 insert 36760, delete mark 205706, delete 16130
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 101998957, node heap has 301772 buffer(s)
2084.06 hash searches/s, 543.78 non-hash searches/s
---
LOG
---
Log sequence number 2248255368828
Log flushed up to   2248254962791
Last checkpoint at  2247911015923
Max checkpoint age    1303175763
Checkpoint age target 1262451521
Modified age          342598642
Checkpoint age        344352905
0 pending log writes, 0 pending chkp writes
55148 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 52923727872; in additional pool allocated 0
Total memory allocated by read views 1952
Internal hash tables (constant factor + variable factor)
    Adaptive hash index 5760228320 	(815991656 + 4944236664)
    Page hash           51000296 (buffer pool 0 only)
    Dictionary cache    205373173 	(204000176 + 1372997)
    File system         83536 	(82672 + 864)
    Lock system         127508376 	(127499608 + 8768)
    Recovery system     0 	(0 + 0)
Dictionary memory allocated 1372997
Buffer pool size        3145727
Buffer pool size, bytes 51539591168
Free buffers            298652
Database pages          2545303
Old database pages      939554
Modified db pages       31896
Pending reads 0
Pending writes: LRU 0, flush list 76, single page 0
Pages made young 7196, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 695540, created 1849763, written 3005035
0.00 reads/s, 0.01 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 2545303, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
1 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
1 transactions active inside InnoDB
1 out of 1000 descriptors used
---OLDEST VIEW---
Normal read view
Read view low limit trx n:o 554309B
Read view up limit trx id 5542F95
Read view low limit trx id 554309B
Read view individually stored trx ids:
Read view trx id 5542F95
-----------------
Main thread process no. 3838, id 139774708496128, state: flushing log
Number of rows inserted 140144555, updated 3488846, deleted 4482, read 611921392
2.25 inserts/s, 14.44 updates/s, 1.75 deletes/s, 5471.41 reads/s
------------
TRANSACTIONS
------------
Trx id counter 55434E6
Purge done for trx's n:o < 55427F8 undo n:o < 2
History list length 1729
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 5542EAB, not started
MySQL thread id 328, OS thread handle 0x7f2ca580e700, query id 165744796 localhost root
---TRANSACTION 55428DB, not started
MySQL thread id 333, OS thread handle 0x7f2ca20a4700, query id 165742661 localhost root
---TRANSACTION 55428BE, not started
MySQL thread id 332, OS thread handle 0x7f2ca2106700, query id 165742374 localhost root
---TRANSACTION 552DFCA, not started
MySQL thread id 304, OS thread handle 0x7f2ca57dd700, query id 165640832 localhost root
---TRANSACTION 5398997, not started
[...]
MySQL thread id 66, OS thread handle 0x7f2ca276e700, query id 165694346 localhost root
---TRANSACTION 5536D4B, not started
MySQL thread id 57, OS thread handle 0x7f2ca57ac700, query id 165749684 localhost root
---TRANSACTION 554309B, not started
MySQL thread id 50, OS thread handle 0x7f2ca5719700, query id 165746331 localhost root
---TRANSACTION 553D229, not started
MySQL thread id 51, OS thread handle 0x7f2ca279f700, query id 165709897 localhost root
---TRANSACTION 5542F95, ACTIVE 390 sec updating or deleting, thread declared inside InnoDB 479
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1248, 22 row lock(s), undo log entries 11
----------------------------
END OF INNODB MONITOR OUTPUT
============================
InnoDB: Warning: a long semaphore wait:
--Thread 139829732579072 has waited at row0row.c line 797 for 391.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7f2052a69550 '&block->lock'
a writer (thread id 139829732579072) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 0, lock_word: ffffffffffffffff
Last time read locked in file buf0flu.c line 1340
Last time write locked in file /tmp/Percona-Server-5.5.33/storage/innobase/row/row0row.c line 797
InnoDB: Warning: a long semaphore wait:
--Thread 139774700103424 has waited at btr0cur.c line 568 for 261.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7f1fcc13d438 '&new_index->lock'
a writer (thread id 139774700103424) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 0, lock_word: ffffffffffffffff
Last time read locked in file btr0cur.c line 575
Last time write locked in file /tmp/Percona-Server-5.5.33/storage/innobase/btr/btr0cur.c line 568
InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
InnoDB: Pending preads 0, pwrites 0

=====================================
140219 15:47:25 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 4 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 60463 1_second, 60231 sleeps, 5727 10_second, 10889 background, 10889 flush
srv_master_thread log flush and writes: 73393
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 76450, signal count 214482
Mutex spin waits 197923, rounds 664046, OS waits 14904
RW-shared spins 87143, rounds 1651352, OS waits 36324
RW-excl spins 50238, rounds 1225994, OS waits 22676
Spin rounds per wait: 3.36 mutex, 18.95 RW-shared, 24.40 RW-excl
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
[...]
I/O thread 8 state: waiting for completed aio requests (read thread)
I/O thread 9 state: waiting for completed aio requests (read thread)
InnoDB: ###### Diagnostic info printed to the standard error stream
I/O thread 10 state: waiting for completed aio requests (read thread)
I/O thread 11 state: waiting for completed aio requests (read thread)
I/O thread 12 state: waiting for completed aio requests (read thread)
I/O thread 13 state: waiting for completed aio requests (read thread)
I/O thread 14 state: waiting for completed aio requests (read thread)
I/O thread 15 state: waiting for completed aio requests (read thread)
[...]
Pending normal aio reads: 0 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] ,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 1; buffer pool: 1
695551 OS file reads, 3125466 OS file writes, 129949 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.47 writes/s, 0.01 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 71463, seg size 71465, 3932 merges
merged operations:
 insert 36760, delete mark 205706, delete 16130
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 101998957, node heap has 301773 buffer(s)
257087.73 hash searches/s, 58088.98 non-hash searches/s
---
LOG
---
Log sequence number 2248271329055
Log flushed up to   2248254962791
Last checkpoint at  2247911015923
Max checkpoint age    1303175763
Checkpoint age target 1262451521
Modified age          358075228
Checkpoint age        360313132
1 pending log writes, 0 pending chkp writes
55149 log i/o's done, 0.01 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 52923727872; in additional pool allocated 0
Total memory allocated by read views 1952
Internal hash tables (constant factor + variable factor)
    Adaptive hash index 5760244704 	(815991656 + 4944253048)
    Page hash           51000296 (buffer pool 0 only)
    Dictionary cache    205373173 	(204000176 + 1372997)
    File system         83536 	(82672 + 864)
    Lock system         127507504 	(127499608 + 7896)
    Recovery system     0 	(0 + 0)
Dictionary memory allocated 1372997
Buffer pool size        3145727
Buffer pool size, bytes 51539591168
Free buffers            298639
Database pages          2545315
Old database pages      939559
Modified db pages       33328
Pending reads 0
Pending writes: LRU 0, flush list 1, single page 0
Pages made young 7209, not young 0
0.08 youngs/s, 0.00 non-youngs/s
Pages read 695540, created 1849775, written 3005110
0.00 reads/s, 0.07 creates/s, 0.46 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 2545315, unzip_LRU len: 0
I/O sum[0]:cur[75], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
0 transactions active inside InnoDB
0 out of 1000 descriptors used
---OLDEST VIEW---
Normal read view
Read view low limit trx n:o 55434EF
Read view up limit trx id 55434EF
Read view low limit trx id 55434EF
Read view individually stored trx ids:
-----------------
Main thread process no. 3838, id 139774708496128, state: flushing log
Number of rows inserted 140144555, updated 3507614, deleted 4482, read 612997358
0.00 inserts/s, 4690.83 updates/s, 0.00 deletes/s, 268924.27 reads/s
------------
TRANSACTIONS
------------
Trx id counter 5548CAC
Purge done for trx's n:o < 55434E9 undo n:o < 0
History list length 2599
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 5542EAB, not started
MySQL thread id 328, OS thread handle 0x7f2ca580e700, query id 165744796 localhost root
---TRANSACTION 55428DB, not started
MySQL thread id 333, OS thread handle 0x7f2ca20a4700, query id 165742661 localhost root
---TRANSACTION 55428BE, not started
MySQL thread id 332, OS thread handle 0x7f2ca2106700, query id 165742374 localhost root
---TRANSACTION 5548CAA, not started
...
----------------------------
END OF INNODB MONITOR OUTPUT
============================
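
For anyone wading through a log like the one above, the "long semaphore wait" warnings are the lines that matter most, and they are easy to lose among the monitor dumps. Below is a minimal sketch that pulls them out of an error log. It assumes the two-line warning format seen in this log (Percona Server 5.5.33); the function name `find_long_waits` and the `min_seconds` parameter are made up for illustration and are not part of any MySQL tool.

```python
import re

# Matches the second line of the warning InnoDB prints in this log, e.g.
#   --Thread 139829732579072 has waited at row0row.c line 797 for 391.00 seconds the semaphore:
WAIT_RE = re.compile(
    r"--Thread (?P<thread>\d+) has waited at (?P<file>\S+) line (?P<line>\d+) "
    r"for (?P<seconds>\d+(?:\.\d+)?) seconds the semaphore:"
)


def find_long_waits(log_text, min_seconds=0.0):
    """Return (thread_id, source_file, line_no, seconds) for each wait entry.

    Hypothetical helper: only entries that waited at least `min_seconds`
    are returned, in the order they appear in the log.
    """
    waits = []
    for m in WAIT_RE.finditer(log_text):
        seconds = float(m.group("seconds"))
        if seconds >= min_seconds:
            waits.append(
                (int(m.group("thread")), m.group("file"), int(m.group("line")), seconds)
            )
    return waits


if __name__ == "__main__":
    # Sample lines copied from the log in this thread.
    sample = """\
InnoDB: Warning: a long semaphore wait:
--Thread 139829732579072 has waited at row0row.c line 797 for 391.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7f2052a69550 '&block->lock'
InnoDB: Warning: a long semaphore wait:
--Thread 139774700103424 has waited at btr0cur.c line 568 for 261.00 seconds the semaphore:
"""
    for thread, src, line, secs in find_long_waits(sample):
        print(f"thread {thread} waited {secs:.0f}s at {src}:{line}")
```

Run against the full error log (read it with `open(path).read()`), this gives a quick timeline of which threads stalled and for how long, which can then be lined up against the hung-task entries in dmesg.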
 
Old 02-19-2014, 07:57 PM   #14
Emerson
LQ Sage
 
Registered: Nov 2004
Location: Saint Amant, Acadiana
Distribution: Gentoo ~amd64
Posts: 7,675

Rep: Reputation: Disabled
Looking at these outputs I'd say your drives are OK but your controller is bust.

Edit: It may be the power supply, too.

Last edited by Emerson; 02-19-2014 at 08:00 PM.
 
Old 02-19-2014, 07:59 PM   #15
lpallard
Senior Member
 
Registered: Nov 2008
Posts: 1,050

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Emerson View Post
Looking at these outputs I'd say your drives are OK but your controller is bust.
Would that mean data corruption for sure? Or could I be lucky, with some mechanism having "protected" me?
 
  


All times are GMT -5.
