Ubuntu 14.04 server crash - Detected aborted journal
My production Ubuntu 14.04 server, running on VMware ESXi with its storage on a LUN on a NAS, just crashed. The likely cause is a power outage on the NAS, which led to I/O errors on the storage.
I have not succeeded in getting it up and running yet, and I could use any help available.
I took a snapshot, including memory, before doing anything.
Below is the output of dmesg, which clearly shows the start of the problem. It is a bit too technical for me, but hopefully some of you 'speak this crash language' better.
====== dmesg ===
[2912120.223885] INFO: task khugepaged:69 blocked for more than 120 seconds.
[2912120.224745] Not tainted 4.2.0-30-generic #36~14.04.1-Ubuntu
[2912120.225498] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2912120.226267] khugepaged D ffff88023fd16640 0 69 2 0x00000000
[2912120.226271] ffff880235d8fc50 0000000000000046 ffff880236273e80 ffff880235c5b200
[2912120.226274] ffff880235c5b268 ffff880235d90000 ffff8802333c0ba8 ffff8802333c0bc0
[2912120.226275] ffffffff81fbffc4 0000000000000000 ffff880235d8fc70 ffffffff817b8ac7
[2912120.226277] Call Trace:
[2912120.226286] [<ffffffff817b8ac7>] schedule+0x37/0x80
[2912120.226291] [<ffffffff8101dd39>] ? sched_clock+0x9/0x10
[2912120.226293] [<ffffffff817bb1f0>] rwsem_down_read_failed+0xe0/0x120
[2912120.226296] [<ffffffff813b37f4>] call_rwsem_down_read_failed+0x14/0x30
[2912120.226298] [<ffffffff817ba874>] ? down_read+0x24/0x30
[2912120.226302] [<ffffffff811d580d>] khugepaged_scan_mm_slot+0x6d/0xf30
[2912120.226303] [<ffffffff817bb405>] ? schedule_timeout+0x165/0x2a0
[2912120.226307] [<ffffffff810de9d0>] ? trace_event_raw_event_tick_stop+0xc0/0xc0
[2912120.226309] [<ffffffff811d67ed>] khugepaged+0x11d/0x440
[2912120.226311] [<ffffffff817b8401>] ? __schedule+0x261/0x8f0
[2912120.226314] [<ffffffff810b7670>] ? prepare_to_wait_event+0xf0/0xf0
[2912120.226316] [<ffffffff811d66d0>] ? khugepaged_scan_mm_slot+0xf30/0xf30
[2912120.226319] [<ffffffff81095499>] kthread+0xc9/0xe0
[2912120.226321] [<ffffffff810953d0>] ? kthread_create_on_node+0x1c0/0x1c0
[2912120.226322] [<ffffffff817bc75f>] ret_from_fork+0x3f/0x70
[2912120.226324] [<ffffffff810953d0>] ? kthread_create_on_node+0x1c0/0x1c0
[2912120.226330] INFO: task jbd2/dm-0-8:226 blocked for more than 120 seconds.
[2912120.227107] Not tainted 4.2.0-30-generic #36~14.04.1-Ubuntu
[2912120.227910] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2912120.228701] jbd2/dm-0-8 D 0000000000000000 0 226 2 0x00000000
[2912120.228704] ffff8802325fba78 0000000000000046 ffff880236369900 ffff880036614b00
[2912120.228705] ffff88023434f980 ffff8802325fc000 0000000000000000 7fffffffffffffff
[2912120.228707] ffff88023ffdc9a0 ffffffff817b9240 ffff8802325fba98 ffffffff817b8ac7
[2912120.228708] Call Trace:
[2912120.228712] [<ffffffff817b9240>] ? bit_wait+0x60/0x60
[2912120.228714] [<ffffffff817b8ac7>] schedule+0x37/0x80
[2912120.228715] [<ffffffff817bb4a1>] schedule_timeout+0x201/0x2a0
[2912120.228718] [<ffffffff8101d749>] ? read_tsc+0x9/0x10
[2912120.228721] [<ffffffff810e6e5e>] ? ktime_get+0x3e/0xa0
[2912120.228722] [<ffffffff8101d749>] ? read_tsc+0x9/0x10
[2912120.228724] [<ffffffff810e6e5e>] ? ktime_get+0x3e/0xa0
[2912120.228726] [<ffffffff817b9240>] ? bit_wait+0x60/0x60
[2912120.228729] [<ffffffff817b8136>] io_schedule_timeout+0xa6/0x110
[2912120.228731] [<ffffffff817b925f>] bit_wait_io+0x1f/0x70
[2912120.228733] [<ffffffff817b8e90>] __wait_on_bit+0x60/0x90
[2912120.228736] [<ffffffff81375eae>] ? queue_unplugged+0x2e/0xa0
[2912120.228738] [<ffffffff817b9240>] ? bit_wait+0x60/0x60
[2912120.228740] [<ffffffff817b8f32>] out_of_line_wait_on_bit+0x72/0x80
[2912120.228741] [<ffffffff810b76b0>] ? autoremove_wake_function+0x40/0x40
[2912120.228746] [<ffffffff8121f046>] __wait_on_buffer+0x36/0x40
[2912120.228748] [<ffffffff812be1b7>] jbd2_journal_commit_transaction+0xd37/0x1810
[2912120.228751] [<ffffffff810dee1f>] ? try_to_del_timer_sync+0x4f/0x70
[2912120.228753] [<ffffffff812c23db>] kjournald2+0xbb/0x230
[2912120.228755] [<ffffffff810b7670>] ? prepare_to_wait_event+0xf0/0xf0
[2912120.228756] [<ffffffff812c2320>] ? commit_timeout+0x10/0x10
[2912120.228758] [<ffffffff81095499>] kthread+0xc9/0xe0
[2912120.228760] [<ffffffff810953d0>] ? kthread_create_on_node+0x1c0/0x1c0
[2912120.228762] [<ffffffff817bc75f>] ret_from_fork+0x3f/0x70
[2912120.228763] [<ffffffff810953d0>] ? kthread_create_on_node+0x1c0/0x1c0
[2912120.228770] INFO: task postgres:4563 blocked for more than 120 seconds.
[2912120.229545] Not tainted 4.2.0-30-generic #36~14.04.1-Ubuntu
[2912120.230312] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2912120.231078] postgres D ffff88023fdd6640 0 4563 4561 0x00000000
[2912120.231080] ffff880234eb7b78 0000000000000086 ffff880236276400 ffff880232416400
[2912120.231082] 0000000000000000 ffff880234eb8000 0000000000000000 7fffffffffffffff
[2912120.231083] ffff88023ffcd0e8 ffffffff817b9240 ffff880234eb7b98 ffffffff817b8ac7
[2912120.231085] Call Trace:
[2912120.231087] [<ffffffff817b9240>] ? bit_wait+0x60/0x60
[2912120.231089] [<ffffffff817b8ac7>] schedule+0x37/0x80
[2912120.231091] [<ffffffff817bb4a1>] schedule_timeout+0x201/0x2a0
[2912120.231093] [<ffffffff8101d749>] ? read_tsc+0x9/0x10
[2912120.231095] [<ffffffff810e6e5e>] ? ktime_get+0x3e/0xa0
[2912120.231097] [<ffffffff817b9240>] ? bit_wait+0x60/0x60
[2912120.231099] [<ffffffff817b8136>] io_schedule_timeout+0xa6/0x110
[2912120.231102] [<ffffffff817b925f>] bit_wait_io+0x1f/0x70
[2912120.231104] [<ffffffff817b8e90>] __wait_on_bit+0x60/0x90
[2912120.231107] [<ffffffff81176708>] ? find_get_pages_tag+0xc8/0x170
[2912120.231110] [<ffffffff81175980>] wait_on_page_bit+0xc0/0xd0
[2912120.231111] [<ffffffff810b76b0>] ? autoremove_wake_function+0x40/0x40
[2912120.231113] [<ffffffff81175a80>] filemap_fdatawait_range+0xf0/0x180
[2912120.231115] [<ffffffff81177695>] ? __filemap_fdatawrite_range+0xb5/0xf0
[2912120.231118] [<ffffffff811777df>] filemap_write_and_wait_range+0x3f/0x70
[2912120.231121] [<ffffffff812698b1>] ext4_sync_file+0xb1/0x2e0
[2912120.231124] [<ffffffff8121bd5d>] vfs_fsync_range+0x3d/0xb0
[2912120.231125] [<ffffffff8121be2d>] do_fsync+0x3d/0x70
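
To make the trace easier to skim, the blocked tasks can be listed with a few lines of Python. This is only a minimal sketch: the function name `hung_tasks` and the inline `sample` are assumptions made for illustration, and the sample holds just the three "blocked for more than" lines from the output above.

```python
import re

# Matches the kernel's hung-task report line, e.g.
# "INFO: task jbd2/dm-0-8:226 blocked for more than 120 seconds."
# The task name may itself contain '/' or '-', so the pid is taken as the
# digits after the last ':' before " blocked".
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def hung_tasks(dmesg_text):
    """Return (task name, pid, timeout in seconds) for each hung-task report."""
    return [
        (m.group("name"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK.finditer(dmesg_text)
    ]


# Illustrative input: the three report lines copied from the dmesg output.
sample = """\
[2912120.223885] INFO: task khugepaged:69 blocked for more than 120 seconds.
[2912120.226330] INFO: task jbd2/dm-0-8:226 blocked for more than 120 seconds.
[2912120.228770] INFO: task postgres:4563 blocked for more than 120 seconds.
"""

for name, pid, secs in hung_tasks(sample):
    print(name, pid, secs)
```

All three tasks (khugepaged, jbd2/dm-0-8 and postgres) were reported within the same second, and the jbd2 and postgres traces both sit in io_schedule_timeout, which seems to fit the storage becoming unavailable.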