Hi,
We are a company running many Ubuntu Server 14.04 installations at our customers' sites.
The servers are used as storage servers for video surveillance, so their main task is to run a Samba (smbd) share.
Each server also has an mdadm software RAID configured (4 disks, RAID 5) with an ext4 filesystem.
Some of the servers are crashing randomly. After a crash the system becomes totally unresponsive, and only a hard reset brings the server back to normal. The screen freezes, the keyboard does not respond, and all network services stop working.
We migrated one of our customers' servers to Ubuntu Server 16.04, with no luck: the same error still occurs.
The Ubuntu 16.04 installation stops working after a single kernel oops:
Code:
Aug 2 10:03:16 kernel: [86744.272395] BUG: unable to handle kernel paging request at 00000000746500f0
Aug 2 10:03:16 kernel: [86744.272470] IP: [<ffffffff81260648>] __posix_lock_file+0x108/0x710
Aug 2 10:03:16 kernel: [86744.272530] PGD 0
Aug 2 10:03:16 kernel: [86744.272551] Oops: 0000 [#1] SMP
The Ubuntu 14.04 installation shows the same kernel oops (with a different trace), followed by this crash:
Code:
Aug 3 16:35:50 kernel: [29663.689228] WARNING: CPU: 1 PID: 28296 at /build/linux-glt4zk/linux-3.13.0/kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xd0()
Aug 3 16:35:50 kernel: [29663.691465] Watchdog detected hard LOCKUP on cpu 1
The kernel logs are attached.
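In case it helps anyone going through the logs, the crash markers could be pulled out of each file with something like the script below. This is only a minimal sketch: the marker patterns are taken from the messages quoted above, and the names `MARKERS`, `IP_LINE` and `summarize` are our own hypothetical helpers, not part of any kernel or Ubuntu tool.

```python
import re
from collections import Counter

# Hypothetical marker list, based only on the messages quoted in this post.
MARKERS = {
    "bug": re.compile(r"BUG: unable to handle kernel paging request"),
    "oops": re.compile(r"Oops: [0-9a-f]{4} \[#\d+\]"),
    "hard_lockup": re.compile(r"Watchdog detected hard LOCKUP on cpu \d+"),
}

# Matches the faulting function in lines such as
# "IP: [<ffffffff81260648>] __posix_lock_file+0x108/0x710".
IP_LINE = re.compile(r"IP: \[<[0-9a-f]+>\] (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize(lines):
    """Count crash markers and faulting function names in kernel log lines."""
    counts = Counter()
    functions = Counter()
    for line in lines:
        for name, pattern in MARKERS.items():
            if pattern.search(line):
                counts[name] += 1
        match = IP_LINE.search(line)
        if match:
            functions[match.group(1)] += 1
    return counts, functions
```

Calling `summarize(open(path, errors="replace"))` on each server's kernel log (the path depends on the installation) would show whether the faulting function is always the same one or differs between machines.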
This is the hardware setup:
MB: Supermicro X10SBA (onboard Intel Celeron J1900)
RAM: Kingston 4GB KVR16LS11/4 (2x)
OS drive: 30GB mSATA Kingston
HDDs: WD Enterprise (on 14.04) x4 / WD Purple (on 16.04) x4
The drives are connected via a Mini-SAS backplane.
Any idea how to find out what the problem is?