Hi,
We are experiencing repeated crashes on one of our servers running FC12 (2.6.31.12-174.2.22.fc12.x86_64 #1 SMP). We had similar issues with kernels 2.6.31.12-174.2.3.fc12.x86_64 and 2.6.31.5-127.fc12.x86_64, but the frequency has now increased from once every 10 days or so to about once or twice a day. The crashes do not seem to depend much on CPU/memory load or on heavy disk I/O.
We enabled serial logging and found that the server was crashing with messages of the form "Kernel panic - not syncing: xfs_fs_destroy_inode: cannot reclaim <some hex address>". A quick search suggests that other people have experienced this kind of problem with kernels 2.6.29 and 2.6.30, and some patches were provided, but these did not appear to fix the problem.
The server consists of a RAID 1 array of 2x500GB disks, which contains the OS, and two RAID 6 arrays of 12x1TB disks glued together with LVM, which contain our data. The output of xfs_info for the data partition is:
Code:
meta-data=/dev/mapper/main-userspace isize=256    agcount=16, agsize=268435455 blks
         =                           sectsz=512   attr=2
data     =                           bsize=4096   blocks=4294967280, imaxpct=5
         =                           sunit=0      swidth=0 blks
naming   =version 2                  bsize=4096   ascii-ci=0
log      =internal                   bsize=4096   blocks=32768, version=2
         =                           sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                       extsz=4096   blocks=0, rtextents=0
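In case it helps anyone reading the numbers, here is a rough sketch of the arithmetic behind the xfs_info output above. The values are copied from that output; the variable and function names are just ours for illustration, and the result says nothing about the cause of the panic.

```python
# Values copied from the xfs_info output above.
BLOCK_SIZE = 4096          # data bsize, in bytes
DATA_BLOCKS = 4294967280   # data blocks
AG_COUNT = 16              # agcount
AG_SIZE = 268435455        # agsize, in blocks
LOG_BLOCKS = 32768         # internal log blocks


def to_tib(n_bytes):
    """Convert a byte count to tebibytes (2**40 bytes)."""
    return n_bytes / 2**40


# Total size of the data section and of the internal log.
fs_bytes = DATA_BLOCKS * BLOCK_SIZE
log_bytes = LOG_BLOCKS * BLOCK_SIZE

# Check that the allocation groups account for every data block.
ags_cover_fs = AG_COUNT * AG_SIZE == DATA_BLOCKS

print(f"filesystem size: {fs_bytes} bytes (~{to_tib(fs_bytes):.2f} TiB)")
print(f"internal log: {log_bytes // 2**20} MiB")
print(f"allocation groups account for all data blocks: {ags_cover_fs}")
```

So the data filesystem is just under 16 TiB, split into 16 allocation groups of just under 1 TiB each, with a 128 MiB internal log.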
It would be fantastic if anybody could suggest anything we might try in order to alleviate this problem. Please let us know if you require any further information.
Thanks.
Best wishes,
Alastair.