We have recently been experiencing a problem with a Linux server which is used as a NetBackup media server and disk cache.
The server is a Dell R710 with an attached MD3000 disk array, running SuSE 10.2 x64. The array is divided into two LUNs, the larger of which is further partitioned into two volumes. Each of the three volumes on the array is formatted with reiserfs.
During the backup runs over the last two days, the partition on the smaller LUN and the larger partition on the larger LUN have become unresponsive. No writes are being performed to these filesystems, and backup processes attempting to write to them appear to be hung. When running iostat I can see that there appear to be up to 900 read transactions per second on the disks but no write transactions. The smaller partition on the larger LUN, however, is unaffected. It is also not possible to get a directory listing on the affected filesystems while this is happening. Attempting to kill any backup "bptm" processes with kill -9 also fails if they are attempting to write to these areas.
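A process that ignores kill -9 is usually stuck in uninterruptible sleep (state D) inside the kernel, so it may help to list which processes are in that state and what they are waiting on. Below is a rough sketch of how that could be checked, not something taken from the server itself. It assumes Python 3 is available and that /proc/<pid>/stat and /proc/<pid>/wchan are readable; the function names parse_stat and blocked_processes are made up for illustration.

```python
import os


def parse_stat(text):
    """Split one /proc/<pid>/stat record into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' instead of splitting on whitespace.
    head, _, tail = text.rpartition(")")
    pid_str, _, comm = head.partition("(")
    return int(pid_str), comm, tail.split()[0]


def blocked_processes(proc_root="/proc"):
    """Return (pid, comm, wchan) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or record was unreadable
        if state != "D":
            continue
        try:
            # wchan names the kernel function the process is sleeping in
            with open(os.path.join(proc_root, entry, "wchan")) as f:
                wchan = f.read().strip()
        except OSError:
            wchan = "?"
        found.append((pid, comm, wchan))
    return found


if __name__ == "__main__":
    for pid, comm, wchan in blocked_processes():
        print(pid, comm, wchan)
```

If the hung bptm processes show up here, the wchan column should give a hint as to whether they are waiting in the filesystem code or in the block/SCSI layer.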
There are no errors reported in the messages log, nor are there hardware errors reported on the array.
I have tried running fsck on both affected filesystems, but each one is reported as clean after about 9 hours of processing.
The affected filesystems are approximately 9TB and 7.5TB in size.
Any suggestions on how to resolve this would be greatly appreciated.