Hi,
The hardware is a standard off-the-shelf server with an Adaptec AHA-2944UW driving an IBM 3583 library.
The timeout error itself does not worry me too much.
What worries me is that when a timeout occurs, the device goes offline and becomes unavailable. I have not yet found a way to bring the device back online with a command or program.
Currently I must reboot the server, which is no fun when the problem happens over the weekend and my backups are not performed.
I wrote a driver for the library, and when a timeout error occurs, the library becomes unavailable. None of the resets I tried re-enables the device.
After a timeout, here is what happens (the device is still there but it is unavailable):
# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 06 Lun: 00
Vendor: IBM Model: ULT3583-TL Rev: 2.80
Type: Medium Changer ANSI SCSI revision: 02
I have a little program that sends a TestUnitReady to the device:
testscsi <dev> <LUN> <command> where the command is a TestUnitReady
# ./testscsi /dev/sga 0 0
[E] Unable to open SCSI device /dev/sga, errno=6 (scsilib/SCSIOpenDevice).
[E] Unable to connect to device /dev/sga (testscsi/main).
I get back error 6, which is defined as follows:
#define ENXIO 6 /* No such device or address */
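For anyone who wants to reproduce the open failure without my testscsi program, a minimal Python sketch along these lines should report the same errno. The helper names try_open and describe_errno are made up for illustration, and the sketch only opens the device node; it does not send a TestUnitReady.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


def try_open(path):
    """Try to open a device node; return None on success, else the errno."""
    try:
        fd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
    except OSError as exc:
        return exc.errno
    os.close(fd)
    return None


if __name__ == "__main__":
    # /dev/sga is the device node from the post; adjust for your system.
    code = try_open("/dev/sga")
    if code is None:
        print("open succeeded")
    else:
        name, message = describe_errno(code)
        print(f"open failed: errno={code} ({name}: {message})")
```

On my system I would expect this to print errno 6 (ENXIO) while the library is offline, the same as testscsi does.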
According to some Linux documentation I found on a Linux forum (http://www.linuxforum.com/linux-scsi/x215.html), taking the device offline may be normal behaviour, but that does not help me.
Anyway, if someone has a “workaround” to bring the device back online without rebooting, that would be great.
RX100