LinuxQuestions.org
Old 01-23-2013, 01:48 PM   #1
alpha01
Member
 
Registered: Jul 2008
Location: Orange County
Distribution: Ubuntu/Debian, CentOS, RHEL, FreeBSD, OS X
Posts: 75

Rep: Reputation: 19
Solaris 11 machine crashed, possible hardware issue?


Hello,

I have a Solaris 11 machine that randomly crashed this morning. After physically restarting the machine, I noticed that all of the drives were marked with a "Sense Key: Soft_Error" both in dmesg and in /var/adm/messages.

Since all the drives on the machine were tagged with the same Soft Error, does this mean that the HBA is faulty?

Code:
root@solaris-machine:/var/log# iostat -E
sd0       Soft Errors: 1 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product:       Revision: SN02 Serial No: 
Size: 500.11GB <500107862016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 1
Illegal Request: 12 Predictive Failure Analysis: 0
sd2       Soft Errors: 1 Hard Errors: 0 Transport Errors: 0 
Vendor: ATA      Product:      Revision: 0004 Serial No:  
Size: 3000.59GB <3000592982016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 1
Illegal Request: 0 Predictive Failure Analysis: 0 
sd4       Soft Errors: 1 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product:      Revision: 0004 Serial No:  
Size: 3000.59GB <3000592982016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 1 
Illegal Request: 0 Predictive Failure Analysis: 0 
sd5       Soft Errors: 1 Hard Errors: 0 Transport Errors: 0 
Vendor: ATA      Product:      Revision: 0004 Serial No: 
Size: 3000.59GB <3000592982016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 1 
Illegal Request: 0 Predictive Failure Analysis: 0
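For anyone who wants to compare these counters across many disks, here is a minimal Python sketch that parses `iostat -E` text into a per-device dictionary. The function names (`parse_iostat_e`, `devices_with_hard_faults`) are made up for illustration, and the parser only assumes the layout shown in the output above; other Solaris releases may format it differently.

```python
import re

# Counter names exactly as they appear in the `iostat -E` output quoted above.
COUNTERS = [
    "Soft Errors", "Hard Errors", "Transport Errors",
    "Media Error", "Device Not Ready", "No Device", "Recoverable",
    "Illegal Request", "Predictive Failure Analysis",
]
COUNTER_RE = re.compile(r"(%s):\s*(\d+)" % "|".join(re.escape(c) for c in COUNTERS))
# A device section starts with a line such as "sd0   Soft Errors: 1 ...".
DEVICE_RE = re.compile(r"^(\S+)\s+Soft Errors:")


def parse_iostat_e(text):
    """Return {device: {counter_name: int}} for `iostat -E` style text."""
    devices = {}
    current = None
    for line in text.splitlines():
        header = DEVICE_RE.match(line)
        if header:
            current = devices.setdefault(header.group(1), {})
        if current is None:
            continue  # ignore anything before the first device header
        for name, value in COUNTER_RE.findall(line):
            current[name] = int(value)
    return devices


def devices_with_hard_faults(devices):
    """List devices reporting hard, transport, or media errors (hypothetical helper)."""
    keys = ("Hard Errors", "Transport Errors", "Media Error")
    return sorted(d for d, c in devices.items() if any(c.get(k, 0) > 0 for k in keys))


# Two of the devices from the post, copied from the output above.
SAMPLE = """\
sd0       Soft Errors: 1 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product:       Revision: SN02 Serial No:
Size: 500.11GB <500107862016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 1
Illegal Request: 12 Predictive Failure Analysis: 0
sd2       Soft Errors: 1 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product:      Revision: 0004 Serial No:
Size: 3000.59GB <3000592982016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 1
Illegal Request: 0 Predictive Failure Analysis: 0
"""

parsed = parse_iostat_e(SAMPLE)
print(devices_with_hard_faults(parsed))
```

On the sample above, no device reports a hard, transport, or media error; each shows a single recoverable soft error.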
Code:
Jan 23 10:45:02 solaris-machine scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g5000c5004dfae642 (sd4):
Jan 23 10:45:02 solaris-machine      Error for Command: <undecoded cmd 0xa1>    Error Level: Recovered
Jan 23 10:45:02 solaris-machine scsi: [ID 107833 kern.notice]        Requested Block: 0                         Error Block: 0
Jan 23 10:45:02 solaris-machine scsi: [ID 107833 kern.notice]        Vendor: ATA                                Serial Number:        
Jan 23 10:45:02 solaris-machine scsi: [ID 107833 kern.notice]        Sense Key: Soft_Error
Jan 23 10:45:04 solaris-machine scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g5000c5004dfc8db2 (sd2):
Jan 23 10:45:04 solaris-machine      Error for Command: <undecoded cmd 0xa1>    Error Level: Recovered
Jan 23 10:45:04 solaris-machine scsi: [ID 107833 kern.notice]        Requested Block: 0                         Error Block: 0
Jan 23 10:45:04 solaris-machine scsi: [ID 107833 kern.notice]        Vendor: ATA                                Serial Number:        
Jan 23 10:45:04 solaris-machine scsi: [ID 107833 kern.notice]        Sense Key: Soft_Error
Jan 23 10:45:04 solaris-machine scsi: [ID 107833 kern.notice]        ASC: 0x0 (<vendor unique code 0x0>), ASCQ: 0x1d, FRU: 0x0
Jan 23 10:45:04 solaris-machine scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g5000c5004dfd4ce3 (sd5):
Jan 23 10:45:04 solaris-machine      Error for Command: <undecoded cmd 0xa1>    Error Level: Recovered
Jan 23 10:45:04 solaris-machine scsi: [ID 107833 kern.notice]        Requested Block: 0                         Error Block: 0
Jan 23 10:45:04 solaris-machine scsi: [ID 107833 kern.notice]        Vendor: ATA                                Serial Number:
Jan 23 10:45:04 solaris-machine scsi: [ID 107833 kern.notice]        Sense Key: Soft_Error
Jan 23 10:45:04 solaris-machine scsi: [ID 107833 kern.notice]        ASC: 0x0 (<vendor unique code 0x0>), ASCQ: 0x1d, FRU: 0x0
Jan 23 10:45:07 solaris-machine scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci15d9,664@1f,2/disk@0,0 (sd0):
Jan 23 10:45:07 solaris-machine      Error for Command: <undecoded cmd 0xa1>    Error Level: Recovered
Jan 23 10:45:07 solaris-machine scsi: [ID 107833 kern.notice]        Requested Block: 0                         Error Block: 0
Jan 23 10:45:07 solaris-machine scsi: [ID 107833 kern.notice]        Vendor: ATA                                Serial Number:
Jan 23 10:45:07 solaris-machine scsi: [ID 107833 kern.notice]        Sense Key: Soft_Error
Jan 23 10:45:07 solaris-machine scsi: [ID 107833 kern.notice]        ASC: 0x0 (no additional sense info), ASCQ: 0x0, FRU: 0x0
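The driver left the command and the additional sense code undecoded in the messages above, so here is a rough Python sketch that groups the log lines per device and attaches names to them. The two lookup tables are not from the original post: they reflect my understanding of the T10 SCSI/ATA Translation command set (opcode 0xa1) and the T10 ASC/ASCQ assignments (0x00/0x1d), so treat them as assumptions to verify. `summarize` is a hypothetical helper name.

```python
import re

# Assumed lookups (not in the original post); verify against the T10 documents.
OPCODES = {
    0xA1: "ATA PASS-THROUGH (12)",
    0x85: "ATA PASS-THROUGH (16)",
}
ADDITIONAL_SENSE = {
    (0x00, 0x00): "NO ADDITIONAL SENSE INFORMATION",
    (0x00, 0x1D): "ATA PASS THROUGH INFORMATION AVAILABLE",
}

WARNING_RE = re.compile(r"WARNING: (\S+) \((\w+)\):")
CMD_RE = re.compile(r"Error for Command: <undecoded cmd (0x[0-9a-fA-F]+)>")
SENSE_RE = re.compile(r"Sense Key: (\S+)")
ASC_RE = re.compile(r"ASC: (0x[0-9a-fA-F]+) .*ASCQ: (0x[0-9a-fA-F]+)")


def summarize(log_text):
    """Return one dict per WARNING block found in /var/adm/messages style text."""
    events = []
    current = None
    for line in log_text.splitlines():
        m = WARNING_RE.search(line)
        if m:
            current = {"path": m.group(1), "device": m.group(2)}
            events.append(current)
            continue
        if current is None:
            continue  # skip lines before the first WARNING
        m = CMD_RE.search(line)
        if m:
            opcode = int(m.group(1), 16)
            current["command"] = OPCODES.get(opcode, "unknown opcode 0x%02x" % opcode)
        m = SENSE_RE.search(line)
        if m:
            current["sense_key"] = m.group(1)
        m = ASC_RE.search(line)
        if m:
            code = (int(m.group(1), 16), int(m.group(2), 16))
            current["additional_sense"] = ADDITIONAL_SENSE.get(code, "unknown")
    return events


# The sd2 and sd0 blocks, copied from the log above.
SAMPLE_LOG = """\
Jan 23 10:45:04 solaris-machine scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g5000c5004dfc8db2 (sd2):
Jan 23 10:45:04 solaris-machine      Error for Command: <undecoded cmd 0xa1>    Error Level: Recovered
Jan 23 10:45:04 solaris-machine scsi: [ID 107833 kern.notice]        Sense Key: Soft_Error
Jan 23 10:45:04 solaris-machine scsi: [ID 107833 kern.notice]        ASC: 0x0 (<vendor unique code 0x0>), ASCQ: 0x1d, FRU: 0x0
Jan 23 10:45:07 solaris-machine scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci15d9,664@1f,2/disk@0,0 (sd0):
Jan 23 10:45:07 solaris-machine      Error for Command: <undecoded cmd 0xa1>    Error Level: Recovered
Jan 23 10:45:07 solaris-machine scsi: [ID 107833 kern.notice]        Sense Key: Soft_Error
Jan 23 10:45:07 solaris-machine scsi: [ID 107833 kern.notice]        ASC: 0x0 (no additional sense info), ASCQ: 0x0, FRU: 0x0
"""

events = summarize(SAMPLE_LOG)
for event in events:
    print(event["device"], event["command"], event["sense_key"], event["additional_sense"])
```

If those lookups are right, every entry is a recovered reply to an ATA pass-through command, which is the kind of request a disk-monitoring tool sends. That would be consistent with an informational message rather than an HBA fault, but it is only a reading of the log and nothing in this thread confirms it.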

Last edited by alpha01; 01-25-2013 at 01:54 PM.
 
Old 01-24-2013, 01:52 AM   #2
DukeNuke2
LQ Newbie
 
Registered: Jul 2009
Distribution: Solaris10, Solaris11
Posts: 2

Rep: Reputation: 0
What hardware are you using? An HBA fault isn't the first thing I'd suspect. The more interesting question is why the machine crashed in the first place. Also, did you check the zpool/ZFS status of the drives?
 
Old 01-24-2013, 03:52 PM   #3
alpha01
Member
 
Registered: Jul 2008
Location: Orange County
Distribution: Ubuntu/Debian, CentOS, RHEL, FreeBSD, OS X
Posts: 75

Original Poster
Rep: Reputation: 19
I'm using standard x86 hardware.
Code:
ID    SIZE TYPE
1     113  SMB_TYPE_SYSTEM (system information)

  Manufacturer: Supermicro
  Product: X9DRH-7TF/7F/iTF/iF
  Version: 1234567890
I forgot to mention in my original post that I checked all ZFS pools after the reboot, and they all appeared to be in optimal condition. All of the drives, on the other hand, had the recoverable Soft Error logged against them.
 
Old 01-25-2013, 07:59 AM   #4
DukeNuke2
LQ Newbie
 
Registered: Jul 2009
Distribution: Solaris10, Solaris11
Posts: 2

Rep: Reputation: 0
I wouldn't read too much into the errors in the iostat output... but again, what was the root cause of the crash?
 
  

