04-09-2019, 06:42 PM
|
#1
|
Member
Registered: Apr 2010
Distribution: FC17
Posts: 380
Rep:
|
smartctl/gnome-disks questions about SMART info
Hello!
I tried to check my disks with the "gnome-disks" utility; in particular, I wanted
to see the SMART info. However, the SMART entries in the menu were grayed out.
I also tried running "smartctl" directly, but got only a laconic "health status OK",
without any of the detailed info I had seen when googling "smartctl" usage.
Here's the output of "lsscsi":
Code:
<root localhost.localdomain>.../root>lsscsi
[1:0:0:0] cd/dvd ASUS DRW-24D5MT 1.00 /dev/sr0
[6:0:3:0] tape HP Ultrium 1-SCSI N2CG /dev/st0
[7:0:0:0] disk ICP KA_RAID_0 V1.0 /dev/sda
[7:0:1:0] disk ICP KA_RAID_2 V1.0 /dev/sdb
[7:0:2:0] disk ICP KA_RAID_1 V1.0 /dev/sdc
[7:1:0:0] disk FUJITSU MAU3147RC 0104 -
[7:1:1:0] disk FUJITSU MBA3147RC HPF1 -
[7:1:2:0] disk TOSHIBA MG04SCA20EE 0104 -
[7:1:3:0] disk TOSHIBA MG04SCA20EE 0104 -
[7:1:4:0] disk TOSHIBA MK2001TRKB 0105 -
[7:1:5:0] disk TOSHIBA MK2001TRKB 0105 -
The RAID controller is an Adaptec ICP5165BR, and the arrays are hardware RAID-0.
/dev/sda is a pair of "FUJITSU MAU3147RC" disks,
/dev/sdb is a pair of "TOSHIBA MG04SCA20EE" disks,
/dev/sdc is a pair of "TOSHIBA MK2001TRKB" disks.
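The "lsscsi" listing above already separates the two kinds of devices: the three ICP arrays have block nodes (/dev/sda to /dev/sdc), while the six physical disks behind the controller show "-" in the last column. A minimal Python sketch of that split follows, assuming the whitespace-separated layout shown in the post; the names `parse_lsscsi` and `split_disks` are made up for illustration and are not part of any tool.

```python
import re

# A few lines copied from the lsscsi output in the post.
SAMPLE = """\
[1:0:0:0] cd/dvd ASUS DRW-24D5MT 1.00 /dev/sr0
[6:0:3:0] tape HP Ultrium 1-SCSI N2CG /dev/st0
[7:0:0:0] disk ICP KA_RAID_0 V1.0 /dev/sda
[7:0:1:0] disk ICP KA_RAID_2 V1.0 /dev/sdb
[7:0:2:0] disk ICP KA_RAID_1 V1.0 /dev/sdc
[7:1:0:0] disk FUJITSU MAU3147RC 0104 -
[7:1:1:0] disk FUJITSU MBA3147RC HPF1 -
[7:1:2:0] disk TOSHIBA MG04SCA20EE 0104 -
[7:1:3:0] disk TOSHIBA MG04SCA20EE 0104 -
[7:1:4:0] disk TOSHIBA MK2001TRKB 0105 -
[7:1:5:0] disk TOSHIBA MK2001TRKB 0105 -
"""

# [host:channel:target:lun]  type  vendor/model/revision  device-node
LSSCSI_LINE = re.compile(
    r"^\[(\d+):(\d+):(\d+):(\d+)\]\s+(\S+)\s+(.*\S)\s+(\S+)\s*$"
)


def parse_lsscsi(text):
    """Return one dict per recognised lsscsi line."""
    entries = []
    for line in text.splitlines():
        match = LSSCSI_LINE.match(line.strip())
        if not match:
            continue  # skip anything that does not look like a device line
        host, channel, target, lun, dev_type, description, node = match.groups()
        entries.append({
            "hctl": f"{host}:{channel}:{target}:{lun}",
            "type": dev_type,
            "description": description,
            # lsscsi prints "-" when the device has no block/char node of that kind
            "node": None if node == "-" else node,
        })
    return entries


def split_disks(entries):
    """Separate disks that have a block node from disks that have none."""
    with_node = [e for e in entries if e["type"] == "disk" and e["node"]]
    without_node = [e for e in entries if e["type"] == "disk" and e["node"] is None]
    return with_node, without_node


if __name__ == "__main__":
    arrays, hidden = split_disks(parse_lsscsi(SAMPLE))
    print("logical arrays:", [e["node"] for e in arrays])
    print("disks without a block node:", [e["description"] for e in hidden])
```

Running "lsscsi -g" additionally prints the generic (/dev/sgN) node for each entry, which is one way to find a device name that "smartctl" can address for the disks that have no /dev/sdX node.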
I'm running FC29 with KDE:
Code:
<root localhost.localdomain>.../root>uname -r
5.0.5-200.spi_gpio.fc29.x86_64
<root localhost.localdomain>.../root>rpm -qa | grep smart
libatasmart-0.19-15.fc29.x86_64
smartmontools-6.6-5.fc29.x86_64
libsmartcols-2.32.1-1.fc29.x86_64
Asking "smartctl" for detailed info:
Code:
<root localhost.localdomain>.../root>smartctl -a /dev/sdc
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-5.0.5-200.spi_gpio.fc29.x86_64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: ICP
Product: KA_RAID_1
Revision: V1.0
User Capacity: 3,994,319,585,280 bytes [3.99 TB]
Logical block size: 512 bytes
Logical Unit id: 0x7c395fd600d00000
Serial number: D65F397C
Device type: disk
Local Time is: Wed Apr 10 02:20:55 2019 IDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Disabled or Not Supported
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 0 C
Drive Trip Temperature: 0 C
Error Counter logging not supported
Device does not support Self Test logging
And attempting to run a test:
Code:
<root localhost.localdomain>.../root>smartctl -t short /dev/sdc
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-5.0.5-200.spi_gpio.fc29.x86_64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
Short offline self test failed [unsupported scsi opcode]
The contents of the /etc/smartmontools/smartd.conf are:
Code:
# Sample configuration file for smartd. See man smartd.conf.
# Home page is: http://www.smartmontools.org
# $Id: smartd.conf 4120 2015-08-27 16:12:21Z samm2 $
# smartd will re-read the configuration file if it receives a HUP
# signal
# The file gives a list of devices to monitor using smartd, with one
# device per line. Text after a hash (#) is ignored, and you may use
# spaces and tabs for white space. You may use '\' to continue lines.
# You can usually identify which hard disks are on your system by
# looking in /proc/ide and in /proc/scsi.
# The word DEVICESCAN will cause any remaining lines in this
# configuration file to be ignored: it tells smartd to scan for all
# ATA and SCSI devices. DEVICESCAN may be followed by any of the
# Directives listed below, which will be applied to all devices that
# are found. Most users should comment out DEVICESCAN and explicitly
# list the devices that they wish to monitor.
DEVICESCAN -H -m root -M exec /usr/libexec/smartmontools/smartdnotify -n standby,10,q
# Alternative setting to ignore temperature and power-on hours reports
# in syslog.
#DEVICESCAN -I 194 -I 231 -I 9
# Alternative setting to report more useful raw temperature in syslog.
#DEVICESCAN -R 194 -R 231 -I 9
# Alternative setting to report raw temperature changes >= 5 Celsius
# and min/max temperatures.
#DEVICESCAN -I 194 -I 231 -I 9 -W 5
# First ATA/SATA or SCSI/SAS disk. Monitor all attributes, enable
# automatic online data collection, automatic Attribute autosave, and
# start a short self-test every day between 2-3am, and a long self test
# Saturdays between 3-4am.
#/dev/sda -a -o on -S on -s (S/../.././02|L/../../6/03)
# Monitor SMART status, ATA Error Log, Self-test log, and track
# changes in all attributes except for attribute 194
#/dev/sdb -H -l error -l selftest -t -I 194
# Monitor all attributes except normalized Temperature (usually 194),
# but track Temperature changes >= 4 Celsius, report Temperatures
# >= 45 Celsius and changes in Raw value of Reallocated_Sector_Ct (5).
# Send mail on SMART failures or when Temperature is >= 55 Celsius.
#/dev/sdc -a -I 194 -W 4,45,55 -R 5 -m admin@example.com
# An ATA disk may appear as a SCSI device to the OS. If a SCSI to
# ATA Translation (SAT) layer is between the OS and the device then
# this can be flagged with the '-d sat' option. This situation may
# become common with SATA disks in SAS and FC environments.
# /dev/sda -a -d sat
# A very silent check. Only report SMART health status if it fails
# But send an email in this case
#/dev/sdc -H -C 0 -U 0 -m admin@example.com
# First two SCSI disks. This will monitor everything that smartd can
# monitor. Start extended self-tests Wednesdays between 6-7pm and
# Sundays between 1-2 am
#/dev/sda -d scsi -s L/../../3/18
#/dev/sdb -d scsi -s L/../../7/01
# Monitor 4 ATA disks connected to a 3ware 6/7/8000 controller which uses
# the 3w-xxxx driver. Start long self-tests Sundays between 1-2, 2-3, 3-4,
# and 4-5 am.
# NOTE: starting with the Linux 2.6 kernel series, the /dev/sdX interface
# is DEPRECATED. Use the /dev/tweN character device interface instead.
# For example /dev/twe0, /dev/twe1, and so on.
#/dev/sdc -d 3ware,0 -a -s L/../../7/01
#/dev/sdc -d 3ware,1 -a -s L/../../7/02
#/dev/sdc -d 3ware,2 -a -s L/../../7/03
#/dev/sdc -d 3ware,3 -a -s L/../../7/04
# Monitor 2 ATA disks connected to a 3ware 9000 controller which
# uses the 3w-9xxx driver (Linux, FreeBSD). Start long self-tests Tuesdays
# between 1-2 and 3-4 am.
#/dev/twa0 -d 3ware,0 -a -s L/../../2/01
#/dev/twa0 -d 3ware,1 -a -s L/../../2/03
# Monitor 2 SATA (not SAS) disks connected to a 3ware 9000 controller which
# uses the 3w-sas driver (Linux). Start long self-tests Tuesdays
# between 1-2 and 3-4 am.
# On FreeBSD /dev/tws0 should be used instead
#/dev/twl0 -d 3ware,0 -a -s L/../../2/01
#/dev/twl0 -d 3ware,1 -a -s L/../../2/03
# Same as above for Windows. Option '-d 3ware,N' is not necessary,
# disk (port) number is specified in device name.
# NOTE: On Windows, DEVICESCAN works also for 3ware controllers.
#/dev/hdc,0 -a -s L/../../2/01
#/dev/hdc,1 -a -s L/../../2/03
# Monitor 3 ATA disks directly connected to a HighPoint RocketRAID. Start long
# self-tests Sundays between 1-2, 2-3, and 3-4 am.
#/dev/sdd -d hpt,1/1 -a -s L/../../7/01
#/dev/sdd -d hpt,1/2 -a -s L/../../7/02
#/dev/sdd -d hpt,1/3 -a -s L/../../7/03
# Monitor 2 ATA disks connected to the same PMPort which connected to the
# HighPoint RocketRAID. Start long self-tests Tuesdays between 1-2 and 3-4 am
#/dev/sdd -d hpt,1/4/1 -a -s L/../../2/01
#/dev/sdd -d hpt,1/4/2 -a -s L/../../2/03
# HERE IS A LIST OF DIRECTIVES FOR THIS CONFIGURATION FILE.
# PLEASE SEE THE smartd.conf MAN PAGE FOR DETAILS
#
# -d TYPE Set the device type: ata, scsi, marvell, removable, 3ware,N, hpt,L/M/N
# -T TYPE set the tolerance to one of: normal, permissive
# -o VAL Enable/disable automatic offline tests (on/off)
# -S VAL Enable/disable attribute autosave (on/off)
# -n MODE No check. MODE is one of: never, sleep, standby, idle
# -H Monitor SMART Health Status, report if failed
# -l TYPE Monitor SMART log. Type is one of: error, selftest
# -f Monitor for failure of any 'Usage' Attributes
# -m ADD Send warning email to ADD for -H, -l error, -l selftest, and -f
# -M TYPE Modify email warning behavior (see man page)
# -s REGE Start self-test when type/date matches regular expression (see man page)
# -p Report changes in 'Prefailure' Normalized Attributes
# -u Report changes in 'Usage' Normalized Attributes
# -t Equivalent to -p and -u Directives
# -r ID Also report Raw values of Attribute ID with -p, -u or -t
# -R ID Track changes in Attribute ID Raw value with -p, -u or -t
# -i ID Ignore Attribute ID for -f Directive
# -I ID Ignore Attribute ID for -p, -u or -t Directive
# -C ID Report if Current Pending Sector count non-zero
# -U ID Report if Offline Uncorrectable count non-zero
# -W D,I,C Monitor Temperature D)ifference, I)nformal limit, C)ritical limit
# -v N,ST Modifies labeling of Attribute N (see man page)
# -a Default: equivalent to -H -f -t -l error -l selftest -C 197 -U 198
# -F TYPE Use firmware bug workaround. Type is one of: none, samsung
# -P TYPE Drive-specific presets: use, ignore, show, showall
# # Comment: text after a hash sign is ignored
# \ Line continuation character
# Attribute ID is a decimal integer 1 <= ID <= 255
# except for -C and -U, where ID = 0 turns them off.
# All but -d, -m and -M Directives are only implemented for ATA devices
#
# If the test string DEVICESCAN is the first uncommented text
# then smartd will scan for devices.
# DEVICESCAN may be followed by any desired Directives.
Is the lack of detailed SMART info about the disk (and the inability to run a test)
a function of the disk model (I get the same result for the relatively new
"TOSHIBA MG04SCA20EE" disks), or did I fail to configure something correctly?
TIA,
kaza.
|
|
|
04-10-2019, 12:21 AM
|
#2
|
Member
Registered: Apr 2010
Distribution: FC17
Posts: 380
Original Poster
Rep:
|
OK, I found the cause: I was trying to get info from the RAID arrays (/dev/sda, /dev/sdb, /dev/sdc),
while I should have tested the individual disks:
Code:
<root localhost.localdomain>.../root>ls /dev | grep sg
drwxr-xr-x. 2 root root 260 Apr 10 2019 06:51:57 bsg
crw-r--r--. 1 root root 1, 11 Apr 10 2019 06:51:58 kmsg
crw-rw----+ 1 root cdrom 21, 0 Apr 10 2019 06:51:58 sg0
crw-rw----. 1 root disk 21, 1 Apr 10 2019 06:51:58 sg1
crw-rw----. 1 root tape 21, 10 Apr 10 2019 06:51:58 sg10
crw-rw----. 1 root disk 21, 2 Apr 10 2019 06:51:58 sg2
crw-rw----. 1 root disk 21, 3 Apr 10 2019 06:51:58 sg3
crw-rw----. 1 root disk 21, 4 Apr 10 2019 06:51:58 sg4
crw-rw----. 1 root disk 21, 5 Apr 10 2019 06:51:58 sg5
crw-rw----. 1 root disk 21, 6 Apr 10 2019 06:51:58 sg6
crw-rw----. 1 root disk 21, 7 Apr 10 2019 06:51:58 sg7
crw-rw----. 1 root disk 21, 8 Apr 10 2019 06:51:58 sg8
crw-rw----. 1 root disk 21, 9 Apr 10 2019 06:51:58 sg9
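For scripting, the listing above can be reduced to the generic nodes that are worth passing to "smartctl". The sketch below is only an illustration: it assumes the long-listing layout shown in the post, and it uses the "disk" group ownership as the filter, which happens to work for this listing but is not guaranteed on every system. It also sorts numerically, since a plain text sort puts sg10 between sg1 and sg2. The function name `disk_sg_nodes` is hypothetical.

```python
import re

# A subset of the listing from the post.
LISTING = """\
drwxr-xr-x. 2 root root 260 Apr 10 2019 06:51:57 bsg
crw-r--r--. 1 root root 1, 11 Apr 10 2019 06:51:58 kmsg
crw-rw----+ 1 root cdrom 21, 0 Apr 10 2019 06:51:58 sg0
crw-rw----. 1 root disk 21, 1 Apr 10 2019 06:51:58 sg1
crw-rw----. 1 root tape 21, 10 Apr 10 2019 06:51:58 sg10
crw-rw----. 1 root disk 21, 2 Apr 10 2019 06:51:58 sg2
crw-rw----. 1 root disk 21, 9 Apr 10 2019 06:51:58 sg9
"""


def disk_sg_nodes(listing):
    """Return /dev/sgN paths owned by group 'disk', in numeric order."""
    nodes = []
    for line in listing.splitlines():
        parts = line.split()
        # Only character devices ("c...") with enough columns are of interest.
        if len(parts) < 5 or not parts[0].startswith("c"):
            continue
        name, group = parts[-1], parts[3]
        match = re.fullmatch(r"sg(\d+)", name)
        if match and group == "disk":
            nodes.append((int(match.group(1)), f"/dev/{name}"))
    return [path for _, path in sorted(nodes)]


if __name__ == "__main__":
    for path in disk_sg_nodes(LISTING):
        print(path)
```

In the full listing this filter leaves nine nodes (sg1 to sg9), which matches the three arrays plus six physical disks reported by "lsscsi".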
Here's the status of one of the new disks:
Code:
<root localhost.localdomain>.../root>smartctl -a /dev/sg6
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-5.0.5-200.spi_gpio.fc29.x86_64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: TOSHIBA
Product: MG04SCA20EE
Revision: 0104
Compliance: SPC-4
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x50000398f86849ad
Serial number: X8L0A02EFX2B
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Wed Apr 10 07:59:59 2019 IDT
SMART support is: Available - device has SMART capability.
SMART support is: Disabled
Temperature Warning: Disabled or Not Supported
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 30 C
Drive Trip Temperature: 65 C
Manufactured in week 42 of year 2018
Specified cycle count over device lifetime: 50000
Accumulated start-stop cycles: 86
Specified load-unload count over device lifetime: 600000
Accumulated load-unload cycles: 583
Elements in grown defect list: 0
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        1         0         0          0      85826.887           0
write:         0        0         0         0          0        777.294           0
Non-medium error count: 86
No self-tests have been logged
and here's the status of each of the old disks:
Code:
<root localhost.localdomain>.../root>smartctl -a /dev/sg8
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-5.0.5-200.spi_gpio.fc29.x86_64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: TOSHIBA
Product: MK2001TRKB
Revision: 0105
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Logical block size: 512 bytes
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x50000393a8c9b238
Serial number: Y1P0A049FM13
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Wed Apr 10 08:01:55 2019 IDT
SMART support is: Available - device has SMART capability.
SMART support is: Disabled
Temperature Warning: Disabled or Not Supported
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 36 C
Drive Trip Temperature: 65 C
Manufactured in week 47 of year 2011
Specified cycle count over device lifetime: 50000
Accumulated start-stop cycles: 4029
Specified load-unload count over device lifetime: 600000
Accumulated load-unload cycles: 19377
Elements in grown defect list: 438
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0     5006       947         0          0    1268695.133         947
write:         0        0         0         0          0      21475.496           0
verify:        0        0         0         0          0       4001.524           0
Non-medium error count: 4489
No self-tests have been logged
<root localhost.localdomain>.../root>smartctl -a /dev/sg9
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-5.0.5-200.spi_gpio.fc29.x86_64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: TOSHIBA
Product: MK2001TRKB
Revision: 0105
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Logical block size: 512 bytes
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x50000393a8c9b230
Serial number: Y1P0A048FM13
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Wed Apr 10 08:04:35 2019 IDT
SMART support is: Available - device has SMART capability.
SMART support is: Disabled
Temperature Warning: Disabled or Not Supported
=== START OF READ SMART DATA SECTION ===
SMART Health Status: HARDWARE IMPENDING FAILURE DATA ERROR RATE TOO HIGH [asc=5d, ascq=12]
Current Drive Temperature: 33 C
Drive Trip Temperature: 65 C
Manufactured in week 47 of year 2011
Specified cycle count over device lifetime: 50000
Accumulated start-stop cycles: 4039
Specified load-unload count over device lifetime: 600000
Accumulated load-unload cycles: 22631
Elements in grown defect list: 888
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0     5780       112         0          0     630078.900         112
write:         0        0         0         0          0      21057.161           0
verify:        0        0         0         0          0       4001.524           0
Non-medium error count: 4487
No self-tests have been logged
How should I interpret "SMART Health Status: HARDWARE IMPENDING FAILURE DATA ERROR RATE TOO HIGH [asc=5d, ascq=12]"?
Does it mean "the disk may fail any minute", or "a status to monitor; if it starts to change, replace the disk"?
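For context, additional sense code 0x5D is the SCSI "failure prediction threshold exceeded" code, so this status is the drive's own firmware predicting a failure rather than a judgement made by "smartctl". A rough Python sketch of pulling the relevant numbers out of the output above follows; the field names are copied from the smartctl output in this thread, while `parse_smartctl_scsi` and `summarize` are hypothetical helpers, and the sketch sets no thresholds of its own.

```python
import re

# Trimmed excerpt of the /dev/sg9 output from the post.
SAMPLE = """\
Serial number: Y1P0A048FM13
SMART Health Status: HARDWARE IMPENDING FAILURE DATA ERROR RATE TOO HIGH [asc=5d, ascq=12]
Elements in grown defect list: 888
read: 0 5780 112 0 0 630078.900 112
write: 0 0 0 0 0 21057.161 0
verify: 0 0 0 0 0 4001.524 0
Non-medium error count: 4487
"""

COUNTER_ROW = re.compile(r"^(read|write|verify):\s+(.*)$")


def parse_smartctl_scsi(text):
    """Split SCSI-style `smartctl -a` output into key/value fields and counter rows."""
    fields, counters = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        row = COUNTER_ROW.match(line)
        if row:
            values = row.group(2).split()
            if len(values) == 7:  # the seven columns of the error counter log
                counters[row.group(1)] = {
                    "corrected_fast": int(values[0]),
                    "corrected_delayed": int(values[1]),
                    "rereads_rewrites": int(values[2]),
                    "total_corrected": int(values[3]),
                    "algorithm_invocations": int(values[4]),
                    "gigabytes_processed": float(values[5]),
                    "total_uncorrected": int(values[6]),
                }
            continue
        key, sep, value = line.partition(":")
        if sep:
            # A repeated key (e.g. "SMART support is") keeps its last value.
            fields[key.strip()] = value.strip()
    return fields, counters


def summarize(fields, counters):
    """List the values that differ from a clean disk; no thresholds are applied."""
    notes = []
    health = fields.get("SMART Health Status", "unknown")
    if health != "OK":
        notes.append(f"health status: {health}")
    defects = int(fields.get("Elements in grown defect list", "0"))
    if defects:
        notes.append(f"grown defect list: {defects} entries")
    for operation, row in counters.items():
        if row["total_uncorrected"]:
            notes.append(f"{operation}: {row['total_uncorrected']} uncorrected errors")
    return notes


if __name__ == "__main__":
    for note in summarize(*parse_smartctl_scsi(SAMPLE)):
        print(note)
```

Applied to the two old disks, the same summary would flag both for grown defects and uncorrected read errors, but only /dev/sg9 for a non-OK health status.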
BTW, is the mapping of names from /dev/sgN to the arrays /dev/sdM constant, or does it change on every reboot?
I would like to automate checking all of the disks, and would like to know whether I can rely on a constant assignment
of these numbers (for an easy "diff" of log files named by "N"), or whether I should parse each result, get the disk ID, and name
the log files according to this ID?
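Kernel-assigned names such as /dev/sgN generally depend on discovery order and are not guaranteed to stay the same across reboots, so keying the logs on the disk's own identity is the safer of the two options. A minimal sketch of that naming scheme follows, assuming the "Product", "Serial number" and "Logical Unit id" fields appear as in the output above; `disk_identity`, `log_name` and the file name pattern are invented for illustration.

```python
import re

# Information-section lines copied from the /dev/sg9 output in the post.
SAMPLE = """\
Vendor: TOSHIBA
Product: MK2001TRKB
Revision: 0105
Logical Unit id: 0x50000393a8c9b230
Serial number: Y1P0A048FM13
Device type: disk
"""

WANTED = ("Product", "Serial number", "Logical Unit id")


def disk_identity(smartctl_text):
    """Collect the identifying fields from smartctl's information section."""
    found = {}
    for line in smartctl_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in WANTED:
            found[key.strip()] = value.strip()
    return found


def log_name(smartctl_text, date_stamp):
    """Build a log file name that follows the disk, not the /dev/sgN number."""
    ident = disk_identity(smartctl_text)
    # Fall back to the logical unit id if no serial number is reported.
    unique = ident.get("Serial number") or ident.get("Logical Unit id")
    if not unique:
        raise ValueError("no serial number or logical unit id in smartctl output")
    label = f"{ident.get('Product', 'disk')}_{unique}"
    safe = re.sub(r"[^A-Za-z0-9_.-]", "_", label)  # keep the name filesystem-friendly
    return f"smart_{safe}_{date_stamp}.log"


if __name__ == "__main__":
    print(log_name(SAMPLE, "2019-04-10"))
```

With names built this way, consecutive logs for the same physical disk can be diffed directly, whichever /dev/sgN node the disk appeared under on a given boot.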
TIA,
kaza.
All times are GMT -5. The time now is 09:04 AM.