LinuxQuestions.org


LXer 09-11-2013 06:11 AM

LXer: SSD Failure Temporarily Halts Linux 3.12 Kernel Work
 
Published at LXer:

The failure of a solid-state drive in Linus Torvalds' main workstation has led to new activity during the Linux 3.12 kernel merge window being temporarily suspended...


Jeebizz 09-11-2013 07:27 PM

Interesting - glad I didn't choose Intel for my SSD, though that probably doesn't mean much. I think the best approach, if you are going to use an SSD as part of your setup, is to keep only the main system on it and run your /home on a conventional drive; that's what I do. I'm sure his compile times were amazing, but until SSDs can tolerate more write cycles, it is best to keep writes to a minimum to extend the life of the SSD.

TobiSGD 09-11-2013 07:41 PM

Fun fact: I have been using an Intel SSD for a couple of years now (10837 power-on hours, about 3.5 TB written to it, lifetime still at 97%).
One dying SSD is as good an indicator of quality and lifetime as my SSD that works so well without any flaws: it says exactly nothing.

Jeebizz 09-11-2013 08:57 PM

Curious, how do you check the lifespan? Also what FS do you have on it?

TobiSGD 09-11-2013 09:05 PM

Lifespan can be checked with smartctl. This is how it looks on my Intel:
Code:

233 Media_Wearout_Indicator 0x0032  097  097  000    Old_age  Always      -      0
The value started at 100 when the SSD was new.
And this is how it looks on my Corsair:
Code:

231 SSD_Life_Left          0x0013  100  100  010    Pre-fail  Always      -      0
I use ext4 (the Corsair one also has an NTFS partition with Windows 7 installed), but I am thinking about switching to either JFS or F2FS; benchmarks indicate impressive performance on SSDs for both.
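
For anyone who wants to pull these values out of smartctl output in a script, here is a minimal sketch in Python. It assumes the ten-column attribute layout of the two lines quoted above; the function name parse_attribute_line and the dictionary keys are made up for illustration, and which attribute tracks wear (if any) differs between vendors.

```python
def parse_attribute_line(line):
    """Split one row of the smartctl attribute table into named fields.

    Assumes the ten-column layout seen in this thread:
    ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    fields = line.split(None, 9)  # at most 10 fields; the raw value may contain spaces
    if len(fields) != 10:
        raise ValueError("not an attribute row: %r" % line)
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),   # normalized value, printed as e.g. 097
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        "type": fields[6],
        "raw": fields[9].strip(),
    }


# The two attribute rows quoted in the post above.
intel = parse_attribute_line(
    "233 Media_Wearout_Indicator 0x0032  097  097  000    Old_age  Always      -      0"
)
corsair = parse_attribute_line(
    "231 SSD_Life_Left          0x0013  100  100  010    Pre-fail  Always      -      0"
)

for attr in (intel, corsair):
    print(attr["name"], "normalized value:", attr["value"], "threshold:", attr["thresh"])
```

The normalized VALUE column is the one that counts down from 100 on these two drives; the RAW_VALUE column is vendor-specific and is 0 in both rows here.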

Jeebizz 09-11-2013 09:11 PM

I am currently running JFS (default options) - TRIM was not available on JFS until Linux 3.7, I think? I am very much interested in F2FS, but there is no option to use it as / (at least not by default during install).

I tried smartctl, but it doesn't seem to be working for me. I had to use smartctl --all, and I am still not quite getting what you got:

Code:

root@slackmachine:/home/slackuser# smartctl --all /dev/sda
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.2.45] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:    Samsung SSD 840 PRO Series
Serial Number:    S12PNEACC48544N
LU WWN Device Id: 5 002538 550148c05
Firmware Version: DXM04B0Q
User Capacity:    128,035,676,160 bytes [128 GB]
Sector Size:      512 bytes logical/physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:  8
ATA Standard is:  ATA-8-ACS revision 4c
Local Time is:    Wed Sep 11 21:09:38 2013 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)        Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (  0)        The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (65476) seconds.
Offline data collection
capabilities:                          (0x53) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003)        Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01)        Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:          (  2) minutes.
Extended self-test routine
recommended polling time:          (  15) minutes.
SCT capabilities:                (0x003d)        SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct  0x0033  100  100  010    Pre-fail  Always      -      0
  9 Power_On_Hours          0x0032  099  099  000    Old_age  Always      -      4057
 12 Power_Cycle_Count      0x0032  099  099  000    Old_age  Always      -      45
177 Wear_Leveling_Count    0x0013  099  099  000    Pre-fail  Always      -      2
179 Used_Rsvd_Blk_Cnt_Tot  0x0013  100  100  010    Pre-fail  Always      -      0
181 Program_Fail_Cnt_Total  0x0032  100  100  010    Old_age  Always      -      0
182 Erase_Fail_Count_Total  0x0032  100  100  010    Old_age  Always      -      0
183 Runtime_Bad_Block      0x0013  100  100  010    Pre-fail  Always      -      0
187 Reported_Uncorrect      0x0032  100  100  000    Old_age  Always      -      0
190 Airflow_Temperature_Cel 0x0032  068  046  000    Old_age  Always      -      32
195 Hardware_ECC_Recovered  0x001a  200  200  000    Old_age  Always      -      0
199 UDMA_CRC_Error_Count    0x003e  100  100  000    Old_age  Always      -      0
235 Unknown_Attribute      0x0012  099  099  000    Old_age  Always      -      42
241 Total_LBAs_Written      0x0032  099  099  000    Old_age  Always      -      374913542

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
  255        0    65535  Read_scanning was never started
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

root@slackmachine:/home/slackuser#
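
As a rough worked example, the Total_LBAs_Written raw value in the output above can be turned into an amount of data written. This is only a sketch: it assumes that one unit of the raw value equals one 512-byte logical sector (the sector size reported in the information section), which the drive is not in the smartctl database to confirm. The variable names are made up for illustration.

```python
LOGICAL_SECTOR_BYTES = 512       # "Sector Size: 512 bytes logical/physical" in the output above
total_lbas_written = 374913542   # raw value of attribute 241 in the output above

# Assumption: the raw value counts logical sectors, so bytes = LBAs * sector size.
bytes_written = total_lbas_written * LOGICAL_SECTOR_BYTES

print("%d bytes written" % bytes_written)
print("%.1f GB written" % (bytes_written / 1e9))
```

Under that assumption the drive has taken roughly 192 GB of writes over its 4057 power-on hours.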


TobiSGD 09-11-2013 09:18 PM

If I had to guess, I would think that this might be your indicator:
Code:

177 Wear_Leveling_Count    0x0013  099  099  000    Pre-fail  Always      -      2
But I am not sure about that; I have no Samsung drive to compare with.

Jeebizz 09-11-2013 09:27 PM

Seems right. That Pre-fail attribute has a raw value of 2, but I have no idea whether that is counting toward a failure threshold or whether there is any way to change it, so I will probably just leave everything as is. As posted earlier, hopefully F2FS will be included as an install option one day. Don't get me wrong, I love JFS, but clearly it was never designed for SSDs, whereas F2FS was designed from the ground up with flash devices in mind. I will always use JFS or XFS as a conventional FS on a conventional drive, though.

H_TeXMeX_H 09-12-2013 07:02 AM

Upgrade your kernel so it supports TRIM for JFS and the problem is solved.
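
To make the kernel point concrete, here is a small Python sketch that compares a kernel release string against the 3.7 figure mentioned earlier in the thread. The helper names kernel_tuple and may_support_jfs_trim are hypothetical, and the 3.7 cutoff is taken from the earlier post, where it was itself stated as a guess.

```python
def kernel_tuple(release):
    """Turn a release string such as '3.2.45' into (3, 2, 45).

    Anything after the leading digits of a component (e.g. '-smp') is ignored.
    """
    parts = []
    for piece in release.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


# Assumption: the 3.7 figure comes from the earlier post, where it was a guess.
JFS_TRIM_MIN = (3, 7)


def may_support_jfs_trim(release):
    """Return True if the release is at least the assumed minimum version."""
    return kernel_tuple(release) >= JFS_TRIM_MIN


print(may_support_jfs_trim("3.2.45"))   # kernel shown in the smartctl banner in this thread
print(may_support_jfs_trim("3.10.17"))  # hypothetical newer kernel
```

On a running system the release string could come from platform.release() in the Python standard library. Passing this check only says the kernel is new enough under the stated assumption; it does not show that TRIM is actually enabled on the filesystem.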


All times are GMT -5.