LinuxQuestions.org
Old 07-22-2018, 04:34 PM   #1
ArazelEternal
Member
 
Registered: Apr 2018
Location: South Central Wisconsin, USA
Distribution: Fedora 28 Workstation - Cinnamon Desktop Spin
Posts: 48

Rep: Reputation: Disabled
Fedora 28 Cinnamon Locks Solid w/ High Second Disk Usage


Hello all,

I am having an issue where my Fedora system locks up solid when I do any high disk usage (copying large files) on my secondary disk, sda1. It can happen within minutes, or it can happen after 45 minutes. There seems to be no discernible pattern.

My kernel version is:

Code:
4.17.6-200.fc28.x86_64
I ran smartctl -a /dev/sda1 on the drive, and this is what it's coming up with:

Code:
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.17.6-200.fc28.x86_64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate FireCuda 2.5
Device Model:     ST1000LX015-1U7172
Serial Number:    WES3W1P4
LU WWN Device Id: 5 000c50 0a87e74e1
Firmware Version: SDM1
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Jul 22 16:23:21 2018 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 246)	Self-test routine in progress...
					60% of test remaining.
Total time to complete Offline 
data collection: 		(    0) seconds.
Offline data collection
capabilities: 			 (0x71) SMART execute Offline immediate.
					No Auto Offline data collection support.
					Suspend Offline collection upon new
					command.
					No Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 ( 162) minutes.
Conveyance self-test routine
recommended polling time: 	 (   2) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   077   064   006    Pre-fail  Always       -       52440881
  3 Spin_Up_Time            0x0003   099   099   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   090   090   020    Old_age   Always       -       10329
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   080   060   045    Pre-fail  Always       -       89407760
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       5222 (87 16 0)
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       1013
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   098   000    Old_age   Always       -       29
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   055   045   040    Old_age   Always       -       45 (Min/Max 33/45)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       17
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       77
193 Load_Cycle_Count        0x0032   092   092   000    Old_age   Always       -       17014
194 Temperature_Celsius     0x0022   045   055   000    Old_age   Always       -       45 (0 3 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       4965h+23m+04.388s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       42182752850
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       26064809572
254 Free_Fall_Sensor        0x0032   001   001   000    Old_age   Always       -       1

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Self-test routine in progress 60%      5222         -
# 2  Extended offline    Completed: read failure       90%      4794         162619880
# 3  Short offline       Completed without error       00%      2239         -
# 4  Short offline       Completed without error       00%      2231         -
# 5  Short offline       Completed without error       00%      1215         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
I'm running an extended offline self-test on it, and it seems to be taking forever. To me, some of those SMART values don't look good, but I'm not 100% certain. I have a feeling that the drive is failing.

I first noticed the issue whenever I ran my Windows 7 x64 VM, which runs off of that drive. It would lock up solid at random (the mouse would not even move), so the only way to recover was a hard reboot. At first I thought it might be a file system issue, so I tried copying all the data off to an external disk. It locked up during the copy as well. The only way I was able to copy all the data was to boot from a Fedora Live DVD. It copied the data fine that way; no lockups or errors showed up.

So what I am wondering is: is this my HDD failing, or is it an OS issue of some sort?

Any ideas?
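
For reading the attribute table above, here is a minimal sketch of one common rule of thumb. It assumes the usual smartctl column layout (VALUE in column 4, THRESH in column 6, RAW_VALUE in column 10). The function name flag_smart_attrs and the choice of attributes 5, 197 and 198 as the ones worth watching are illustrative assumptions, not something stated in this thread. It reports an attribute as failing only when the normalized VALUE has dropped to or below a non-zero THRESH, which means the large raw numbers on Raw_Read_Error_Rate and Seek_Error_Rate are not flagged on their own. It does not look at the self-test log, where entry # 2 above records a read failure.

```shell
# Sketch only: triage a smartctl attribute table read from stdin.
# flag_smart_attrs is a hypothetical helper name, not part of smartmontools.
flag_smart_attrs() {
  awk '
    # Attribute rows start with a numeric ID and have at least 10 columns.
    $1 ~ /^[0-9]+$/ && NF >= 10 {
      value  = $4 + 0      # normalized VALUE
      thresh = $6 + 0      # vendor THRESH
      if (thresh > 0 && value <= thresh)
        print "FAILING: " $2 " value=" value " thresh=" thresh
      # Assumption: non-zero raw counts here deserve a closer look.
      else if (($1 == 5 || $1 == 197 || $1 == 198) && $10 + 0 > 0)
        print "WATCH: " $2 " raw=" $10
    }'
}

# Usage (needs root, and the whole device rather than a partition):
#   sudo smartctl -A /dev/sda | flag_smart_attrs
```

Fed the table from this post, the function should print nothing, since no VALUE is at or below its THRESH and the reallocated, pending and offline-uncorrectable raw counts are all 0.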
 
Old 07-22-2018, 09:15 PM   #2
ArazelEternal
Member
 
Registered: Apr 2018
Location: South Central Wisconsin, USA
Distribution: Fedora 28 Workstation - Cinnamon Desktop Spin
Posts: 48

Original Poster
Rep: Reputation: Disabled
Update:

Extended self test completed without error.
 
Old 07-22-2018, 09:20 PM   #3
mrmazda
LQ Guru
 
Registered: Aug 2016
Location: SE USA
Distribution: openSUSE 24/7; Debian, Knoppix, Mageia, Fedora, others
Posts: 5,808
Blog Entries: 1

Rep: Reputation: 2066
Seagate makes an .iso you can burn to boot and test to basically tell you whether the drive is OK or failing.

smartctl -x will provide the pending sector, reallocated sector and other data that essentially say the same thing, though they are harder to interpret.
 
Old 07-22-2018, 09:26 PM   #4
mrmazda
LQ Guru
 
Registered: Aug 2016
Location: SE USA
Distribution: openSUSE 24/7; Debian, Knoppix, Mageia, Fedora, others
Posts: 5,808
Blog Entries: 1

Rep: Reputation: 2066
Quote:
Originally Posted by ArazelEternal View Post
Update:

Extended self test completed without error.
Do you have another cable you can try to connect with? If it is rather old and red it could well be your problem. It is a known problem that certain red dyes once used in SATA cables cause wire corrosion.

/dev/sda1 is a partition. Smart testing should be done on the whole device, /dev/sda.

Can you avoid the lockups by booting the prior kernel?

Last edited by mrmazda; 07-22-2018 at 09:28 PM.
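
The point about testing the whole device rather than the partition can be sketched roughly as follows. The helper name whole_disk is hypothetical, and the name patterns it handles (sdX, hdX, vdX, nvme and mmcblk) are an assumption about common Linux device naming; on a live system, lsblk -no PKNAME /dev/sda1 reports the parent device directly.

```shell
# Sketch only: map a partition path to its parent disk by name.
# whole_disk is a hypothetical helper; it does not query the kernel.
whole_disk() {
  case "$1" in
    /dev/nvme*n*p[0-9]*|/dev/mmcblk*p[0-9]*)
      # nvme0n1p2 -> nvme0n1, mmcblk0p1 -> mmcblk0
      printf '%s\n' "${1%p[0-9]*}" ;;
    /dev/sd*|/dev/hd*|/dev/vd*)
      # sda1 -> sda (strip trailing partition digits)
      printf '%s\n' "$1" | sed 's/[0-9]*$//' ;;
    *)
      printf '%s\n' "$1" ;;
  esac
}

# Usage (needs root):
#   disk=$(whole_disk /dev/sda1)        # /dev/sda
#   sudo smartctl -t long "$disk"       # start an extended self-test
#   sudo smartctl -l selftest "$disk"   # read the self-test log afterwards
#   sudo smartctl -x "$disk"            # full report, as suggested in post #3
```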
 
Old 07-22-2018, 09:49 PM   #5
ArazelEternal
Member
 
Registered: Apr 2018
Location: South Central Wisconsin, USA
Distribution: Fedora 28 Workstation - Cinnamon Desktop Spin
Posts: 48

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by mrmazda View Post
Do you have another cable you can try to connect with? If it is rather old and red it could well be your problem. It is a known problem that certain red dyes once used in SATA cables cause wire corrosion.

/dev/sda1 is a partition. Smart testing should be done on the whole device, /dev/sda.

Can you avoid the lockups by booting the prior kernel?
Sorry, that's my fault. I neglected to give info on the machine. It's a laptop, a Dell Precision M4800. It's 4 generations old now, but it still kicks butt, especially with Linux installed. There is no SATA cable to change. I haven't tried a previous kernel yet; however, Fedora just pushed a kernel update to
Code:
4.17.7-200.fc28.x86_64
which I am running at the moment. I'll test it out and post my results.
 
Old 07-23-2018, 12:55 AM   #6
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,126

Rep: Reputation: 4120
Too many variables - changing kernel in the middle of diagnosis simply adds another.

Cinnamon is a pig - the file manager (Nemo?) has several issues. Are you using that (drag and drop) or the terminal for the file copy operations? What DE does the DVD use? Have you used the DVD for extended periods to replicate your normal usage before the data copy?
Basic initial diag.

After that you need to look at memory, CPU consumption, interrupts, ...

Have you played with any mm sysctls?

I'm surprised Fedora would assign sda as a secondary disk - let's see this.
Code:
lsblk -f -o +SIZE
 
Old 07-23-2018, 02:58 PM   #7
ArazelEternal
Member
 
Registered: Apr 2018
Location: South Central Wisconsin, USA
Distribution: Fedora 28 Workstation - Cinnamon Desktop Spin
Posts: 48

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by syg00 View Post
Too many variables - changing kernel in the middle of diagnosis simply adds another.

Cinnamon is a pig - the file manager (Nemo?) has several issues. Are you using that (drag and drop) or the terminal for the file copy operations? What DE does the DVD use? Have you used the DVD for extended periods to replicate your normal usage before the data copy?
Basic initial diag.

After that you need to look at memory, CPU consumption, interrupts, ...

Have you played with any mm sysctls?

I'm surprised Fedora would assign sda as a secondary disk - let's see this.
Code:
lsblk -f -o +SIZE
Cinnamon does indeed use Nemo as the default file manager. That is what I am using to copy the files.

The DVD I loaded as live is the same edition as what is installed - Fedora 28 Cinnamon.

CPU consumption - From what I have been able to see, it isn't that high. I have a system monitor applet on the Cinnamon panel that constantly shows CPU and RAM usage. Neither one is very high, either at idle or when transferring the files. That was one thing I paid attention to in the last attempt to copy the files.

Here is the output you asked for:

Code:
[cstayner@localhost ~]$ lsblk -f -o +SIZE
NAME   FSTYPE   LABEL UUID                                   MOUNTPOINT                SIZE
sda                                                                                  931.5G
└─sda1 ext4     Data  b3b3b0ea-987e-40b6-a682-d12f2d0c366d   /mnt/b3b3b0ea-987e-40b6 931.5G
sdb                                                                                  238.5G
├─sdb1 vfat           EF5C-9290                              /boot/efi                 200M
├─sdb2 ext4           16d7e866-5916-4d5b-9afe-27cf3014d202   /boot                       1G
└─sdb3 LVM2_mem       698IjW-AarE-LJxh-XLiF-vrPi-x7tR-y9Ddl1                         237.3G
  ├─fedora_localhost--live-root
  │    ext4           230dc297-1f10-44a4-9e96-aa87a9af249a   /                          50G
  ├─fedora_localhost--live-swap
  │    swap           2874caa2-8b95-4e6c-8975-6e55386ecdd8   [SWAP]                    7.9G
  └─fedora_localhost--live-home
       ext4           330b1dba-64fb-4072-a369-5d56454f7ff1   /home                   179.4G
sr0                                                                                   1024M
 
Old 07-23-2018, 08:39 PM   #8
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,126

Rep: Reputation: 4120
You may want to keep an eye on this similar thread.
Check your journal for similar messages.
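
One way to do that journal check, as a rough sketch: the linked thread's actual messages are not reproduced here, so the patterns below are generic guesses at what disk or hang trouble tends to look like in kernel logs (libata port messages, I/O errors, hung-task warnings). The helper name disk_trouble is hypothetical. After a hard lockup the interesting messages are in the previous boot, which is only available if the journal is stored persistently.

```shell
# Sketch only: filter kernel log lines read from stdin for signs of
# disk or hang trouble. disk_trouble is a hypothetical helper and the
# patterns are assumptions, not the messages from the linked thread.
disk_trouble() {
  grep -Ei 'ata[0-9]+(\.[0-9]+)?: |I/O error|blocked for more than|hung_task'
}

# Usage:
#   journalctl -k -b -1 --no-pager | disk_trouble   # kernel log, previous boot
#   journalctl -k -b 0 --no-pager | disk_trouble    # kernel log, current boot
```

If nothing at all was written before the freeze, an empty result does not rule out a hardware problem; it only means the kernel did not get a chance to log one.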
 
Old 07-25-2018, 03:19 AM   #9
AwesomeMachine
LQ Guru
 
Registered: Jan 2005
Location: USA and Italy
Distribution: Debian testing/sid; OpenSuSE; Fedora; Mint
Posts: 5,524

Rep: Reputation: 1015
Using a file manager to copy a lot of data almost always freezes the machine. Are we talking hundreds of GBs? See if the same thing happens with rsync. And keep in mind that Fedora is the testing branch of Red Hat, so it freezes and crashes a lot.
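
A minimal sketch of the rsync test suggested here, assuming rsync 3.1 or newer for the --info=progress2 option. The function name copy_tree and the example paths are placeholders, not taken from this thread.

```shell
# Sketch only: copy the contents of one directory tree into another from
# the terminal instead of the file manager. copy_tree is a hypothetical
# helper name.
copy_tree() {
  # -a preserves permissions, times and symlinks and recurses;
  # --info=progress2 prints one overall progress line.
  rsync -a --info=progress2 "$1"/ "$2"/
}

# Usage (placeholder paths):
#   copy_tree /run/media/user/backup /mnt/data
```

If the machine still locks up with the file manager out of the picture, that points away from Nemo and toward the kernel, the drive, or the controller.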
 
Old 07-25-2018, 07:42 PM   #10
ArazelEternal
Member
 
Registered: Apr 2018
Location: South Central Wisconsin, USA
Distribution: Fedora 28 Workstation - Cinnamon Desktop Spin
Posts: 48

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by AwesomeMachine View Post
Using a file manager to copy a lot of data almost always freezes the machine. Are we talking hundreds of GBs? See if the same thing happens with rsync. And keep in mind that Fedora is the testing branch of Red Hat, so it freezes and crashes a lot.
I've known that Fedora was ultimately the testing ground for Red Hat; however, it seems rather stable for the most part. What drew me to Fedora is that it's an RPM-based distro (the kind I know best; I haven't done much with DEB-based ones) and it's easy to use compared to Red Hat, CentOS, Oracle and others. I tried openSUSE for a while but got away from it. I can't remember what I didn't like, but I stopped using it.

Anyway, I was using it to restore a backup of the data drive from an external drive, which comes out to about 650 GB (movies, TV shows, anime, games, photos, etc.). I backed it up so I could change the drive from NTFS to ext4. The freezing started before I did that; I thought maybe the file system had gotten corrupted somewhere, and I had been meaning to switch it over to ext4 for a while anyway. That didn't seem to make a difference.

I had my Windows 7 VM going for a few hours the other night after updating to the most recent kernel release, and it never locked up. So I don't know if that means it's fixed or if it just decided to cooperate that night. I'll try it again to see what happens.

I also checked the disk in question with the bootable SeaTools, booted from a USB drive. The SMART self-test, the short self-test, and the short generic test (which included inner, outer, and butterfly reads) all came back 100% PASS according to the software, so it would seem that the drive is fine. I think it comes down to software.

I've been itching to try another distro, just not sure which yet. I'd really like to use an enterprise-level distro (Red Hat, CentOS), but those usually take a lot of config to do what you want and aren't (generally) recommended for desktop/laptop use.
 
Old 07-25-2018, 09:09 PM   #11
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,126

Rep: Reputation: 4120
I have used Fedora constantly for years, and have had few incidents - despite often using Linus' latest rc kernels.
This looks like an upstream bug - hard to blame Fedora. Have you tried reverting to a prior kernel - default is to keep 3 I think, but I usually keep 6-10 due to my testing regime.
 
Old 07-25-2018, 09:18 PM   #12
ArazelEternal
Member
 
Registered: Apr 2018
Location: South Central Wisconsin, USA
Distribution: Fedora 28 Workstation - Cinnamon Desktop Spin
Posts: 48

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by syg00 View Post
I have used Fedora constantly for years, and have had few incidents - despite often using Linus' latest rc kernels.
This looks like an upstream bug - hard to blame Fedora. Have you tried reverting to a prior kernel - default is to keep 3 I think, but I usually keep 6-10 due to my testing regime.
I haven't tried a previous kernel yet. My system is set to the default for keeping kernels, so it lists 3 in the GRUB menu, plus the recovery. I would assume 3 is more than enough for most desktop user purposes? Or should I change it more/less?
 
Old 07-25-2018, 11:19 PM   #13
mrmazda
LQ Guru
 
Registered: Aug 2016
Location: SE USA
Distribution: openSUSE 24/7; Debian, Knoppix, Mageia, Fedora, others
Posts: 5,808
Blog Entries: 1

Rep: Reputation: 2066
Quote:
Originally Posted by ArazelEternal View Post
I would assume 3 is more than enough for most desktop user purposes? Or should I change it more/less?
One never knows unless and until those present all fail. Fedora wastes little time pushing latest kernel versions into its release version repos. Keeping many if you have the space can assist in bisecting for regression windows.
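
For keeping more kernels around, a sketch under the assumption that the limit is controlled by dnf's installonly_limit option in /etc/dnf/dnf.conf, and that the file contains only a [main] section so that appending a line is safe. The helper name set_installonly_limit and the value 6 are illustrative.

```shell
# Sketch only: set installonly_limit in a dnf.conf-style file.
# set_installonly_limit is a hypothetical helper; it assumes [main] is
# the only section, so a missing option can simply be appended.
set_installonly_limit() {
  conf=$1
  limit=$2
  if grep -q '^installonly_limit=' "$conf"; then
    # Option already present: rewrite its value in place.
    sed -i "s/^installonly_limit=.*/installonly_limit=$limit/" "$conf"
  else
    # Option absent: add it at the end of the file.
    printf 'installonly_limit=%s\n' "$limit" >> "$conf"
  fi
}

# Usage (run as root, after backing up the file):
#   set_installonly_limit /etc/dnf/dnf.conf 6
```

The change only affects how many kernels are retained on future updates; kernels that were already removed do not come back.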
 
  

