[SOLVED] Diagnosing Slow Boot and Window Manager Load Times
Perhaps also slow program load times, but I'm not so sure about that.
The base issue: I have a computer that takes forever to get from GRUB to login. It also takes forever from login to a usable KDE desktop. What do I mean by forever? I'm talking minutes for each of these. Like turn on the computer, go make some tea. Log in, then tea will be ready to drink.
I have a HUNCH that it's the hard drive on which / resides, because here are my system specs:
Processor: AMD 8 Core processor < 1 year old
MOBO: Also < 1 year old
RAM: 8 GB
Graphics Card: Brand new (like this week) Nvidia 750Ti
Two monitors - one on DVI and one on HDMI
Hard drives:
110 GB capacity with about 57 GB full for /
3 TB with about 2.7 TB full for /home
3 TB, new as of this week, with about 2 TB full for /home/pictures
Other things that make me think it's the hard drive:
When the System is loading up, hard drive light is STUCK on, but remains that way even after the login screen (KDM) is loaded
Even after KDE desktop is loaded and I can start clicking on stuff, the hard drive light remains STUCK on and things are very slow to load (say, the KDE version of the Start button, for example)
Once the hard drive light is no longer stuck on, the system runs reasonably responsively
So, I'm trying to think of what I can do to diagnose this and figure out if I need a new hard drive. Did a bit of Googling and searching here on LQ. Looks like I need to look at:
smartctl -a /dev/sd? to see if there are SMART issues
hdparm /dev/sda
Make sure BIOS is set to AHCI
Is there anything else I could do short of installing onto a new hard drive that would help me figure out if the hard drive is the issue?
I'm not sure if this will be helpful (or if linking to other sites like this is allowed on this forum) but check out this, it has information relating to what you are asking about. http://superuser.com/questions/17119...f-a-hard-drive
It certainly goes along with the stuff I'm trying to work with. But I'm not sure it's failing. I'm mostly curious if there's something I'm forgetting, like something in dmesg that would tell me why it's going so slowly.
It is. Thank you for that info! I will post results here. When do I want to run it? Right after boot? After arriving at Window Manager? I know at least part of the boot time is from mounting NFS shares, but nearly all my Linux computers at home mount NFS shares and don't take nearly this long.
Ok, it looks like I MIGHT have multiple things going on here. Hopefully @Head on a Stick and others can help me properly parse this. Let's start with the hard drive stuff.
So my interpretation here is that it's a SATA II drive and it seems to do fine on this test. Let's check out SMART output. I saw there's a command that can take hours and hours to run. I just went with this one:
Code:
# smartctl -a /dev/sda
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.0.8-300.fc22.x86_64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.10
Device Model: ST3160815AS
Serial Number: 6RA313Q3
Firmware Version: 3.AAD
User Capacity: 160,041,885,696 bytes [160 GB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA/ATAPI-7 (minor revision not indicated)
Local Time is: Thu Jul 30 17:49:06 2015 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 430) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  54) minutes.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   109   099   006    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   097   097   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   020    Old_age   Always       -       1809
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   089   060   030    Pre-fail  Always       -       878752509
  9 Power_On_Hours          0x0032   062   062   000    Old_age   Always       -       33741
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   020    Old_age   Always       -       1837
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   053   048   045    Old_age   Always       -       47 (Min/Max 40/47)
194 Temperature_Celsius     0x0022   047   052   000    Old_age   Always       -       47 (0 16 0 0 0)
195 Hardware_ECC_Recovered  0x001a   087   073   000    Old_age   Always       -       119456045
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
From what I've read, Old_age is nothing to worry about. Pre-fail, on the other hand, is NOT a good thing, so it looks like I should PROBABLY replace this disk. Any advice on how seriously to take these warnings would be good. I'm particularly worried about Reallocated_Sector_Ct, as I think that means it's failing more and more. The read error rate can't be good, either. (Edit: Actually, I looked online and you want VALUE to be HIGHER than the threshold, so I guess that's not the issue.) But let's also move on to systemd-analyze, because I'm thinking PERHAPS there's ALSO something else going on. Again, perhaps you guys can help me figure out if it's related to SMART or not.
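Before the systemd-analyze output, here is a rough way to double-check the "higher than the threshold" reading of the SMART table mechanically. This is only a sketch in Python: the rows are copied from the smartctl output above, while the function and field names (parse_attributes, failed_attributes, and so on) are my own invention and not part of smartmontools. As far as I understand smartctl, the TYPE column (Pre-fail or Old_age) only classifies what kind of problem an attribute would indicate if it failed; an actual failure would show up as VALUE at or below THRESH, or as something other than "-" in WHEN_FAILED.

```python
# SMART attribute rows copied from the smartctl -a output above.
SMART_TABLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   109   099   006    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   097   097   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   020    Old_age   Always       -       1809
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   089   060   030    Pre-fail  Always       -       878752509
  9 Power_On_Hours          0x0032   062   062   000    Old_age   Always       -       33741
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   020    Old_age   Always       -       1837
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   053   048   045    Old_age   Always       -       47 (Min/Max 40/47)
194 Temperature_Celsius     0x0022   047   052   000    Old_age   Always       -       47 (0 16 0 0 0)
195 Hardware_ECC_Recovered  0x001a   087   073   000    Old_age   Always       -       119456045
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0
"""


def parse_attributes(table):
    """Turn smartctl attribute rows into dicts (field names are my own choice)."""
    attrs = []
    for line in table.splitlines():
        parts = line.split(None, 9)  # RAW_VALUE may itself contain spaces
        if len(parts) < 10 or not parts[0].isdigit():
            continue  # skip the header and anything that is not an attribute row
        attrs.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "type": parts[6],
            "when_failed": parts[8],
            "raw": parts[9].strip(),
        })
    return attrs


def failed_attributes(attrs):
    """Attributes with VALUE at or below THRESH, or with WHEN_FAILED set."""
    return [a for a in attrs
            if a["value"] <= a["thresh"] or a["when_failed"] != "-"]


attrs = parse_attributes(SMART_TABLE)
failed = failed_attributes(attrs)
realloc = next(a for a in attrs if a["name"] == "Reallocated_Sector_Ct")

print("attributes parsed:", len(attrs))
print("failed now or in the past:", [a["name"] for a in failed])
print("Reallocated_Sector_Ct raw value:", realloc["raw"])
```

On this particular table the list of failed attributes comes out empty, and the raw Reallocated_Sector_Ct count is 0, which as far as I can tell means no sectors have been remapped so far. That does not prove the drive is healthy, only that SMART itself is not flagging anything here.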
Code:
# systemd-analyze
Startup finished in 1.389s (kernel) + 11.939s (initrd) + 2min 36.838s (userspace) = 2min 50.166s
==================================
# systemd-analyze blame
1min 25.354s initial-setup-graphical.service
47.561s abrtd.service
42.233s dev-sda3.device
37.338s home.mount
27.617s firewalld.service
21.944s akmods.service
21.569s smb.service
18.883s NetworkManager-wait-online.service
18.856s systemd-journal-flush.service
17.703s NetworkManager.service
16.135s systemd-udevd.service
12.757s libvirtd.service
12.379s cups.service
9.896s plymouth-start.service
8.033s nfs-idmapd.service
7.771s console-kit-log-system-start.service
7.769s vmware.service
7.769s gssproxy.service
7.767s mcelog.service
7.421s rpcbind.service
7.419s netcf-transaction.service
6.405s rsyslog.service
6.114s media-Photos.mount
5.884s nfs-server.service
5.827s colord.service
5.089s proc-fs-nfsd.mount
4.535s dnf-makecache.service
4.057s vmware-USBArbitrator.service
3.940s fedora-loadmodules.service
3.744s systemd-binfmt.service
3.597s media-nfs-xbmc\x2dmount.mount
==================================
# systemd-analyze critical-chain
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.
graphical.target @2min 36.827s
└─multi-user.target @2min 36.827s
└─smb.service @2min 15.256s +21.569s
└─network.target @2min 15.199s
└─NetworkManager.service @1min 38.605s +17.703s
└─firewalld.service @1min 10.985s +27.617s
└─basic.target @1min 10.598s
└─sockets.target @1min 10.598s
└─cups.socket @1min 10.598s
└─sysinit.target @1min 10.534s
└─systemd-update-utmp.service @1min 10.313s +220ms
└─systemd-tmpfiles-setup.service @1min 9.938s +370ms
└─fedora-import-state.service @1min 9.008s +926ms
└─local-fs.target @1min 9.005s
└─home.mount @31.665s +37.338s
└─systemd-fsck@dev-disk-by\x2duuid-89cfd56a\x2d06c7\x2d4805\x2d9526\x2d7be4d24a2872.service @31.306s +88ms
└─dev-disk-by\x2duuid-89cfd56a\x2d06c7\x2d4805\x2d9526\x2d7be4d24a2872.device @31.306s
As you can see, I wasn't exaggerating about it taking minutes to load. Interestingly enough, it appears to be related to my /home drive, which seems to account for the largest chunk. Is it trying to run an fsck? Beyond that, it appears that smb.service and firewalld.service are the biggest culprits. If smb.service is used both for accessing shares and for providing access, then I need to leave it in. If it's only for Windows to connect to this computer, then I can turn it off.
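To put numbers on that reading of the critical chain, here is a small Python sketch that ranks the units by their "+" time. The lines are copied from the critical-chain output above; the helper names (to_seconds, parse_chain) are made up for this post, and the parser only handles the min, s, and ms units that actually appear in this output.

```python
import re

# Lines copied from the systemd-analyze critical-chain output above.
CRITICAL_CHAIN = r"""
graphical.target @2min 36.827s
└─multi-user.target @2min 36.827s
└─smb.service @2min 15.256s +21.569s
└─network.target @2min 15.199s
└─NetworkManager.service @1min 38.605s +17.703s
└─firewalld.service @1min 10.985s +27.617s
└─basic.target @1min 10.598s
└─sockets.target @1min 10.598s
└─cups.socket @1min 10.598s
└─sysinit.target @1min 10.534s
└─systemd-update-utmp.service @1min 10.313s +220ms
└─systemd-tmpfiles-setup.service @1min 9.938s +370ms
└─fedora-import-state.service @1min 9.008s +926ms
└─local-fs.target @1min 9.005s
└─home.mount @31.665s +37.338s
└─systemd-fsck@dev-disk-by\x2duuid-89cfd56a\x2d06c7\x2d4805\x2d9526\x2d7be4d24a2872.service @31.306s +88ms
└─dev-disk-by\x2duuid-89cfd56a\x2d06c7\x2d4805\x2d9526\x2d7be4d24a2872.device @31.306s
"""

UNIT_SECONDS = {"min": 60.0, "s": 1.0, "ms": 0.001}

# "<unit> @<time reached>" with an optional " +<time taken>" at the end.
ENTRY = re.compile(r"(\S+) @(.+?)(?: \+(.+))?$")


def to_seconds(text):
    """Convert a time span such as '2min 36.827s' or '220ms' to seconds."""
    pairs = re.findall(r"([\d.]+)(min|ms|s)", text)
    return sum(float(number) * UNIT_SECONDS[unit] for number, unit in pairs)


def parse_chain(text):
    """Return (unit, reached_at_seconds, took_seconds) for each chain line."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line.lstrip("└─ "))
        if not match:
            continue  # blank line
        unit, reached, took = match.groups()
        entries.append((unit, to_seconds(reached),
                        to_seconds(took) if took else 0.0))
    return entries


entries = parse_chain(CRITICAL_CHAIN)
total = entries[0][1]  # graphical.target is the first line of the chain
ranked = sorted((e for e in entries if e[2] > 0),
                key=lambda e: e[2], reverse=True)

for unit, reached, took in ranked[:4]:
    print(f"{took:8.3f}s  {took / total:5.1%}  {unit}")
```

Going by the figures in the output, home.mount (+37.338s) sits at the top, followed by firewalld.service, smb.service, and NetworkManager.service. The systemd-fsck@ unit on that chain shows only +88ms, so the time appears to go into the mount itself rather than a filesystem check, and the chain does not even begin until 31.306s, when the /home device first shows up.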
SO, what do you guys think is going on? One issue? Two issues? More?
As of right now, I'm guessing it has to do with the type of hard drive I bought. When I went back and looked at the technology it uses (among other things, the drive modulates between 5400 and 7200 RPM), I became inclined to think it is my home drive more than anything. At any rate, this has certainly taught me a lot about systemd-analyze, smartctl, and hdparm. I'm not going to mark the thread as solved just yet, as there may be something I haven't thought of, but I'll probably mark it as solved by next week if I haven't gained any new ideas.