LinuxQuestions.org
SUSE / openSUSE This Forum is for the discussion of Suse Linux.

Old 09-12-2016, 01:02 AM   #1
roo47
LQ Newbie
 
Registered: Sep 2016
Location: Bowen Mountain/Australia
Distribution: openSUSE 13.1 for all desktop work
Posts: 6

Rep: Reputation: Disabled
openSUSE 13.1 has several non-interruptible 2 to 7 second delays every minute


I have been doing all my development and desktop work with SUSE Linux for years and never had serious problems. 3 years ago I switched to openSUSE 13.1 (Bottle) (x86_64) using Linux 3.11.10-29-desktop and KDE 4.11.5. My hardware is a COMPAQ Presario CQ57 Laptop with a dual processor Intel(R) Pentium(R) CPU B940 @ 2.00GHz with 4 GByte of memory and a 300 GByte SATA disk, half of which is used for Linux. The root partition is 25 GByte (40% used). The Home partition is 117 GByte (14% used). 150 GByte is Windows XP, which I never (and now no longer can) use. I am loath to switch to Leap 42.1 or later because I am in the final stages of a large project and do not want the hassle of setting up and configuring a new environment. Also the reports on Leap 42.1 are not encouraging. The SUSE Software updater reports "Your system is up to date".

About 6 months ago I started noticing delays in echoing keystrokes of about 1 second. Over the months this has gradually increased to the current delays of up to 7 seconds. The delays are most noticeable with keystrokes, but they also slow down switching tabs on the Konsole or switching to another program. This makes working with the system very arduous and tacky.

In an attempt to locate where the delays are coming from, I first noticed that the monitor program "xosview" showed frequent blocks of "WIO" activity on both CPUs, which roughly coincide with the keyboard and other blockages I was noticing. I then found an article, "Linux Wait IO Problem", at http://www.chileoffshore.com/en/interesting-articles/126-linux-wait-io-problem. The author (chile) points out: The main cause are those background processes with "D" status code which means "Uninterruptible sleep". Later he points out that the ext4 journal processes (jbd2) are the most likely culprits. This proved to be the case on my system, which I could nail by running the following script:

# Every second: print any jbd2/kdmflush ps line that contains "D", followed by a timestamp
while true; do ps auxf | grep D | \
if grep -E "(jbd2/sda|kdmflush)"; then \
date; \
fi; sleep 1; \
done | tee jbd_21060912.log

The following is the output of jbd_21060912.log (with 3 columns removed and consecutive delays marked manually)

root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
Mon Sep 12 11:20:10 AEST 2016
root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
Mon Sep 12 11:20:11 AEST 2016
root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
Mon Sep 12 11:20:12 AEST 2016
root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
Mon Sep 12 11:20:13 AEST 2016
root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
Mon Sep 12 11:20:14 AEST 2016 = 5 seconds
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:20:49 AEST 2016
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:20:50 AEST 2016
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:20:51 AEST 2016
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:20:52 AEST 2016
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:20:53 AEST 2016
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:21:54 AEST 2016
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:21:55 AEST 2016
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:21:56 AEST 2016 = 8 seconds
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:22:14 AEST 2016
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:22:15 AEST 2016 = 2 seconds
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:22:52 AEST 2016
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:22:53 AEST 2016
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:22:54 AEST 2016
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:22:55 AEST 2016
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:22:56 AEST 2016 = 5 seconds
root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
Mon Sep 12 11:23:09 AEST 2016
root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
Mon Sep 12 11:23:10 AEST 2016
root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
Mon Sep 12 11:23:11 AEST 2016
root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
Mon Sep 12 11:23:12 AEST 2016
root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
Mon Sep 12 11:23:13 AEST 2016 = 5 seconds
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:23:22 AEST 2016
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:23:23 AEST 2016
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:23:24 AEST 2016
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:23:25 AEST 2016
root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
Mon Sep 12 11:23:26 AEST 2016 = 5 seconds
root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:23:38 AEST 2016
root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:23:39 AEST 2016
root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:23:40 AEST 2016
root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:23:41 AEST 2016
root 218 0.0 0.0 0 D \_ [jbd2/sda6-8]
root 416 0.0 0.0 0 D \_ [jbd2/sda7-8]
Mon Sep 12 11:23:42 AEST 2016 = 5 seconds on both CPU's !!

7 delays of 5 seconds on average in 3 minutes 32 seconds. This happens all the time. I chose the beginning of a run done just now with only Firefox running. The output is similar with no processes of mine running at all.
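
To check a tally like this without counting by hand, a minimal sketch along the following lines could group consecutive one-second timestamps in the log into bursts. It assumes the log format shown above (a `date` line after each hit) and that no burst crosses midnight; `burst_lengths` is a hypothetical helper name, not part of the original script.

```shell
#!/bin/sh
# burst_lengths: read a log produced by the loop above on stdin and print
# the length (in seconds) of each run of consecutive one-second timestamps.
# Assumption: timestamp lines look like "Mon Sep 12 11:20:10 AEST 2016".
burst_lengths() {
    awk '
        # Field 4 of a date line is HH:MM:SS; ps lines have no such field 4.
        $4 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
            split($4, t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]
            if (seen && now == prev + 1) {
                len++                   # same burst continues
            } else {
                if (seen) print len     # previous burst ended
                len = 1
            }
            prev = now
            seen = 1
        }
        END { if (seen) print len }
    '
}
```

It would be run as `burst_lengths < jbd_21060912.log`, printing one burst length per line. Two partitions blocked in the same second count once, since only the timestamp lines are examined.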

The solution is not simple.
(chile) points out: the reason of high WA is not always the same. But the solution will always on those processes which are with STAT as D. In this case, the configuration of "Journal Disk" should be reconsidered. If the server is a machine for development, it is not recommended to use Journal to protect the hard disk.

The problem I can see is, that reconfiguring Journalling can only be done by re-formatting the disk, which I definitely do not want to do - I have a lot of work on that disk and need to work on my project.

So my Linux Question is: what can I do to pin down and eradicate this continuous disc activity (most probably journalling - what!) and bring my system back to normal?

PS: I have not recently done a hardware test on the disk - any suggestions on the best non-destructive disk test would be most welcome. I did run Memtest 86+ v4.28 last night without error.

PPS: It could be that "xosview" is the culprit, although I have switched it off a few weeks ago and it has not made any difference.
 
Old 09-12-2016, 01:58 AM   #2
John VV
LQ Muse
 
Registered: Aug 2005
Location: A2 area Mi.
Posts: 17,627

Rep: Reputation: 2651
Quote:
Also the reports on Leap 42.1 are not encouraging.
i like 42.1

also be advised 13.1 went END OF LIFE 8 months ago ( Feb. 3 - 2016)
https://en.opensuse.org/Lifetime
Quote:
The SUSE Software updater reports "Your system is up to date".
there have been no updates for the last 8 months so you are NOT up to date

as to the disk ?
a guess

the sectors were created misaligned ?

pop in gparted live cd and have a look
" fdisk -l " also will tell you


but seeing as 13.1 is past EOL and 13.2 WILL be soon

upgrade to 13.2 then in Feb or May of 2017 ( in about 6 months) upgrade AGAIN to 42.1 or 42.2
 
Old 09-12-2016, 02:35 AM   #3
roo47
LQ Newbie
 
Registered: Sep 2016
Location: Bowen Mountain/Australia
Distribution: openSUSE 13.1 for all desktop work
Posts: 6

Original Poster
Rep: Reputation: Disabled
Thanks for the quick reply. I was not aware of the 13.1 end of life issue - Guess I have to bite the bullet and upgrade. As far as misaligned sectors are concerned I will check with 'gparted' now.
 
Old 09-12-2016, 03:33 AM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,140

Rep: Reputation: 4123
Some thoughts/questions:
- if it worked ok 6 months ago, why would you think journalling is now the problem?
- tasks in "D" are generally victims (they're waiting for disk I/O completion), not the cause. The author you quote does not understand tasks in uninterruptible sleep - they certainly do not consume CPU.
- your loop is very crude, try installing and running latencytop.
- F/F is a pig, and has been worse recently. I ran some kernel function traces on ext4 and F/F was the predominant caller. Try the following (as root) and see if it has any benefit
Code:
echo 1500 | sudo tee /proc/sys/vm/dirty_writeback_centisecs
This will slow down the rate at which I/O is forced out to disk (use 500 to set it back to default).
- run smartctl on your disk - you may be able to get to it from "disk" or similar in openSUSE.
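
For reference, `dirty_writeback_centisecs` is expressed in hundredths of a second, so 1500 means the kernel flusher threads wake every 15 seconds instead of the default 5. A small sketch for reading the current value in seconds follows; the function name `writeback_seconds` and its optional file argument are made up here for illustration.

```shell
#!/bin/sh
# writeback_seconds: print the writeback wake-up interval in seconds.
# The optional argument (a file to read instead of /proc) exists only
# so the function can be tried without touching the live setting.
writeback_seconds() {
    file=${1:-/proc/sys/vm/dirty_writeback_centisecs}
    cs=$(cat "$file") || return 1
    printf '%d.%02d\n' $((cs / 100)) $((cs % 100))
}
```

A value written with `echo ... | tee` does not survive a reboot. To keep it, the usual route is a line `vm.dirty_writeback_centisecs = 1500` in a file under /etc/sysctl.d/ (the file name is your choice), assuming the distribution reads that directory at boot.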
 
Old 09-12-2016, 03:58 AM   #5
roo47
LQ Newbie
 
Registered: Sep 2016
Location: Bowen Mountain/Australia
Distribution: openSUSE 13.1 for all desktop work
Posts: 6

Original Poster
Rep: Reputation: Disabled
I have run gparted. Results are as follows:

/dev/sda5 326,387,712 / 4096 = 79,684.5 swap
/dev/sda6 334,778,368 / 4096 = 81,733 / root
/dev/sda7 385,884,160 / 4096 = 94,210 /home

apart from "swap" both Linux partitions are aligned on the 4096 boundary.
For what it is worth the two NTFS partitions for Windows are also aligned.
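
The division above can be double-checked with shell arithmetic. The sketch below assumes the numbers are partition start sectors of 512 bytes each, which the post does not state. Under that assumption a start is 4 KiB aligned when it is a multiple of 8 sectors, and 1 MiB aligned when it is a multiple of 2048. `check_alignment` is a hypothetical helper.

```shell
#!/bin/sh
# check_alignment: report whether a partition start sector is aligned.
# Assumption: the argument is a start sector counted in 512-byte units.
check_alignment() {
    start=$1
    if [ $((start % 2048)) -eq 0 ]; then
        echo "$start: 1 MiB aligned"
    elif [ $((start % 8)) -eq 0 ]; then
        echo "$start: 4 KiB aligned"
    else
        echo "$start: not aligned"
    fi
}

# The three start values reported by gparted above, commas removed.
for s in 326387712 334778368 385884160; do
    check_alignment "$s"
done
```

If the figures really are 512-byte sectors, all three are multiples of 2048, so the swap partition would be aligned as well. If they are byte offsets instead, dividing by 4096 is the right test and the swap result stands.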

My concern is: why did this 13.1 distribution run flawlessly (as far as delays are concerned) for 2 1/2 years and then deteriorate in the way described after the system reached its official "End of Life"? One cannot help feeling that something was planted to force users to change. But why? I thought this only happened with closed shop software.
 
Old 09-12-2016, 05:55 AM   #6
roo47
LQ Newbie
 
Registered: Sep 2016
Location: Bowen Mountain/Australia
Distribution: openSUSE 13.1 for all desktop work
Posts: 6

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by syg00
- if it worked ok 6 months ago, why would you think journalling is now the problem ?.
- tasks in "D" are generally victims (they're waiting for disk I/O completion), not cause.
Because there are real delays - the keyboard is fully blocked for up to 7 seconds - several times a minute. To my view as an engineer that is a sure sign of an uninterruptible sleep. The primary source of the problem is probably FireFox. In fact I switched off FireFox altogether for an hour while having dinner. There were only 80 D events in that hour. (Previously 205 in 3 1/2 minutes = 3,600/hour with FireFox)

Quote:
Try the following (as root) and see if it has any benefit
Code:
echo 1500 | sudo tee /proc/sys/vm/dirty_writeback_centisecs
This will slow down the rate at which I/O is forced out to disk (use 500 to set it back to default).
I have run your code snippet and then logged with my crude script for the last 20 minutes while editing this reply - it has only caused 24 D events = 72/hour. Definitely a vast improvement over 3,600/hour. Also no noticeable delays in typing. Thanks.
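
The hourly rates quoted here are simple scaling. A sketch of the arithmetic, with `events_per_hour` as a hypothetical helper that takes an event count and an observation time in seconds:

```shell
#!/bin/sh
# events_per_hour COUNT SECONDS: scale an observed event count to one hour
# (integer arithmetic, rounded down).
events_per_hour() {
    echo $(( $1 * 3600 / $2 ))
}

events_per_hour 24 1200   # 24 events in 20 minutes
events_per_hour 205 212   # 205 events in 3 min 32 s
```

The first call prints 72. The second prints 3481, a little under the rounded figure of 3,600/hour, which does not change the conclusion.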
 
Old 09-12-2016, 06:07 AM   #7
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,140

Rep: Reputation: 4123
I'd be worried about that disk - smartctl will confirm that or not.
Me, I'd get a full backup and a new disk. Just in case.
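
As a rough sketch of what such a check could look like: `smartctl -H /dev/sda` prints the drive's overall health verdict, `smartctl -A /dev/sda` prints the attribute table, and `smartctl -t long /dev/sda` starts the drive's built-in self-test, which only reads from the disk. The helper below, `smart_suspects`, is a hypothetical filter for the attribute table, and the device name is taken from the thread.

```shell
#!/bin/sh
# smart_suspects: filter `smartctl -A` output (on stdin) for the attributes
# that most often point at failing media, printing those with a non-zero
# raw value. Assumption: the usual 10-column ATA attribute table, where
# column 2 is the attribute name and column 10 the raw value.
smart_suspects() {
    awk '
        $2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ &&
        $10 + 0 > 0 { print $2, $10 }
    '
}
```

It would be used as `sudo smartctl -A /dev/sda | smart_suspects`. No output only means none of these three counters is raised; it is not proof that the disk is healthy, so the backup advice still applies.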
 
Old 09-12-2016, 05:17 PM   #8
roo47
LQ Newbie
 
Registered: Sep 2016
Location: Bowen Mountain/Australia
Distribution: openSUSE 13.1 for all desktop work
Posts: 6

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by syg00
I'd be worried about that disk - smartctl will confirm that or not.
Me, I'd get a full backup and a new disk. Just in case.
You are right - that's what I will do and report when I am done. Will take a few days. Thanks
 
Old 09-13-2016, 02:27 PM   #9
wagscat123
Member
 
Registered: Jan 2009
Location: Maryland-Pennsylvania border, USA
Distribution: openSUSE 15.2/15.3, Tumbleweed, Kubuntu 18.04/21.04, macOS 10.15, antiX 19, and Linux Mint 19.3
Posts: 860
Blog Entries: 45

Rep: Reputation: 120
Just a heads-up on 13.1's EOL: it is now supported by Evergreen, so I think you still receive updates, only from the Evergreen community rather than from SUSE. Just keep updating as you did before, although beware that Evergreen support will end in another 2 months.
 
Old 09-18-2016, 01:55 AM   #10
roo47
LQ Newbie
 
Registered: Sep 2016
Location: Bowen Mountain/Australia
Distribution: openSUSE 13.1 for all desktop work
Posts: 6

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by syg00
I'd be worried about that disk - smartctl will confirm that or not.
Me, I'd get a full backup and a new disk. Just in case.
Have installed openSUSE Leap 42.1 on a new 250 GB SSD, after backing up the 2 Linux partitions and the Windows partition as tarballs. Restored all important data from those 3 backups to Leap 42.1, abandoning Windows7 in the process. Leap 42.1 is running very smoothly. Ran my test script: while true; do ps auxf | grep D etc.
There was one D event in the next hour:
root 292 0.0 0.0 0 0 ? D 15:37 0:00 \_ [jbd2/sda2-8]
I guess one journalling event per hour is reasonable.

Last edited by roo47; 09-18-2016 at 01:59 AM.
 
  

