So, I'm back with this old issue.
First of all, thank you guys for your hints (they were useful as general information, but I still can't pinpoint the real issue because I don't understand enough about the kernel & filesystem).
Anyway, I did not find the root cause, but as of today I HOPE that I have found some kind of workaround/half-fix.
I'm therefore posting the information I have here and marking the thread as solved (I will immediately "un-solve" it if the issue starts occurring again).
If you're not interested in the details, my current fix/workaround is...
Code:
echo 10000 > /proc/sys/vm/dirty_writeback_centisecs
...which overrides the default value of 500 (the unit is centiseconds, so 5 seconds by default and 100 seconds with my value).
Why does this (hopefully) fix my issue? No clue.
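For anyone who wants to try the same thing, here is a minimal sketch for checking the current value and keeping the setting across reboots. The helper name "centisecs_to_seconds" and the file name "99-writeback.conf" are my own choices for illustration, and the two commented commands need root.

```shell
# Convert a centisecond value (the unit of vm.dirty_writeback_centisecs) to seconds.
centisecs_to_seconds() {
    # Integer part and two decimals, using shell arithmetic only.
    printf '%d.%02d\n' "$(( $1 / 100 ))" "$(( $1 % 100 ))"
}

# Print the current flusher wakeup interval, if the file is readable.
f=/proc/sys/vm/dirty_writeback_centisecs
if [ -r "$f" ]; then
    echo "current interval: $(centisecs_to_seconds "$(cat "$f")") s"
fi

# To apply and persist the value (as root; the .conf file name is an assumption):
#   sysctl -w vm.dirty_writeback_centisecs=10000
#   echo 'vm.dirty_writeback_centisecs = 10000' > /etc/sysctl.d/99-writeback.conf
```

The "echo ... > /proc/sys/..." form above only lasts until the next reboot, which is why the sysctl.d entry is listed as an option.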
Details:
1)
I have had this issue (high "system"-type CPU usage by "kworker" processes) for several months, on both my primary Asus notebook (SSD) and my new secondary Dell XPS 13 notebook (NVMe).
2)
I became aware of the issue because I always have gkrellm running and showing CPU usage (and disk/WLAN activity, temperature, etc.) => I saw frequent spikes (once every 10-300 seconds or so), each lasting between ~0.2 and ~5 seconds.
3)
Reproducing the problem involves a certain degree of "luck": with the exact same, mostly idle, workload, everything occasionally worked perfectly for a while, but most of the time the problem persisted. To increase the chances of the problem occurring I usually start Firefox + Vivaldi (or Chrome) + kdevelop. All these apps write a few bytes to disk at regular intervals even when there is no activity (which is in my opinion stupid, as it drains the notebooks' battery - I tried but haven't been able to disable it).
Important: I couldn't reproduce the problem by just writing stuff to disk (e.g. with "dd" or by copying files around - actually, when I did that the problem vanished for a while) => maybe the problem has more to do with appending to files and/or overwriting them, or something similar?
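If somebody wants to test the "small periodic appends" guess without starting the browsers, something like the following might do. This is only a sketch of the idea, not something I have verified to trigger the spikes; the function name and the log path in the usage line are made up.

```shell
# Append one short line to FILE, COUNT times, sleeping INTERVAL seconds
# between writes. This mimics an app that keeps writing a few bytes while idle.
append_small_writes() {
    file=$1; count=$2; interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        printf 'tick %s\n' "$i" >> "$file"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, "append_small_writes /tmp/writeback-test.log 600 1" would append one line per second for about 10 minutes. The file has to live on the filesystem under test, so /tmp is only suitable if it is not a tmpfs.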
4)
Whenever the high CPU usage occurs, "iotop -o -b | grep -i kworker" shows something like this...
Code:
TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
...
3739 be/4 root 0.00 B/s 0.00 B/s 0.00 % 4.08 % [kworker/1:0]
925 be/4 root 0.00 B/s 0.00 B/s 0.00 % 3.50 % [kworker/3:1]
2572 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.54 % [kworker/0:2]
2881 be/4 root 0.00 B/s 0.00 B/s 0.00 % 4.50 % [kworker/2:0]
2572 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.87 % [kworker/0:2]
925 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.54 % [kworker/3:1]
3165 be/4 root 0.00 B/s 9.18 G/s 0.00 % 0.00 % [kworker/u8:2]
2881 be/4 root 0.00 B/s 0.00 B/s 0.00 % 4.03 % [kworker/2:0]
3739 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.46 % [kworker/1:0]
2572 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.44 % [kworker/0:2]
3165 be/4 root 0.00 B/s 10.37 G/s 0.00 % 0.00 % [kworker/u8:2]
2881 be/4 root 0.00 B/s 0.00 B/s 0.00 % 2.88 % [kworker/2:0]
925 be/4 root 0.00 B/s 0.00 B/s 0.00 % 2.18 % [kworker/3:1]
3739 be/4 root 0.00 B/s 0.00 B/s 0.00 % 1.67 % [kworker/1:0]
2572 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.53 % [kworker/0:2]
3165 be/4 root 0.00 B/s 3.62 G/s 0.00 % 0.00 % [kworker/u8:2]
925 be/4 root 0.00 B/s 0.00 B/s 0.00 % 3.88 % [kworker/3:1]
3739 be/4 root 0.00 B/s 0.00 B/s 0.00 % 2.22 % [kworker/1:0]
2881 be/4 root 0.00 B/s 0.00 B/s 0.00 % 1.20 % [kworker/2:0]
...
(the above output covers 3 CPU-usage spikes)
(the "real" amount of data written to disk isn't the GBs reported above but 0 or just a few KBs - kind of random, but in any case always very, very little)
...and "perf record -g -a sleep 3" + "perf report" showed this (NOTE: this is NOT a profile of the "iotop" output above - it's from another spike, but I basically always got the same information about writeback etc.):
Code:
# Children Self Command Shared Object Symbol
# ........ ........ .............. ................................... ............................................
#
8.06% 8.06% kworker/u16:0 [kernel.vmlinux] [k] radix_tree_next_chunk
|
---radix_tree_next_chunk
6.96% 6.96% kworker/u16:0 [kernel.vmlinux] [k] writeback_sb_inodes
|
---writeback_sb_inodes
6.81% 6.81% kworker/u16:0 [kernel.vmlinux] [k] _raw_spin_lock
|
---_raw_spin_lock
5.86% 5.86% kworker/u16:0 [kernel.vmlinux] [k] write_cache_pages
|
---write_cache_pages
5.72% 5.72% kworker/u16:0 [kernel.vmlinux] [k] dec_zone_page_state
|
---dec_zone_page_state
5.25% 5.25% kworker/u16:0 [kernel.vmlinux] [k] clear_page_dirty_for_io
|
---clear_page_dirty_for_io
4.72% 4.72% kworker/u16:0 [kernel.vmlinux] [k] __mark_inode_dirty
|
---__mark_inode_dirty
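Since nearly all of the symbols above are in the writeback path, one cheap way to cross-check how much data is actually dirty during a spike is to sample the "Dirty" and "Writeback" counters in /proc/meminfo. A small sketch (the function name is my own, and I did not use this during the tests described here):

```shell
# Print the Dirty and Writeback counters (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo when no file is given.
dirty_counters() {
    awk '/^(Dirty|Writeback):/ { printf "%s %s kB\n", $1, $2 }' "${1:-/proc/meminfo}"
}
```

Running "while sleep 1; do dirty_counters; done" in a terminal next to gkrellm should show whether the spikes line up with only a handful of dirty kB.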
5)
Because of the above output & test behaviour I ended up focusing on the filesystem.
As I am using "nilfs2" as rootfs (I don't want to stop using it because I really like its data checksums and its continuous snapshots, which let me recover stuff I delete by mistake), I tried different versions of its userland tools (between 2.1.5-r1 and 2.2.2, which I'm using now), different garbage-collection settings (even though the problem also occurred when GC was neither running nor active) and both the "discard" and "nodiscard" mount options.
I additionally tried other things: multiple kernel versions (4.1, 4.3 and 4.6 if I remember correctly), both the "CONFIG_NO_HZ_IDLE" and "CONFIG_NO_HZ_FULL" timer configurations, fully upgrading my Gentoo OS twice, power-saving options fully on and fully off, etc.
6)
Today, after executing "echo 10000 > /proc/sys/vm/dirty_writeback_centisecs", everything became quiet: I had 1 small CPU spike ~45 minutes ago and that's it.
There was no particular reason why I chose "10000" - it was impulsive.
There was likewise no particular reason why I decided to fiddle around with "dirty_writeback_centisecs" - again impulsive, indirectly prompted by the "perf" output above listing stuff related to dirty pages & writeback...

Cheers