The behaviour of dm-crypt seems to depend on the machine type. Sometimes it is multithreaded, sometimes single-threaded, even with the same software.
I have three different PCs here: a ThinkPad X60, a T400 and an AMD PC. All are multicore machines (X60: Intel Core Duo, T400: Intel Core 2 Duo, AMD PC: Phenom II X6).
All are booted from the same Linux live image (via USB flash drive), and all access the same encrypted external hard drive (LUKS with aes-cbc-essiv:sha256).
I'm watching "top"/"htop" in one terminal while starting a sequential read in another one:
cat /dev/mapper/encrypteddisk > /dev/null
With the T400 there are ~2 active "kworker/n:m" kernel threads in the process list, where n and m are seemingly random digits. It seems there are 4 active threads running, but only two of them are heavily using the two CPU cores.
With the X60 and the AMD PC there's only one active "kworker/n:m" thread at a time, while n and m are constantly changing. So they are also multithreaded (there are indeed multiple "kworker/n:m" threads with low CPU load), but the multithreading doesn't seem to work: only one thread at a time is producing high CPU load.
Does anybody know why?
The expected behaviour would be 2 (X60) to 6 (AMD PC) active kworker threads, like with the T400.
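To put numbers on what top/htop shows, the per-core load during the read can be computed from two snapshots of /proc/stat. The following is only a rough sketch: the function name cpu_busy and the snapshot file names are made up for illustration, and iowait is counted as idle time.

```shell
#!/bin/sh
# cpu_busy BEFORE AFTER
# Print the busy percentage of each CPU between two saved copies of
# /proc/stat. Helper name and arguments are illustrative only.
cpu_busy() {
    awk '
        /^cpu[0-9]+ / {
            # Fields: cpuN user nice system idle iowait irq softirq steal ...
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            idle = $5 + $6                      # idle + iowait
            if (FNR == NR) {                    # first file: remember baseline
                t0[$1] = total; i0[$1] = idle
                next
            }
            dt = total - t0[$1]; di = idle - i0[$1]
            if (dt > 0) printf "%s %.0f\n", $1, 100 * (dt - di) / dt
        }
    ' "$1" "$2"
}

# Example use while the cat/pv read is running in another terminal:
#   cat /proc/stat > /tmp/stat.before; sleep 5; cat /proc/stat > /tmp/stat.after
#   cpu_busy /tmp/stat.before /tmp/stat.after
```

If decryption really runs in parallel, several cpuN lines should show high values at once; if it is serialized, the load should add up to roughly one core, even if it wanders between cores.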
With external USB 2.0 media there isn't much speed difference, but with an internal hard drive (esp. an SSD) the T400 is faster, even though the AMD PC has a more powerful CPU. The T400's CPU load rises to 200% (2 cores at 100% each) while the others remain at 100% (1 core at 100%), i.e. 100% total CPU load for the T400, 50% for the X60 and 17% for the AMD X6.
Btw., this affects only reading. Writing to the disk spawns multiple kworker threads with CPU load on all the machines.
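The total-load figures are simply "saturated cores divided by available cores". A tiny sketch of that arithmetic (the helper name core_share is made up for illustration):

```shell
#!/bin/sh
# core_share BUSY TOTAL
# Percentage of the whole machine in use when BUSY cores out of TOTAL
# are fully loaded. Illustrative helper, not a standard tool.
core_share() {
    awk -v b="$1" -v t="$2" 'BEGIN { printf "%.0f\n", 100 * b / t }'
}

core_share 2 2   # T400: both cores busy
core_share 1 2   # X60: one of two cores busy
core_share 1 6   # Phenom II X6: one of six cores busy
```

This prints 100, 50 and 17, matching the loads reported for the three machines.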
CBC encryption is limited to one thread per block, but every single filesystem block can be encrypted and decrypted in parallel. And btw., I said "this affects only reading" (i.e. decrypting). Writing (i.e. encrypting) is multithreaded, so parallelism across blocks seems to be in use.
And btw., I'm also using XTS on those systems, which TrueCrypt is able to encrypt and decrypt using multiple cores.
And I've used the same drive and software (Linux/dm-crypt) on all 3 PCs, and it is using multiple cores on one machine and a single core on the other two multicore machines. Seems odd.
Interestingly, after reactivating the second CPU core the speed doesn't fully recover:
#######
root@pc:~# echo 1 > /sys/devices/system/cpu/cpu1/online
root@pc:~# sync; echo 3 > /proc/sys/vm/drop_caches
root@pc:~# pv /dev/mapper/sda2_crypt > /dev/null
920MB 0:00:10 [95,6MB/s] [> ] 0% ETA 0:21:55
^C
#######
I wonder why it's still faster than with one single core, because htop/top shows only one active kworker and core.
This is with an old FSB-based dual core. A modern CPU with far more cores should produce even better results.
Nevertheless, it's far from doubling the performance when doubling the number of CPU cores.
With dm-crypt (kernel space) the X6 also gets ~100 MB/s; with TrueCrypt (userspace, but multithreaded) it's >400 MB/s!
It seems dm-crypt's implementation is less than ideal. The old, unloved TrueCrypt is superior in that case.
Yeah, could be a kernel issue. You didn't provide very much info about the systems, so I can't tell. Try newer kernels.
More threads does not always equal more performance. I was thinking that this may not be a CPU-bound process, but rather one bound by HDD throughput.
Btw: it's not a serious problem; I've lived with it ever since I started using encryption, and I've been using encrypted drives for many years now...
On the X6 the CPU load is ~18%, which is near 100/6 (not by chance). If I switch to the powersave governor (to limit the CPU clock) the CPU load is still ~18%, just a bit above one fully loaded core, but throughput decreases to ~30 MB/s. On the naked (unencrypted) device it's ~180 MB/s. So yes, it seems CPU-bound.
I think when I started this thread I was using Ubuntu 12.04 on the T400 and X6. I switched to the new Ubuntu LTS late in 2014, but I don't know exactly when. Right now it's Ubuntu 14.04 with kernel 3.13.0-45-generic #74-Ubuntu SMP. The X60 runs Arch Linux with its i686 LTS kernel (3.14 now and 3.12 in September?). Anyway: I've been watching this behaviour for many years now, so it is not limited to a single kernel version. While in the past dm-crypt was only single-threaded, today it seems to be multithreaded only under some rare conditions.
I also have a small embedded ARM board running Linux 3.18 with Debian 7 userland. It uses LUKS (aes, xts-plain, sha1) and also suffers from using only a single core for decryption (quad-core SoC).
The LUKS parameters for the tests on the PCs are various combinations of AES 128 or 256 with xts-plain64 or cbc-essiv mode and ripemd160 or sha1 hashes. I've copied (via dd and nc) the same small test partition to the internal drives of all three computers to compare them.
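When repeating the governor experiment, it helps to confirm which governor each core is actually running under before measuring throughput. A minimal sketch, assuming the usual cpufreq sysfs layout (cpuN/cpufreq/scaling_governor); the function name show_governors and its optional root argument are made up for illustration:

```shell
#!/bin/sh
# show_governors [SYSFS_CPU_ROOT]
# Print "cpuN governor" for every CPU that exposes a cpufreq governor.
# The root defaults to the usual sysfs location; the optional argument
# exists only so the helper can be pointed at a different directory.
show_governors() {
    root=${1:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue          # no cpufreq support or no match
        cpu=${f#"$root"/}                # strip the root prefix
        cpu=${cpu%%/*}                   # keep only the "cpuN" component
        printf '%s %s\n' "$cpu" "$(cat "$f")"
    done
}

# Example: show_governors
```

If throughput drops with the clock while the load stays at about one core's worth, that points to the single decrypting thread as the bottleneck, in line with the observation above.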
The TrueCrypt partition was created by TrueCrypt 7 with AES and default parameters (AFAIK also XTS mode) and is mounted in Linux via either TrueCrypt or:
$ dmsetup-tc /boot/TC-header1.img /dev/sdc3 | dmsetup create WinTC
$ mount /dev/mapper/WinTC /mnt/tmp
It's a USB/eSATA drive, so it's limited to USB 2.0 speed with the older ThinkPads. This is not relevant for the TrueCrypt vs. dm-crypt comparison on the X6 because it has eSATA.
The latter illustrates that it doesn't depend on encryption parameters or disk layout. TrueCrypt uses multiple cores while dm-crypt still uses a single one with the same disk and data partition. (I'm using pv [pipe viewer] on the block device to get a sequential read, but even with ntfs-3g it's CPU-limited by the decrypting kworker.)
To me it seems dm-crypt itself is multithreaded. There are several different kworkers involved. But instead of running in parallel as on the T400, they run serialized on the other machines. Maybe dm-crypt multithreading works only on modern Genuine Intel...
Thanks for your efforts, but maybe I should simply wait (further) for the problem to vanish spontaneously with future upgrades. ;-)
┌──────────────────────── Parallel crypto engine ─────────────────────────┐
│ CONFIG_CRYPTO_PCRYPT: │
│ │
│ This converts an arbitrary crypto algorithm into a parallel │
│ algorithm that executes in kernel threads. │
│ │
│ Symbol: CRYPTO_PCRYPT [=y] │
│ Type : tristate │
│ Prompt: Parallel crypto engine │
│ Location: │
│ -> Cryptographic API (CRYPTO [=y]) │
│ Defined at crypto/Kconfig:136 │
│ Depends on: CRYPTO [=y] && SMP [=y] │
│ Selects: PADATA [=y] && CRYPTO_MANAGER [=y] && CRYPTO_AEAD [=y] │
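Whether a given kernel was built with this option can be checked against its config file. A minimal sketch, assuming the distro installs the config as /boot/config-<release> (common on Debian/Ubuntu, not guaranteed elsewhere); the function name pcrypt_state is made up for illustration. Note that the option being enabled does not by itself show that dm-crypt makes use of it.

```shell
#!/bin/sh
# pcrypt_state [CONFIG_FILE]
# Report whether CONFIG_CRYPTO_PCRYPT is built in (=y) or a module (=m).
# The default path is an assumption about where the distro puts the config.
pcrypt_state() {
    cfg=${1:-/boot/config-$(uname -r)}
    if [ ! -r "$cfg" ]; then
        echo "cannot read $cfg" >&2
        return 1
    fi
    grep -E '^CONFIG_CRYPTO_PCRYPT=[ym]' "$cfg" \
        || echo "CONFIG_CRYPTO_PCRYPT is not set"
}

# Example: pcrypt_state
```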
If you tried different kernels and the problem persists, then the problem is unlikely to go away. It could have to do with the age of the processors. The newer Core 2 Duo threads better than the older Core Duo.
I would try to do some benchmarks locally on an internal drive. Just make a file and encrypt it with various programs, methods and algorithms and see if things change. Technically, newer cryptsetup installs have a benchmark option, but unfortunately it's not too reliable: it can give absurdly high numbers for some algorithms.