LinuxQuestions.org
Slackware This Forum is for the discussion of Slackware Linux.

Old 05-04-2018, 02:43 AM   #1
awabimakoto
LQ Newbie
 
Registered: Aug 2012
Distribution: slackware
Posts: 4

Rep: Reputation: 1
getsockopt() results in D state after upgrading to kernel 4.4.118


Hello Slackers and all LQ folks,

I use Slackware 14.2 and I am the maintainer of shadowsocks-libev on SlackBuilds.org. After upgrading to kernel 4.4.118, I found that the ss-redir command from the shadowsocks-libev package goes into uninterruptible sleep (D state) after the first connection request. ss-redir is a transparent proxy. I also compiled redsocks, which does a similar job, and the problem persists. Running ss-redir under strace gives the output below.
Code:
epoll_wait(3, [{EPOLLIN, {u32=5, u64=4294967301}}], 64, 59743) = 1
accept(5, NULL, NULL)                   = 7
getsockopt(7, SOL_IPV6, 0x50 /* IPV6_??? */, 0x7fff6a586170,
0x7fff6a58616c) = -1 EOPNOTSUPP (Operation not supported)
getsockopt(7, SOL_IP, 0x50 /* IP_??? */,
I referred to the source code of shadowsocks-libev. The hang is triggered at line 112 of the file https://github.com/shadowsocks/shado...78/src/redir.c

As the same version of ss-redir worked well on kernel 4.4.115, I think the problem is in the kernel. Whenever the code contains a call like
Code:
getsockopt(fd, SOL_IP, SO_ORIGINAL_DST)
the D state happens.
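
For reference, a minimal sketch of the kind of call a transparent proxy makes to recover the original (pre-REDIRECT) destination of an accepted connection. The helper name get_original_dst is hypothetical and is not taken from the shadowsocks-libev source. SO_ORIGINAL_DST normally comes from <linux/netfilter_ipv4.h>; the fallback define of 80 is an assumption made only to keep the sketch self-contained, and it matches the 0x50 shown in the strace output above.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Normally provided by <linux/netfilter_ipv4.h>. Defined here only so the
 * sketch compiles on its own; 80 == 0x50, the option number that strace
 * prints as IP_???. */
#ifndef SO_ORIGINAL_DST
#define SO_ORIGINAL_DST 80
#endif

/* SOL_IP has the same value as IPPROTO_IP; fall back if it is not exposed. */
#ifndef SOL_IP
#define SOL_IP IPPROTO_IP
#endif

/* Hypothetical helper: ask netfilter for the original IPv4 destination of
 * an accepted TCP connection. Returns 0 on success, -1 with errno set. */
static int get_original_dst(int fd, struct sockaddr_in *dst)
{
    socklen_t len = sizeof(*dst);

    memset(dst, 0, sizeof(*dst));
    return getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, dst, &len);
}
```

On an affected kernel it is this getsockopt() call that never returns, so the caller would block here right after accept().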

This bug has already been reported at:
https://bugzilla.kernel.org/show_bug.cgi?id=198791

It seems that the fix is the patch by Paolo Abeni <pabeni@redhat.com>, which was merged into kernel 4.4.119 (commit 482526ec0ad07de8cc6a4a2e9376057e83e118c9) and 4.15.7 (commit d7ef969797fdeeb12a3afe069d86d1eaf037ac71).

As the issue may affect any package that uses the getsockopt() function (openvpn maybe? I haven't tested), I wonder if the kernel in Slackware 14.2 could be upgraded to at least 4.4.119?
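
To check whether another daemon is hitting the same hang, one option is to read its state from /proc/<pid>/stat, where the third field is the state letter (D for uninterruptible sleep). The sketch below is only an illustration; the function names parse_stat_state and proc_state are hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Extract the state letter from the contents of /proc/<pid>/stat, which is
 * laid out as "pid (comm) state ...". The command name may itself contain
 * spaces or parentheses, so look for the last ')' instead of splitting on
 * spaces. Returns '\0' if the line does not have the expected shape. */
static char parse_stat_state(const char *line)
{
    const char *p = strrchr(line, ')');

    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return '\0';
    return p[2];
}

/* Read the state letter of a running process, or '\0' on any error. */
static char proc_state(pid_t pid)
{
    char path[64];
    char buf[512];
    char *ok;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return '\0';
    ok = fgets(buf, sizeof(buf), f);
    fclose(f);
    return ok != NULL ? parse_stat_state(buf) : '\0';
}
```

A result of 'D' from proc_state() for the proxy's PID right after the first client connects would match the behaviour described in this post.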
 
Old 05-05-2018, 07:31 PM   #2
FlinchX
Member
 
Registered: Nov 2017
Distribution: Slackware Linux
Posts: 225

Rep: Reputation: Disabled
Perhaps related https://www.linuxquestions.org/quest...or-4175624554/

Update: I'm getting a very similar strace log for tor, see the recent post in the link above

Last edited by FlinchX; 05-07-2018 at 03:14 PM.
 
Old 05-13-2018, 01:24 PM   #3
awabimakoto
LQ Newbie
 
Registered: Aug 2012
Distribution: slackware
Posts: 4

Original Poster
Rep: Reputation: 1
Thanks for the information, FlinchX. I used the nm command with the -D option to check the dynamic symbols of some binaries that do transparent proxying, and I found that ss-redir, tor, redsocks and openvpn all have the getsockopt symbol. I haven't actually tested openvpn, but I assume it will not work well with kernel 4.4.118.
I also found the issue reported on the shadowsocks-libev GitHub page.

https://github.com/shadowsocks/shado...ev/issues/1955

The problem exists in kernel 4.15.6 but is fixed in 4.15.7. Now I'm pretty sure the essential patch for this problem is the one below.
Quote:
Author: Paolo Abeni <pabeni@redhat.com>
Date: Thu Feb 8 12:19:00 2018 +0100

netfilter: drop outermost socket lock in getsockopt()

commit 01ea306f2ac2baff98d472da719193e738759d93 upstream.

The Syzbot reported a possible deadlock in the netfilter area caused by
rtnl lock, xt lock and socket lock being acquired with a different order
on different code paths, leading to the following backtrace:
Reviewed-by: Xin Long <lucien.xin@gmail.com>

======================================================
WARNING: possible circular locking dependency detected
4.15.0+ #301 Not tainted
------------------------------------------------------
syzkaller233489/4179 is trying to acquire lock:
(rtnl_mutex){+.+.}, at: [<0000000048e996fd>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74

but task is already holding lock:
(&xt[i].mutex){+.+.}, at: [<00000000328553a2>]
xt_find_table_lock+0x3e/0x3e0 net/netfilter/x_tables.c:1041

which lock already depends on the new lock.
===

Since commit 3f34cfae1230 ("netfilter: on sockopt() acquire sock lock
only in the required scope"), we already acquire the socket lock in
the innermost scope, where needed. In such commit I forgot to remove
the outer-most socket lock from the getsockopt() path, this commit
addresses the issues dropping it now.

v1 -> v2: fix bad subj, added relavant 'fixes' tag

Fixes: 22265a5c3c10 ("netfilter: xt_TEE: resolve oif using netdevice notifiers")
Fixes: 202f59afd441 ("netfilter: ipt_CLUSTERIP: do not hold dev")
Fixes: 3f34cfae1230 ("netfilter: on sockopt() acquire sock lock only in the required scope")
Reported-by: syzbot+ddde1c7b7ff7442d7f2d@syzkaller.appspotmail.com
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After inspecting the kernel changelogs, I think updating to at least 4.4.119, 4.14.23 or 4.15.7 will solve the problem. 4.15.x is EOL, so I suggest a minor kernel update (still 4.4.x) for Slackware 14.2 if -current is not going to become a release in the near future.
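
As a rough illustration of the version thresholds above, the sketch below compares a kernel release string against the first fixed release of each series named in this thread. The function name has_getsockopt_fix is hypothetical, and the table contains only the three versions mentioned here; any other series is reported as unknown instead of guessed.

```c
#include <stddef.h>
#include <stdio.h>

/* Returns 1 if the release is at or above the first fixed release of its
 * series, 0 if it is older than that release, and -1 if the series is not
 * one of those named in this thread or the string cannot be parsed.
 * Note that 0 only means "does not contain the fix": 4.4.115, for example,
 * predates the regression and is reported to work. */
static int has_getsockopt_fix(const char *release)
{
    static const struct {
        int major, minor, fixed_patch;
    } series[] = {
        {4, 4, 119},
        {4, 14, 23},
        {4, 15, 7},
    };
    int major, minor, patch;
    size_t i;

    if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) != 3)
        return -1;
    for (i = 0; i < sizeof(series) / sizeof(series[0]); i++) {
        if (series[i].major == major && series[i].minor == minor)
            return patch >= series[i].fixed_patch;
    }
    return -1;
}
```

The release string could come from the release field filled in by uname(2), which is the same value that uname -r prints.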
 
Old 05-23-2018, 03:31 AM   #4
awabimakoto
LQ Newbie
 
Registered: Aug 2012
Distribution: slackware
Posts: 4

Original Poster
Rep: Reputation: 1
Today the kernel was bumped to 4.4.132 and the problem is resolved.
 
  

