LinuxQuestions.org

pghvlaans 09-01-2021 10:55 AM

WiFi loss upon resume?
 
Since the latest round of upgrades (Tue Aug 31 20:58:13 UTC 2021), my laptop (Intel wireless controller AX200) has been losing connectivity after closing the lid and re-opening. It's been running -current since January, and I haven't seen this issue before.

Basically, after opening the lid, a process called "dhcpcd [Network Proxy]" didn't reappear, and all subsequent pings returned "temporary name resolution failure." This happened both under X and in tty.

I tried a number of things, including a firmware rollback, using the standard kernel and rebuilding dhcpcd, but the only working solution was to remove the dhcpcd user and group. Now the connection seems to be behaving normally. Has anyone else had this issue since the etc package upgrade?

ctrlaltca 09-01-2021 11:09 AM

I am experiencing the same issue.

After a resume I see the following processes running:
Code:

dhcpcd    2993  0.0  0.0  3080  2580 ?        S    09:26  0:00 dhcpcd: wlan0 [ip4]
root      2998  0.0  0.0  3004  1968 ?        S    09:26  0:00 dhcpcd: [privileged actioneer] wlan0 [ip4]
dhcpcd    2999  0.0  0.0  2992  288 ?        S    09:26  0:00 dhcpcd: [control proxy] wlan0 [ip4]
dhcpcd  10199  0.0  0.0  3004  308 ?        S    18:02  0:00 dhcpcd: [BPF ARP] wlan0 192.168.1.104
dhcpcd  10200  0.0  0.0  3004  308 ?        S    18:02  0:00 dhcpcd: [BPF BOOTP] wlan0

Some of them were started at 09:26 (boot time), others at 18:02 (resume time).

The following messages are printed in syslog:
Code:

Sep  1 18:02:14 sprawl NetworkManager[1398]: <warn>  [1630512134.1552] device (wlan0): Activation: failed for connection 'offinf'
Sep  1 18:02:14 sprawl NetworkManager[1398]: <warn>  [1630512134.1587] dhcp-listener: dhcp-event: (pid 2993) unhandled DHCP event for interface wlan0
Sep  1 18:02:14 sprawl NetworkManager[1398]: <warn>  [1630512134.1621] dhcp-listener: dhcp-event: (pid 2993) unhandled DHCP event for interface wlan0
Sep  1 18:02:14 sprawl NetworkManager[1398]: <warn>  [1630512134.1679] dhcp-listener: dhcp-event: (pid 2993) unhandled DHCP event for interface wlan0
Sep  1 18:02:16 sprawl NetworkManager[1398]: <warn>  [1630512136.8853] dhcp-listener: dhcp-event: (pid 2993) unhandled DHCP event for interface wlan0
Sep  1 18:02:16 sprawl NetworkManager[1398]: <warn>  [1630512136.8905] dhcp-listener: dhcp-event: (pid 2993) unhandled DHCP event for interface wlan0
Sep  1 18:02:16 sprawl NetworkManager[1398]: <warn>  [1630512136.8956] dhcp-listener: dhcp-event: (pid 2993) unhandled DHCP event for interface wlan0
Sep  1 18:02:19 sprawl dhcpcd[2998]: ps_ctl_dispatch: cannot handle another client

EDIT: the last error seems to be caused by this line of code: https://github.com/NetworkConfigurat...control.c#L142 , in a file containing functions related to privilege separation control.

To get it working again, I have to run:

Code:

/etc/rc.d/rc.networkmanager stop
killall -9 dhcpcd
/etc/rc.d/rc.networkmanager start


marav 09-01-2021 11:25 AM

Quote:

Originally Posted by ctrlaltca (Post 6280581)
I am experiencing the same issue.

I also observe this behavior, but only in the Xwayland session.
There is no problem in the X11 session.

pghvlaans 09-01-2021 11:30 AM

Quote:

Originally Posted by ctrlaltca (Post 6280581)
To get it working again, I have to run:

Code:

/etc/rc.d/rc.networkmanager stop
killall -9 dhcpcd
/etc/rc.d/rc.networkmanager start


Ah, that seems to work. Thank you! I assume the addition of the new user was done for a reason, so I'll make that an alias for now.
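
For anyone who wants the same shortcut, here is a minimal sketch of the three commands wrapped in a shell function rather than an alias. The name nm_reset is made up for illustration, and it assumes you run it as root:

```shell
# Hypothetical helper (e.g. for root's ~/.bashrc); the name nm_reset is
# made up for this sketch. It runs the three commands quoted above, in order.
nm_reset() {
    /etc/rc.d/rc.networkmanager stop
    # Remove any dhcpcd instances left over from before the suspend.
    killall -9 dhcpcd
    /etc/rc.d/rc.networkmanager start
}
```

After sourcing the file, running nm_reset following a resume should do the same thing as typing the commands by hand.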

ctrlaltca 09-01-2021 11:36 AM

It seems that NetworkManager doesn't kill the old dhcpcd when suspending, and after resume it tries to start a new client, which fails because another instance of dhcpcd is already running.
Changing the DHCP client used by NetworkManager in /etc/NetworkManager/conf.d/00-dhcp-client.conf could work around this, but I suppose there's a root cause that needs a proper fix.
So it's not really solved yet.

EDIT: I can confirm that setting "dhcp=internal" in /etc/NetworkManager/conf.d/00-dhcp-client.conf is a workaround for the problem.
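
As a rough sketch of what that change looks like, the function below writes a two-line file with the dhcp key under the [main] section. The function name set_nm_dhcp_client is hypothetical, and note that it overwrites the target file, so any comments already in it are lost (editing the existing dhcp= line by hand works just as well):

```shell
# set_nm_dhcp_client is a hypothetical helper name, not part of NetworkManager.
# Usage: set_nm_dhcp_client CLIENT [FILE]
# FILE defaults to the path mentioned in the post; it is overwritten.
set_nm_dhcp_client() {
    nm_client="$1"
    nm_conf="${2:-/etc/NetworkManager/conf.d/00-dhcp-client.conf}"
    # Only accept the three client names discussed in this thread.
    case "$nm_client" in
        internal|dhcpcd|dhclient) ;;
        *) echo "unknown DHCP client: $nm_client" >&2; return 1 ;;
    esac
    # The dhcp key belongs to the [main] section of the configuration.
    printf '[main]\ndhcp=%s\n' "$nm_client" > "$nm_conf"
}
```

Run as root, "set_nm_dhcp_client internal" would write the default file; NetworkManager then has to be restarted before the new setting takes effect.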

pghvlaans 09-01-2021 12:17 PM

"dhcp=internal" is working here as well.

marnold 09-01-2021 01:55 PM

Phew! Glad I'm not the only one. I ended up rebooting to get my network back. Rather...inelegant.

Quote:

Originally Posted by ctrlaltca (Post 6280590)
EDIT: I can confirm that setting "dhcp=internal" in /etc/NetworkManager/conf.d/00-dhcp-client.conf is a workaround for the problem.

Glad there's a temporary workaround! What is that line actually instructing NetworkManager to do? Will that be A Bad Thing when a patch is released?

ctrlaltca 09-01-2021 02:24 PM

NetworkManager can use different methods to get an IP address via DHCP:
1. using the external program dhcpcd
2. using the external program dhclient
3. using an internal implementation in NetworkManager

The NetworkManager developers recommend option 3, but Slackware defaults to option 1 since it's the same DHCP client used by the rc init scripts.
Since the problem here seems to be the interaction between NetworkManager and dhcpcd, switching to the internal implementation avoids the problem.
If you switch to option 3, you probably won't notice any difference, even once the underlying problem is patched in Slackware.
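
If you want to check which of the three options a given config file selects, something like the following should do. It is only a rough sketch: the function name get_nm_dhcp_client is made up, and it simply prints the last uncommented dhcp= line it finds, ignoring section headers and whatever precedence rules NetworkManager itself applies between files:

```shell
# get_nm_dhcp_client is a hypothetical helper name for this sketch.
# Prints the value of the last uncommented "dhcp=" line in the given files.
# Assumption: section headers and NetworkManager's real precedence are ignored.
get_nm_dhcp_client() {
    awk -F= '
        /^[[:space:]]*dhcp[[:space:]]*=/ {
            gsub(/[[:space:]]/, "", $2)   # strip whitespace around the value
            value = $2
        }
        END { if (value != "") print value }
    ' "$@"
}
```

For example, "get_nm_dhcp_client /etc/NetworkManager/conf.d/00-dhcp-client.conf" would print the client name set in that file, or nothing if no dhcp= line is present.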

jostber 09-02-2021 03:41 AM

I had the same problem at the Aug 31 update. Thanks for the dhcp=internal solution.

alex14641 09-02-2021 05:41 AM

Just out of curiosity: when this problem occurs, do all the name servers in your /etc/resolv.conf have IPv6 addresses?

pghvlaans 09-02-2021 10:30 AM

Mine just disappeared, but I didn't have anything IPv6 to begin with.

Chuck56 09-02-2021 11:22 AM

Good thread! I think I have a related challenge. I have the same syslog error messages on a remote gateway running -current.

It started July 20th after a full run with slackpkg over an openvpn connection. Based on the logs it looks like I rebooted the gateway then a couple hours later NetworkManager lost connectivity with an error "wlan0: failed to renew DHCP". The interface cascades multiple errors then fails with a loss of network. I've been physically at the location a couple times & have to hard boot the gateway to regain functionality. The error process starts over again and is currently not responding remotely.

The gateway manages eth0 with an inet1.conf static IP. NM ignores eth0 using the unmanaged-devices setting in NetworkManager.conf. It also has an on demand tun0 when the openvpn connection is activated. Both of those interfaces seem to run as expected.

I'm going to try the dhcp=internal setting to see if that corrects the NM dhcp error messages and stabilizes wlan0 over the weekend.

Fingers crossed!

chrisretusn 09-03-2021 10:11 AM

Well, glad to hear I'm not the lone stranger. I had this issue with my laptop, which I hibernate. I tried the dhcp=internal setting, but it does not work for me. I still have to reboot or run:
Code:

/etc/rc.d/rc.networkmanager stop
killall -9 dhcpcd
/etc/rc.d/rc.networkmanager start


Chuck56 09-03-2021 04:17 PM

Quote:

Originally Posted by Chuck56 (Post 6280839)
...I'm going to try the dhcp=internal setting to see if that corrects the NM dhcp error messages and stabilizes wlan0 over the weekend...

Happy to report it's working for my remote gateway. I'll continue to monitor just in case.

brobr 09-05-2021 04:10 AM

Hi, I got this as well the other day; setting "dhcp=internal" does not seem to have changed anything.
After recompiling the NetworkManager modules from SBo that I had for various VPN methods, the wifi can connect again after resume on my box. Maybe I am not the only one who forgets to recompile such modules after an upgrade, even though it might be the obvious thing to do.

EDIT: red herring; sorry. Resume still cannot restore wifi, irrespective of the dhcp= setting.


All times are GMT -5.