LinuxQuestions.org


dlanced 06-30-2020 08:17 PM

Some (but not all) network hosts can't talk to each other
 
I'm having trouble with my local network. All the Linux hosts get DHCP addresses and connectivity. The problem is that they're not always able to communicate (SSH, ping) with each other. I've got two WiFi routers configured on channels far apart from each other. One router (my ISP's device) is at 192.168.1.1 and my TPLink AC1200 router is configured as an access point on the same network at 192.168.1.2.

Right now I've got a Raspberry Pi that was assigned 192.168.1.16 and has internet connectivity, but can't be seen or accessed from my primary workstation - even when it's connected through the same router (that would be the TPLink, at the moment). And I can't SSH from the Pi to my workstation. But my laptop - on the same network and logged in through the same router - can SSH in and pick the Pi up on nmap.

Another issue that I think may be related is that, for some reason, I'm not always able to refer to all the hosts by "hostname.local" - although they'll generally work using the IP address (except when they're "off" the network, of course).

Any ideas?
Thanks,

ferrari 06-30-2020 08:28 PM

Quote:

Right now I've got a Raspberry Pi that was assigned 192.168.1.16 and has internet connectivity, but can't be seen or accessed from my primary workstation
Can you ping the Pi host from the primary workstation successfully?
Code:

ping 192.168.1.16
With respect to Avahi...
Code:

avahi-browse -art
*You may need to install the avahi-utils package first.

Check for active firewalls on the host you're trying to reach.

dlanced 06-30-2020 08:36 PM

1 Attachment(s)
Quote:

Originally Posted by ferrari (Post 6139934)
Can you ping the Pi host from the primary workstation successfully?
Code:

ping 192.168.1.16

Nothing:
Code:

$ ping 192.168.1.16
PING 192.168.1.16 (192.168.1.16) 56(84) bytes of data.
From 192.168.1.13 icmp_seq=1 Destination Host Unreachable

Quote:

With respect to Avahi...
Code:

avahi-browse -art
*You may need to install the avahi-utils package first.
I've attached the avahi output: Attachment 33549
As you can see, this is a busy machine - lots of LXD bridges etc.

Quote:


Check for active firewalls on the host you're trying to reach.
Good point. I forgot to mention that there are no firewalls on the local network.
Thanks,

ferrari 06-30-2020 08:58 PM

Can you confirm the ip assignments on the pi at this time?
Code:

ip address
Code:

ip route

dlanced 06-30-2020 09:05 PM

Quote:

Originally Posted by ferrari (Post 6139946)
Can you confirm the ip assignments on the pi at this time?
Code:

ip address

Code:

$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
      valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
      valid_lft forever preferred_lft forever
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
    link/ether b8:27:eb:2e:69:db brd ff:ff:ff:ff:ff:ff
3: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether b8:27:eb:7b:3c:8e brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.16/24 brd 192.168.1.255 scope global dynamic noprefixroute wlan0
      valid_lft 85877sec preferred_lft 75077sec
    inet6 fe80::9181:da2:27d5:37e/64 scope link
      valid_lft forever preferred_lft forever

Quote:

Code:

ip route

Code:

$ ip route
default via 192.168.1.1 dev wlan0 proto dhcp src 192.168.1.16 metric 303
192.168.1.0/24 dev wlan0 proto dhcp scope link src 192.168.1.16 metric 303


ferrari 06-30-2020 09:13 PM

I know you said there was no firewall on any host....but just to be sure, run this from the Pi...
Code:

sudo iptables -S

ferrari 06-30-2020 09:17 PM

Regarding this comment...
Quote:

I've got two WiFi routers configured on channels far apart from each other. One router (my ISP's device) is at 192.168.1.1 and my TPLink AC1200 router is configured as an access point on the same network at 192.168.1.2.
Can you confirm that both hosts are connected to the same wifi AP?
Code:

iw dev wlan0 link
or via older iwconfig command...
Code:

iwconfig wlan0
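
To compare the two hosts, it is the BSSID (the AP radio's MAC address) that matters rather than the SSID, since both APs can share one network name. A minimal sketch, assuming `iw dev wlan0 link` prints a first line of the form `Connected to <bssid> (on wlan0)`; the helper name `bssid_from_iw_link` is made up for illustration:

```shell
#!/bin/sh
# Print the BSSID found in `iw dev <iface> link` output read from stdin.
# Assumption: a connected interface reports "Connected to aa:bb:cc:dd:ee:ff (on wlan0)".
# Prints nothing if the interface is not connected.
bssid_from_iw_link() {
    awk '/^Connected to/ { print tolower($3); exit }'
}

# Hypothetical usage, run on each host and compare the results:
#   iw dev wlan0 link | bssid_from_iw_link
```

A dual-band AP normally has a separate BSSID per band, so two different values do not always mean two different boxes; check them against the AP's status page.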

dlanced 06-30-2020 09:18 PM

Quote:

Originally Posted by ferrari (Post 6139948)
I know you said there was no firewall on any host....but just to be sure, run this from the Pi...
Code:

sudo iptables -S

This is a completely fresh install on the Pi:

Code:

$ sudo iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT

The truth is that the Pi is just a current example of the problem. I've been having similar issues with other hosts on this LAN.

ferrari 06-30-2020 09:25 PM

If you have hosts connected via different wifi APs this might present an issue (e.g. multicast traffic being blocked), although pinging should work with no issue. Some APs also have features that keep wifi client hosts from talking directly to each other.

Just in case this is applicable...
https://www.tp-link.com/us/support/faq/2089/

dlanced 06-30-2020 10:35 PM

Quote:

Originally Posted by ferrari (Post 6139956)
If you have hosts connected via different wifi APs this might present an issue (e.g. multicast traffic being blocked), although pinging should work with no issue. Some APs also have features that keep wifi client hosts from talking directly to each other.

Just in case this is applicable...
https://www.tp-link.com/us/support/faq/2089/

That is interesting. My TPLink model doesn't seem to have that function. But in any case, at least in my current setup, all three hosts (my workstation, my laptop, and the Pi) were attached to the same router - but only two were able to talk to each other.
Thanks,

dlanced 06-30-2020 11:00 PM

Just a quick update: A couple of hours later, the Pi suddenly became available for SSH from my workstation. This is exactly the unpredictable behavior I've been experiencing with other hosts. Is it possible that there's something buggy about the way ARP is caching my host addresses and, for some hosts, it's taking a very long time? Full disclosure: I don't know all that much about ARP, but I'd like to hear what people who do understand it think.
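
One way to test the ARP theory: in a ping, `Destination Host Unreachable` reported from the sender's own address (192.168.1.13 above) generally means the ARP request for the target went unanswered. The kernel's neighbour table shows this directly. A rough sketch, assuming `ip neigh show` prints one entry per line with the state as the last field; `neigh_state` is a hypothetical helper name:

```shell
#!/bin/sh
# Print the neighbour-cache state for one IPv4 address.
# Reads `ip neigh show` output on stdin; prints ABSENT if there is no entry.
# Assumption: each line starts with the IP and ends with the state,
# e.g. "192.168.1.16 dev wlan0 lladdr b8:27:eb:7b:3c:8e REACHABLE".
neigh_state() {
    awk -v ip="$1" '
        $1 == ip { print $NF; found = 1; exit }
        END      { if (!found) print "ABSENT" }
    '
}

# Hypothetical usage right after a failed ping:
#   ip neigh show | neigh_state 192.168.1.16
```

`FAILED` or `INCOMPLETE` would point at ARP requests or replies getting lost between the two hosts, rather than at anything on the SSH side.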

tinfoil3d 06-30-2020 11:06 PM

dlanced, as ferrari mentioned, I also strongly suspect that if both of your wifi APs have the same name and key, your systems might sometimes roam between them, which would cause them to split. Make sure the devices are connected to the same AP.

dlanced 06-30-2020 11:10 PM

Thanks. I have indeed seen some flipping between APs over the past months.
But all three of the devices I've been watching through the past six hours have been on the same AP. Besides that, why should they split even if they are on different APs? They're all on the same network.

tinfoil3d 06-30-2020 11:33 PM

If your tp-link isn't configured as a bridge then they'd be on the same network the same way my 192.168.1.2 is on the same network as you.

ferrari 07-01-2020 03:08 AM

Yes, I agree... focus your attention on the second AP device. I assume it is connected via ethernet? (Refer below link)
https://www.tp-link.com/en/support/faq/417/

WDS Bridging is described here
https://www.tp-link.com/us/support/faq/440/

dlanced 07-01-2020 09:12 AM

Quote:

Originally Posted by ferrari (Post 6140018)
Yes, I agree... focus your attention on the second AP device. I assume it is connected via ethernet? (Refer below link)
https://www.tp-link.com/en/support/faq/417/

WDS Bridging is described here
https://www.tp-link.com/us/support/faq/440/

The first thing I noticed on the first page you linked is that my AP's IP should be outside the DHCP range of the main router. So I limited that DHCP to 192.168.1.2 - 192.168.1.200 and gave my AP a static address of 192.168.1.201. So far everything seems to be working, although I'll keep a close eye on things through the rest of the day.
So far I'm optimistic. Thanks!
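
The range check described here can be sketched in shell. This is only an illustration using the addresses from this thread; `ip_to_int` and `ip_in_range` are hypothetical helper names, and the code assumes well-formed dotted-quad input without leading zeros:

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a single integer.
# The function body is a subshell so the IFS change stays local.
ip_to_int() (
    IFS=.
    set -- $1    # split "a.b.c.d" into $1..$4
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed (exit 0) if address $1 lies within the inclusive range $2..$3.
ip_in_range() (
    ip=$(ip_to_int "$1")
    lo=$(ip_to_int "$2")
    hi=$(ip_to_int "$3")
    [ "$ip" -ge "$lo" ] && [ "$ip" -le "$hi" ]
)

# Hypothetical usage:
#   ip_in_range 192.168.1.201 192.168.1.2 192.168.1.200 && echo "inside DHCP pool"
```

With the 192.168.1.2 - 192.168.1.200 pool, `ip_in_range 192.168.1.2 ...` succeeds while `ip_in_range 192.168.1.201 ...` fails, which is what you want for a static AP address.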

dlanced 07-01-2020 10:03 AM

Well, that didn't go well. All my computers are now happily connected to the routers and the LAN, but they've all lost internet connectivity. All, that is, except my laptop (which I'm using now).
What I can see is that all the other devices have been issued new IPs (192.168.1.101 and up) - except for this one. The ISP router has a proper public IP and it looks perfect. But no other device has actual internet access. Any ideas?
I'm afraid the IP of this laptop could be renewed any time which will kill this access, too. :)

michaelk 07-01-2020 10:21 AM

I assume the AP is hard wired to the main router?

As noted in the FAQ make sure the cable is plugged into a LAN port of the AP not WAN.

Make sure the AP's LAN address is configured correctly and is outside the DHCP range of the main router.

Make sure AP's DHCP server is disabled.

dlanced 07-01-2020 10:35 AM

Quote:

Originally Posted by michaelk (Post 6140170)
I assume the AP is hard wired to the main router?

It is.

Quote:

As noted in the FAQ make sure the cable is plugged into a LAN port of the AP not WAN.
Yup.

Quote:

Make sure the AP's LAN address is configured correctly and is outside the DHCP range of the main router.
It certainly was (although I've restored the original setting to see if it would give me back my internet access).

Quote:

Make sure AP's DHCP server is disabled.
Oh. Somehow DHCP was enabled on the AP. I have no clue how that happened...but it did happen.
Fixed now and everything looks good.
<Whew>
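
A second DHCP server on the LAN can be spotted by listing which servers answer a DHCP discover. A minimal sketch, assuming nmap's `broadcast-dhcp-discover` script is available and reports each offer with a `Server Identifier: <ip>` line (that output format is an assumption here); `dhcp_servers` is a hypothetical helper name:

```shell
#!/bin/sh
# List the distinct DHCP servers named in discover output read from stdin.
# Assumption: each offer contains a line like "Server Identifier: 192.168.1.1".
dhcp_servers() {
    sed -n 's/.*Server Identifier: *\([0-9.]*\).*/\1/p' | sort -u
}

# Hypothetical usage (the script typically needs root):
#   sudo nmap --script broadcast-dhcp-discover | dhcp_servers
```

More than one address in the output on a network like this one would suggest the AP is still handing out leases alongside the main router.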

ferrari 07-01-2020 02:22 PM

Glad that you found the issue. (It was mentioned in the guides as well).

dlanced 08-04-2020 09:46 AM

I'm afraid I'm now back to where I started! There were a few weeks where all my NAT hosts could happily talk to each other, regardless of which AP they were connected to. But now some just aren't showing up. My Raspberry Pi, for instance, just doesn't appear in nmap results from my main workstation, but is getting its old DHCP IP from the primary router and can at least sometimes be seen from other hosts. And yesterday, I could only access one of my local hosts after having SSH'd in to it from a third host, and then SSH'd from there to my workstation!
So the routers are wired properly, the AP has a static IP (192.168.1.201) outside the DHCP range, and everyone has internet access. I can't think of anything I did to change the network configuration in the meantime, but I'd appreciate if anyone has any more ideas!
Thanks,

michaelk 08-04-2020 11:59 AM

You have not changed any physical configurations or settings?

Are the missing hosts connected to AP or mixed wired and wireless?

dlanced 08-04-2020 12:15 PM

Quote:

Originally Posted by michaelk (Post 6152636)
You have not changed any physical configurations or settings?

I've been trying to remember, but I can't think of anything. I know I hadn't even logged into either of the routers for weeks. I did just move the ethernet cable from one LAN port to another LAN port on the AP, but it didn't help. That, by the way, was in honor of my very first major networking problem many, many years ago when it turned out that one of the ports on a cheap hub I was using was flaky. :)

Quote:

Are the missing hosts connected to AP or mixed wired and wireless?
Right now I can't access a host that's connected to the same WiFi AP as my workstation:
Code:

$ ssh 192.168.1.7
ssh: connect to host 192.168.1.7 port 22: No route to host


michaelk 08-04-2020 12:49 PM

There are several reasons you might see a no route to host message.

If you can ping the computer then you might not be connecting to the correct computer. Check the address in the router.
If the IP matches then ssh might not be running.
If ssh is running, are you using some port other than 22?
If ssh is running then it could be a firewall setting which is probably not it.
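
As a rough guide to the list above, the wording of the ssh error already narrows things down. `No route to host` is normally reported before sshd is ever contacted, whereas a host that is up but not running ssh usually answers with `Connection refused`. A small sketch of that mapping; `ssh_failure_hint` is a hypothetical helper and the hints are rules of thumb, not a diagnosis:

```shell
#!/bin/sh
# Map an ssh connection error message to a rough hint about where to look.
ssh_failure_hint() {
    case "$1" in
        *"No route to host"*)
            echo "host-unreachable: ARP lookup failed or an ICMP unreachable came back" ;;
        *"Connection refused"*)
            echo "port-closed: host answered, but nothing accepts connections on that port" ;;
        *"Connection timed out"*)
            echo "dropped: packets are being lost or silently filtered" ;;
        *)
            echo "unknown: no rule of thumb for this message" ;;
    esac
}

# Hypothetical usage:
#   ssh_failure_hint "$(ssh 192.168.1.7 true 2>&1)"
```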

dlanced 08-04-2020 12:52 PM

Quote:

Originally Posted by michaelk (Post 6152647)
There are several reasons you might see a no route to host message.

If you can ping the computer then you might not be connecting to the correct computer. Check the address in the router.
If the IP matches then ssh might not be running.
If ssh is running, are you using some port other than 22?
If ssh is running then it could be a firewall setting which is probably not it.

Thanks. The IPs are correct and SSH is running at default settings on all machines. Everything is tested. No firewalls.

dlanced 08-08-2020 09:58 PM

I think I tracked down the real problem causing me all this grief. I tried power cycling the TP-Link WiFi router and I got my full network back. I think something's falling over on the router over time that can be fixed with a reboot. I might contact TP-Link (it's still under warranty) to see what they have to say about this.
Sometimes it just comes down to bad hardware.
Thanks to everyone around here who offered help on this!


All times are GMT -5.