Insanely weird issue trying to access bugs.gentoo.org from my NATed system [SOLVED]
For some reason, I can't access bugs.gentoo.org and packages.gentoo.org (I'm sure there are other pages, I just haven't found them yet). The connection times out. However, using links on my server, the page comes up just fine.
Pinging the site works just fine as well, so DNS is working and the site appears to be online. Here's the tracepath output from my workstation:

valexia ~ # tracepath bugs.gentoo.org
 1: infinte.lan (10.2.0.13) 0.238ms pmtu 1500
 1: 10.2.0.1 (10.2.0.1) 1.293ms
 2: no reply
 3: no reply
 4: no reply
 5: no reply
 6: no reply
 7: no reply
 8: 1-8-ftth.onsnetstudenten.nl (145.120.8.1) asymm 9 2.676ms
 9: v2-11-1.1522.XSR01.Amsterdam1a.surf.net (145.145.12.173) asymm 11 9.957ms
10: FT-500.XSR03.Amsterdam1A.surf.net (145.145.80.34) asymm 12 7.027ms
11: surfnet.rt1.ams.nl.geant2.net (62.40.124.157) asymm 12 6.367ms
12: so-7-0-0.rt1.nyc.us.geant2.net (62.40.112.134) asymm 14 89.587ms
13: 198.32.11.50 (198.32.11.50) asymm 15 347.233ms
14: chinng-nycmng.abilene.ucaid.edu (198.32.8.82) asymm 16 110.234ms
15: iplsng-chinng.abilene.ucaid.edu (198.32.8.77) asymm 17 113.470ms
16: kscyng-iplsng.abilene.ucaid.edu (198.32.8.81) asymm 18 125.935ms
17: dnvrng-kscyng.abilene.ucaid.edu (198.32.8.13) asymm 19 149.538ms
18: snvang-dnvrng.abilene.ucaid.edu (198.32.8.1) asymm 20 158.322ms
19: pos-1-0.core0.eug.oregon-gigapop.net (198.32.163.17) asymm 20 170.503ms
20: nero.eug.oregon-gigapop.net (198.32.163.151) asymm 22 171.237ms
21: eugn-core1-gw.nero.net (207.98.64.168) asymm 23 171.063ms
22: corv-car1-gw.nero.net (207.98.64.6) asymm 24 176.050ms
23: no reply
24: no reply
25: no reply
26: no reply
27: no reply
28: no reply
29: no reply
30: no reply
31: no reply
Too many hops: pmtu 1500
Resume: pmtu 1500

From my Linksys WRT that does the NAT:

root@distribution_node_2:~# traceroute bugs.gentoo.org
traceroute to bugs.gentoo.org (140.211.166.163), 30 hops max, 40 byte packets
 1 * * *
 2 * * *
 3 * * *
 4 * * *
 5 * * *
 6 * * *
 7 1-8-ftth.onsnetstudenten.nl (145.120.8.1) 4.231 ms 4.147 ms 3.978 ms
 8 v2-11-1.1522.xsr01.amsterdam1a.surf.net (145.145.12.173) 5.361 ms 4.961 ms 6.697 ms
 9 FT-500.XSR03.Amsterdam1A.surf.net (145.145.80.34) 5.837 ms 5.344 ms 5.545 ms
10 surfnet.rt1.ams.nl.geant2.net (62.40.124.157) 5.814 ms 5.388 ms 5.265 ms
11 so-7-0-0.rt1.nyc.us.geant2.net (62.40.112.134) 88.692 ms 88.277 ms 88.324 ms
12 198.32.11.50 (198.32.11.50) 395.063 ms * 378.821 ms
13 chinng-nycmng.abilene.ucaid.edu (198.32.8.82) 108.683 ms 108.416 ms 108.761 ms
14 iplsng-chinng.abilene.ucaid.edu (198.32.8.77) 124.37 ms 112.213 ms 112.384 ms
15 kscyng-iplsng.abilene.ucaid.edu (198.32.8.81) 129.075 ms 125.164 ms 121.459 ms
16 dnvrng-kscyng.abilene.ucaid.edu (198.32.8.13) 136.565 ms 143.985 ms 145.959 ms
17 snvang-dnvrng.abilene.ucaid.edu (198.32.8.1) 166.378 ms 157.729 ms 157.022 ms
18 pos-1-0.core0.eug.oregon-gigapop.net (198.32.163.17) 169.809 ms 169.029 ms 169.134 ms
19 nero.eug.oregon-gigapop.net (198.32.163.151) 170.403 ms 169.213 ms 169.422 ms
20 eugn-core1-gw.nero.net (207.98.64.168) 169.818 ms 170.109 ms 169.393 ms
21 corv-car1-gw.nero.net (207.98.64.6) 170.448 ms 170.234 ms 170.17 ms
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *

Note: The server I used to 'links' the page is a different system, not the Linksys WRT. I didn't find an ipkg for links or lynx for OpenWrt.
Note2: The infinite.lan part might be one place where it goes wrong, but as far as I know, setting the hostname and adding 127.0.0.1 <hostname> to /etc/hosts should be enough.
Note3: My 'external' IP is in the 145.120.13.* range, so my gateway is in a different subnet, but I don't think that should matter ...
As a reference, here's the tracepath for www.gentoo.org (which *does* work) from my workstation:

valexia ~ # tracepath www.gentoo.org
 1: infinte.lan (10.2.0.13) 0.249ms pmtu 1500
 1: 10.2.0.1 (10.2.0.1) 2.778ms
 2: no reply
 3: no reply
 4: no reply
 5: no reply
 6: no reply
 7: no reply
 8: 1-8-ftth.onsnetstudenten.nl (145.120.8.1) asymm 9 12.406ms
 9: v2-11-1.1522.XSR01.Amsterdam1a.surf.net (145.145.12.173) asymm 11 9.970ms
10: FT-500.XSR03.Amsterdam1A.surf.net (145.145.80.34) asymm 12 7.200ms
11: gsr12416.ams.he.net (195.69.145.150) asymm 12 6.192ms
12: pos0-0.gsr12416.lon.he.net (216.66.24.157) 14.590ms
13: pos4-1.gsr12416.nyc.he.net (216.218.200.101) asymm 14 84.128ms
14: pos7-0.gsr12012.sjc.he.net (216.218.254.153) asymm 15 163.219ms
15: he.sjc.vrix.net (216.218.196.26) asymm 16 163.808ms
16: 209.177.159.21 (209.177.159.21) asymm 21 159.191ms
17: no reply
18: no reply
19: no reply
20: no reply
21: no reply
22: no reply
23: no reply
24: no reply
25: no reply
26: no reply
27: no reply
28: no reply
29: no reply
30: no reply
31: no reply
Too many hops: pmtu 1500
Resume: pmtu 1500 |
If you are performing a traceroute ... your first hop should be your gateway.
You might want to describe how the internal network is set up in more detail. |
What do you mean your "gateway is on a different subnet"? By definition your gateway must be on the same subnet as your network interface.
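
As a rough illustration of the same-subnet point, here is a minimal Python sketch using the standard ipaddress module. The helper name gateway_on_link and the prefix lengths (/24, /20) are assumptions made for the example; the thread never states the actual netmasks.

```python
import ipaddress

def gateway_on_link(interface_cidr, gateway):
    """Return True if the gateway address lies inside the interface's subnet.

    interface_cidr is an address with prefix length, e.g. "10.2.0.13/24".
    The prefix lengths used below are assumed, not taken from the thread.
    """
    iface = ipaddress.ip_interface(interface_cidr)
    return ipaddress.ip_address(gateway) in iface.network

# LAN side: workstation and router share 10.2.0.0/24 (assumed /24).
print(gateway_on_link("10.2.0.13/24", "10.2.0.1"))

# WAN side: whether 145.120.8.1 is "on link" depends entirely on the netmask.
print(gateway_on_link("145.120.13.188/24", "145.120.8.1"))
print(gateway_on_link("145.120.13.188/20", "145.120.8.1"))
```

With a /24 mask the address 145.120.8.1 is off-link, while with a wider mask such as /20 it falls inside the same subnet, so the netmask reported by ifconfig or route is the thing to check before concluding anything.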
|
Quote:
My network layout is as follows: Linksys WRT54GS, running OpenWrt WhiteRussian RC4, I believe. It's been running this for more than a year; the problem described is from the last 72 hours (give or take).

The WRT54GS, distribution_node_2 (I'll refer to it as wrt), has one WAN port and 4 LAN ports. These 6 ports are divided into 2 VLANs, vlan0 and vlan1. vlan0 consists of the WAN port + the LAN_1 and LAN_2 ports. Thus, I get an external (direct) IP on the LAN_1 and LAN_2 ports, and the wrt (vlan0) and enterprise (LAN_1/eth0) each get their own unique IP. vlan1 consists of eth1 (the wifi part) + the LAN_3 and LAN_4 ports. The following image describes almost the same VLAN setup (the original setup): [image: WRT54G]

A plain (gigabit) switch is connected on LAN_4. The switch connects all my (NATed) systems:
My laptop annika, either via wifi or gigabit on the switch (vlan1) [nat].
My desktop valexia, via gigabit on the switch (vlan1) [nat].
My Freevo PC erin, via gigabit on the switch (vlan1) [nat].
My server (again) enterprise (eth1), via gigabit on the switch (vlan1) [nat].

The first 3 can't access bugs.gentoo.org or packages.gentoo.org (but forums.gentoo.org and www.gentoo.org are OK). It appears to be something in my router; I can't imagine all 3 PCs have a messed up ... something. |
Quote:
After some digging (which was sloppy the first time, I suppose), route does indeed print the proper gateway on all my machines. The reason I mentioned a 'different subnet' for the gateway was:

 1: infinte.lan (10.2.0.13) 0.238ms pmtu 1500
 1: 10.2.0.1 (10.2.0.1) 1.293ms
 2: no reply
 3: no reply
 4: no reply
 5: no reply
 6: no reply
 7: no reply
 8: 1-8-ftth.onsnetstudenten.nl (145.120.8.1) asymm 9 2.676ms

Now step 2 is most likely my gateway, 145.120.13.1, as shown by route. Somehow it goes through a couple of no replies to hop 8. I wrongly assumed that hops 2 - 7 were causing the problem or ... I dunno. You know what they say, assumption is the mother of all *ups, right? |
tracer[ou]t[e]/tracepath/ping are increasingly irrelevant for trouble-shooting since so many routers filter ICMP (either the echo-reqs, or the time-exceededs). They're pretty much only useful on local networks to confirm that your packets are going to the correct internal router/firewall.
If you can resolve the names from any workstation then it cannot be a DNS problem. Try running # tcpdump host <ip of gentoo site> on one of the affected workstations while you fire up your browser and attempt to access that site. See if there are indeed packets going to the correct IP on the correct port. Try running tcpdump on the external interface of your linksys as well to see if the return packets are coming back as expected. At first glance it appears that it might be some sort of subnetting problem resulting in triangular routing. The fact that the sites you can reach are on different subnets than the sites you can't reach tends to point to this. |
Quote:
On the 10.0 internal LAN on my wrt, I'm also running an openvpn server (bridged with my other internal LAN stuff). The thing is, ICMP should go through just fine, on both enterprise and the openvpn clients (though I hardly think that that is the issue). I ran tcpdump and got some binary gibberish; I loaded it into wireshark, and it could make sense of it. But first I've got to disconnect/stop stuff to get a clearer result; I got some ssh packets I don't want, as well as some other packets. Also I'm going to (ab)use my openvpn to get an IP from one of my clients; that way, I'll be browsing the net via them. (Which works just fine, I've done it a few times, sometimes by accident.) |
You can do tcpdump port not 22 to get rid of ssh traffic.
The VPN raises my suspicions even more about triangular routing. Is it possible that you have a customer using the same subnet as the gentoo sites on their internal network, i.e. that they're improperly advertising routes to an internet-routable block over the VPN? |
Quote:
So I stopped the openvpn server on my wrt and disconnected the LAN connection from my switch to enterprise, so that my desktop was the only one on the wrt, and the wrt didn't know anything about anybody else and the internet. Still no go. I then re-ran wireshark/tcpdump on this naked configuration.

valexia (desktop)
No. Time Source Destination Protocol Info
1 0.000000 10.2.0.13 10.2.0.1 DNS Standard query AAAA bugs.gentoo.org
2 0.001676 10.2.0.1 10.2.0.13 DNS Standard query response
3 0.006257 10.2.0.13 10.2.0.1 DNS Standard query AAAA bugs.gentoo.org.lan
4 0.007697 10.2.0.1 10.2.0.13 DNS Standard query response, No such name
5 0.007737 10.2.0.13 10.2.0.1 DNS Standard query A bugs.gentoo.org
6 0.009274 10.2.0.1 10.2.0.13 DNS Standard query response A 140.211.166.163
7 0.009567 10.2.0.13 140.211.166.163 TCP 35026 > http [SYN] Seq=0 Len=0 MSS=1460 TSV=10251116 TSER=0 WS=6
8 3.008602 10.2.0.13 140.211.166.163 TCP 35026 > http [SYN] Seq=0 Len=0 MSS=1460 TSV=10254116 TSER=0 WS=6
9 9.007428 10.2.0.13 140.211.166.163 TCP 35026 > http [SYN] Seq=0 Len=0 MSS=1460 TSV=10260116 TSER=0 WS=6
10 14.006435 AsustekC_68:af:b9 Cisco-Li_83:c1:05 ARP Who has 10.2.0.1? Tell 10.2.0.13
11 14.006873 Cisco-Li_83:c1:05 AsustekC_68:af:b9 ARP 10.2.0.1 is at 00:13:10:83:c1:05
12 21.005070 10.2.0.13 140.211.166.163 TCP 35026 > http [SYN] Seq=0 Len=0 MSS=1460 TSV=10272116 TSER=0 WS=6
13 45.000363 10.2.0.13 140.211.166.163 TCP 35026 > http [SYN] Seq=0 Len=0 MSS=1460 TSV=10296116 TSER=0 WS=6
14 49.999372 AsustekC_68:af:b9 Cisco-Li_83:c1:05 ARP Who has 10.2.0.1? Tell 10.2.0.13
15 49.999811 Cisco-Li_83:c1:05 AsustekC_68:af:b9 ARP 10.2.0.1 is at 00:13:10:83:c1:05

On the WRT (msg 3 and 4 are incoming UDP vpn requests to re-establish the link with the tunnel, so they should be ignored, I think):
No. Time Source Destination Protocol Info
1 0.000000 145.120.13.xxx 140.211.166.163 TCP 40834 > http [SYN] Seq=0 Len=0 MSS=1460 TSV=10988228 TSER=0 WS=6
2 2.999068 145.120.13.xxx 140.211.166.163 TCP 40834 > http [SYN] Seq=0 Len=0 MSS=1460 TSV=10991228 TSER=0 WS=6
3 4.589583 82.134.233.xxx 145.120.13.xxx UDP Source port: 1265 Destination port: openvpn
4 4.590026 145.120.13.xxx 82.134.233.xxx ICMP Destination unreachable (Port unreachable)
5 8.998241 145.120.13.xxx 140.211.166.163 TCP 40834 > http [SYN] Seq=0 Len=0 MSS=1460 TSV=10997228 TSER=0 WS=6

I did reboot the wrt a few times, just not for this last test, so the DNS answer probably comes from the built-in cache, which is why the query isn't showing up. |
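
One way to read the desktop capture above: every TCP packet is an outgoing SYN and nothing ever comes back from 140.211.166.163. The following Python sketch makes that explicit by computing the gaps between the SYN timestamps copied from the capture. The function names are hypothetical helpers written for this illustration, and reading the doubling gaps as ordinary SYN retransmission backoff is an interpretation, not something stated in the thread.

```python
SERVER = "140.211.166.163"

# (time, source, destination) for the TCP rows of the desktop capture.
tcp_packets = [
    (0.009567, "10.2.0.13", SERVER),
    (3.008602, "10.2.0.13", SERVER),
    (9.007428, "10.2.0.13", SERVER),
    (21.005070, "10.2.0.13", SERVER),
    (45.000363, "10.2.0.13", SERVER),
]

def replies_from(packets, server):
    """Return the packets whose source is the server (none means no answer)."""
    return [p for p in packets if p[1] == server]

def retransmit_gaps(packets):
    """Return the gaps, rounded to whole seconds, between consecutive packets."""
    times = [p[0] for p in packets]
    return [round(later - earlier) for earlier, later in zip(times, times[1:])]

def gaps_double(gaps):
    """Return True if every gap is twice the previous one."""
    return all(b == 2 * a for a, b in zip(gaps, gaps[1:]))

print(replies_from(tcp_packets, SERVER))   # []
gaps = retransmit_gaps(tcp_packets)
print(gaps)                                # [3, 6, 12, 24]
print(gaps_double(gaps))                   # True
```

Gaps of 3, 6, 12 and 24 seconds with no SYN-ACK in between suggest the client is simply retrying a connection attempt that never gets a response, rather than receiving a reset or an ICMP error.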
Chort, you're not giving up on me, are you?
It gets better. I have put everything back as I had it; it is all connected again. I launched a virtual machine with bridged networking, using the latest Ubuntu 6.10 install CD. While running the installer I 'accidentally' started Firefox and went browsing a bit. After a few minutes I checked bugs.gentoo.org, which worked. Going back to my own desktop, no go. Turns out the VM got an IP from one of the VPN points. No biggie, as all that happens is I go on the internet from that IP. So I forced an IP in my own subnet, with my own AP as gateway, and bugs.gentoo.org didn't work anymore. The stupid thing remains: I can ping it, and get replies from the correct IP address ... |
So either that particular host has a firewall that's dropping your http traffic as some kind of blacklist, or perhaps you have a proxy of some sort that is not correctly handling headers from that host? If you can ping it, but you can't make an http connection, it's nothing to do with routing. It's something farther up the network stack.
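
To separate "host answers ping" from "host accepts TCP on port 80", a single connect attempt with a timeout is enough. Below is a minimal Python sketch; probe_tcp is a hypothetical helper name, and the 5 second default timeout is an arbitrary choice for the example.

```python
import socket

def probe_tcp(host, port, timeout=5.0):
    """Try one TCP connect and classify the outcome.

    Returns "open" if the handshake completes, "refused" if the peer
    answers with a reset, "timeout" if nothing comes back in time,
    and "error: ..." for anything else (for example a DNS failure).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
```

Calling probe_tcp("bugs.gentoo.org", 80) from an affected workstation and again from a machine that can reach the site would show whether the failure is a silent drop ("timeout") or an active rejection ("refused"). A silent drop is what a filter somewhere on the path or at the destination would typically produce.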
|
If it's blacklisting, a change of IP should fix that, I suppose. As for a proxy: the VPN endpoint I did get to use is running the same settings as my wrt (they are both WRTs, running OpenWrt) ...
Gonna try via telnet to get some info from port 80 on that host. Gotta refresh my memory on that first, it's been a while :) |
You weren't getting any packets back from your HTTP request, at least your packet captures didn't see any. I don't know what interface you were running the captures on. Presumably if you run the capture on the external interface of your firewall you should be able to see the return packets even if your own internal networking/routing/proxy/whatever is screwed up.
Er, my point is doing a telnet to port 80 is going to result in packets disappearing and never coming back, just like when you used the web browser. |
Well, the packet dump came from the external NIC (vlan1, the one with the IP anyway) on my WRT. Now, I can't run links or lynx on my wrt (no package that I could find). So .. it's like the last thing I can try before changing the IP (which is gonna be a pita, I think).
So I got some new things to try, cause this is very frustrating. |
Solved.
Well, not really, but it now works for me. I guess 145.120.13.188 is blocked for some reason. Thing is, I haven't even had that IP for that long (my ISP rotates IPs and does some weird magic because they think it'll help against spam; don't even ask, people agree it's silly). So I took a different IP, and it is all working happily for me again. |