LinuxQuestions.org

coontie 02-23-2004 10:17 AM

Strange tcp/ip problem
 
Hi all. Here's my weird problem. I have 4 machines on the same hub: A, B, C, and D. A, B, and C are HP ProLiants and D is an IBM box. Telnetting among A, B, and C is fine, and from A, B, or C to D is fine too. However, telnetting from D to A or B takes an extremely long time: it hangs for about 15 seconds before you get the banner. From D to C it's OK, though! (A, B, and C are all built identically.)

D->A and D->B are problematic with ssh, ftp, and our custom-written app that uses TCP. ssh and ftp eventually recover and present a login prompt, but our custom app dies.

Any ideas about where to start troubleshooting this? I ran tcpdump port 23 on A and B, and I see the initial handshake happening and then more traffic, but nothing that points to a cause.

The A, B, and C machines run RH AS 2.1. D runs RH 7.3.

iptables is disabled everywhere, and there's nothing in /var/log/messages. I tried switching from the tg3 driver to bcm5700, but the problem persists. In fact, I hooked up a laptop directly to machine A with a crossover cable and the problem is still there. So it's not the hub or the cable.

Help.

bastard23 02-23-2004 10:23 AM

As a quick guess, I would look at your DNS configuration. Are you using any other name services (NIS, LDAP)? Do all the machines have a reverse lookup? Try doing a dump and look at the DNS traffic.

Good Luck,
chris

coontie 02-23-2004 10:39 AM

Well, nsswitch.conf reads "files dns nisplus". Are you saying I should take dns/nisplus out? But ALL the machines have that, including the one that works. In tcpdump I can see some weird ARP requests going out. Could ARP be the problem? How?

bastard23 02-23-2004 11:03 AM

Could be an ARP problem if somebody isn't responding... but I doubt it. Let's look at DNS first.

Are you using a DNS server or just the /etc/hosts file?
Do you use NIS? If not, take it out of nsswitch.conf.

Try running the command 'host <ipaddress>' for each of the IP addresses.

You should get something like this, where the IP address is reversed:

1.0.0.10.IN-ADDR.ARPA domain name pointer hostname.domainname

This will tell us if the reverse DNS is working for all the machines.
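For anyone who wants to script the same check, here is a minimal Python sketch. The function names `reverse_name` and `timed_reverse_lookup` are made up for illustration. Note one difference from `host`: `host` queries DNS servers directly, while `socket.gethostbyaddr` goes through the system resolver, so on a typical glibc setup it should follow the nsswitch.conf order and pick up /etc/hosts entries too.

```python
import ipaddress
import socket
import time


def reverse_name(ip):
    """Return the in-addr.arpa name that a reverse lookup asks for."""
    return ipaddress.ip_address(ip).reverse_pointer


def timed_reverse_lookup(ip):
    """Reverse-resolve ip via the system resolver.

    Returns (hostname or None, elapsed seconds). A long elapsed time with
    no hostname suggests the resolver is waiting on a source that never
    answers.
    """
    start = time.monotonic()
    try:
        name = socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        name = None
    return name, time.monotonic() - start
```

Running `timed_reverse_lookup` on A and B against D's address (and the other way round) would show whether one direction stalls for roughly the same 15 seconds as the telnet banner.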

What exactly is ARP doing? You should see a "Who has" request and then a response when you connect to the machine.

Good Luck,
chris

coontie 02-23-2004 11:06 AM

There's no DNS server and no NIS. I'll try the commands and let you know.

As for ARP: I see "who has... tell nn.nn.nn.nn" requests with no reply, but that's for some other IP altogether. I've no idea why it wants to know that during telnet.

bastard23 02-23-2004 11:50 AM

I would take the dns entry out of nsswitch.conf only if you are never going to use DNS on these machines, i.e. no internet. (If you do take it out, make a note of it in /etc/resolv.conf so you can fix it later.)
Verify that /etc/hosts has both the canonical and short names for every host (e.g. 10.0.0.1 hostname.domainname hostname).
Verify that /etc/hosts is the same on all the boxes.
Make sure files is the first entry on the hosts line of nsswitch.conf.
Do any of the machines have a DNS server in /etc/resolv.conf?
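The /etc/hosts and nsswitch.conf checks above can be roughed out in Python. This is only a sketch: the helper names are hypothetical, and treating "has a dot" as "canonical name" is a simplifying assumption, not a rule from the resolver.

```python
def parse_hosts(text):
    """Map each IP address to the list of names on its /etc/hosts lines."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        ip, *names = line.split()
        entries.setdefault(ip, []).extend(names)
    return entries


def missing_names(entries):
    """Return IPs that lack either a dotted (canonical) or a short name."""
    bad = []
    for ip, names in entries.items():
        has_fqdn = any("." in n for n in names)
        has_short = any("." not in n for n in names)
        if not (has_fqdn and has_short):
            bad.append(ip)
    return bad


def hosts_sources(nsswitch_text):
    """Return the lookup sources listed on the 'hosts:' line, in order."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line.startswith("hosts:"):
            return line.split(":", 1)[1].split()
    return []
```

Feeding the contents of /etc/hosts from each box through `parse_hosts` and comparing the resulting dictionaries is one way to confirm the files match; `hosts_sources(...)[0] == "files"` checks the lookup order.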

You may want to look at the inetd configuration for telnet. It probably uses tcpd. I'm not positive, but it may be trying to issue a DNS request, perhaps to a nonexistent machine. Maybe that is what the ARP request is for.
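One way to test that guess from the client side is to time the TCP connect separately from the wait for the first byte. If a wrapper on the server is doing a lookup before the service answers, the connect should be quick and the delay should show up in the second number. A minimal sketch, with `time_banner` being a made-up helper name:

```python
import socket
import time


def time_banner(host, port, timeout=30.0):
    """Return (connect_seconds, first_byte_seconds) for a TCP service.

    connect_seconds covers the three-way handshake; first_byte_seconds is
    how long the server then takes to send anything at all.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        connected = time.monotonic()
        sock.recv(1)  # blocks until the first byte arrives (or timeout)
        first_byte = time.monotonic()
    return connected - start, first_byte - connected
```

Run from D against port 23 on A, B, and C, this would show whether the stall sits after the handshake on A and B but not on C.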

I'm still guessing, but in my experience, most timeouts on a local network happen because of DNS.

Hope that helps,
chris


All times are GMT -5.