I have a perplexing problem for which I would be very grateful for some assistance.
We are a small predominantly Windows site with a few Linux machines (Red Hat ES 3) and about 70 Windows Servers (mixture of physical and virtual).
When connecting TO a Linux machine via FTP, authentication takes about 40 seconds (very slow). The username and password prompts come up instantly. I put in the password, press enter, and then it hangs for approximately 40 seconds before it eventually connects fine. Transferring files via FTP, once connected, is fast. It's just the authentication that is slow.
To help the diagnosis, I set up a simple shell script executed 24x7 every 5 minutes by cron, which connects to another linux box via FTP. It records the elapsed time to authenticate.
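For anyone wanting to reproduce the timing probe described above, here is a minimal sketch, not the original poster's script. The function names, the FTP_USER and FTP_PASS variables, and the log path are all made up for illustration, and it assumes the stock command-line ftp client, which accepts -n to suppress auto-login.

```shell
# Run a command and print how many whole seconds it took.
# The command's own output and exit status are discarded.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1 || true
    end=$(date +%s)
    echo $(( end - start ))
}

# Hypothetical probe: log in to the host given as $1 and quit straight away.
# FTP_USER and FTP_PASS are placeholder variables, not from this thread.
ftp_login_probe() {
    ftp -n "$1" <<EOF
user $FTP_USER $FTP_PASS
bye
EOF
}

# Example line for a script run from cron every 5 minutes:
#   echo "$(date '+%F %T') $(elapsed_seconds ftp_login_probe fred)" >> /tmp/ftp_timing.log
```

Because the measurement has one-second resolution, it can tell a 40 second hang from an instant login but not much finer than that.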
I have established the following facts:
(1) Authenticating via FTP from Windows to a Windows machine is always instantaneous.
(2) Authenticating via FTP from Windows to a Linux machine is almost always very slow.
(3) Authenticating via FTP from Linux to a Linux machine is almost always very slow.
(4) We have occasional periods, each lasting between 10 and 15 minutes, where authenticating via FTP from Linux to a Linux machine is instantaneous. There is no apparent pattern to when these periods occur, but they always last 10 to 15 minutes. For example, one day we had 7 of the "fast" periods, another day just 2.
(5) We have 3 Linux machines. When we have one of these "fast" periods, it is fast for all the Linux machines. (2 of the Linux machines are physical, one is virtual).
So there we have it. The evidence suggests to me that the problem is caused by something external to the Linux machines because all the Linux machines seem to be affected (fast or slow) at the same time. We have a proxy server, DNS server and firewalls here, but I'm not a network administrator so I'm not well up on those things.
Any ideas gratefully received - this is one weird problem where Google has failed to help me!
Many Thanks!
Three possibilities:
- tcpdump to inspect traffic involved in your exchange
- More verbose debugging when executing the command.
- Inspect transactions using strace
You might get more information by using -d debug in your ftp command. At least it would show where you are hanging, or if any exceptional steps are involved.
wfh@wisp:~$ ftp -d ftp.smartftp.com
Connected to smartftp.com.
220 SmartFTP Server ready...
...etc.
Another *MUCH* noisier option is to run ftp under 'strace'. Be patient, press <ctrl>+c when it "pauses", then cut-n-paste the screen output (as opposed to logging) and examine what happens after you type your password and hit <enter>.
There will probably be something in there which is non-standard, or you will get an idea of what's happening during the long delay.
Thanks to wfh for the suggestions. Tcpdump is going to take a bit of getting to grips with (which I will do next), but in the meantime I have tried the "ftp -d" and "strace" suggestions. These both confirm that the thing hangs after the password is sent for authentication, and there don't seem to be any extra clues on there, at least so it seems to me.
Here are the interesting bits (I have globally edited hostnames and passwords for security reasons) - I will post any other clues from tcpdump later, when I can work out how to use it....
$ ftp -d fred
Connected to fred.int.bongo.co.uk.
220 fred.int.bongo.co.uk FTP server (Version 5.60) ready.
---> AUTH GSSAPI
334 Using authentication type GSSAPI; ADAT must follow
GSSAPI accepted as authentication type
Trying to authenticate to <ftp@fred.int.bongo.co.uk>
calling gss_init_sec_context
Trying to authenticate to <host@fred.int.bongo.co.uk>
calling gss_init_sec_context
GSSAPI error major: Miscellaneous failure
GSSAPI error minor: No credentials cache found
GSSAPI error: initializing context
GSSAPI authentication failed
---> AUTH KERBEROS_V4
334 Using authentication type KERBEROS_V4; ADAT must follow
KERBEROS_V4 accepted as authentication type
Kerberos V4 krb_mk_req failed: You have no tickets cached
Name (fred:dsdteam): root
---> USER root
331 Password required for root.
Password:
---> PASS XXXX
<< It hangs here for approx 40 seconds >>
230 User root logged in.
---> SYST
215 UNIX Type: L8
Remote system type is UNIX.
Using binary mode to transfer files.
ftp>
########
This is the strace output either side of the hang....
Things like this are very often caused by a DNS timeout when the FTP server is trying to resolve the connecting IP address back to a hostname. Worth checking that out.
I agree, niloc. Perhaps a PTR record would speed things up. Although, why would this work *SOME* of the time if the root cause was DNS?
Not sure, but the fact that the Windows machines can always connect instantly to each other suggests that the problem is with just the Linux machines doing reverse DNS lookups.
I'd suggest logging into one of the Linux boxes that runs an FTP server and doing:
host 123.123.123.123
where 123.123.123.123 is the IP address of one of the connecting IP addresses. I suspect the command will timeout or take a while. Try it a few times.
Obviously not a fix, but to prove a point it might be worth putting an entry for a connecting IP address and hostname in to /etc/hosts on a Linux FTP server and then trying again. The corresponding client should then connect instantly.
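If you try the /etc/hosts test, it helps to confirm the entry really is in the file and maps to the names you expect before drawing conclusions. The helper below is only a sketch: the function name and the 10.0.0.5 address in the usage comment are invented for illustration.

```shell
# Print the names that a hosts-format file maps to the given IP address.
# Whole-line and trailing comments are ignored.
hosts_names_for() {
    ip=$1
    file=${2:-/etc/hosts}
    awk -v ip="$ip" '
        { sub(/#.*/, "") }                                 # strip comments
        $1 == ip { for (i = 2; i <= NF; i++) print $i }    # one name per line
    ' "$file"
}

# Example, run on the FTP server (10.0.0.5 stands in for a client address):
#   hosts_names_for 10.0.0.5
#   time host 10.0.0.5      # repeat a few times and compare the timings
```

An empty result means the file has no entry for that address, so the lookup would still fall through to DNS.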
Let's assume that it is a reverse lookup issue. If computerbongo edits the Windows 'hosts' file (possibly at %SystemRoot%\system32\drivers\etc\), adding the name and IP of the Linux machines, the problem might go away. Then it's a matter of fixing DNS...adding a PTR for each Linux box.
Hello! Based on the suggestions above (from nilocj.d), and also some other ideas, I have tried the following...
(1) host 123.123.123.123 (the IP address of one of the connecting IP addresses) - the command completes instantly and comes back with the correct IP address.
(2) I chose a pair of Linux machines, and made sure that both /etc/hosts files had entries for both hostnames. - FTP authentication is still slow.
(3) ftp localhost - FTP authentication is still slow.
(4) ftp fred (where I am signed on to machine fred) - FTP authentication is still slow.
(5) ftp 127.0.0.1 - FTP authentication is still slow.
(6) I have put an "nslookup" into the 24x7 FTP shell script, immediately before and after the ftp connection. This has shown that it is always using the same DNS server to do the lookup, during the "fast" and "slow" FTP authentication periods.
Thanks wfh. I have used tcpdump and wireshark to try to work out what is happening. As far as I can tell, I have established the following facts:
(A) The delay is occurring on the server end, not the client end. Just in case I've got my terminology wrong, by this I mean if I am invoking the FTP application on host FRED and establishing a connection to remote host DELORES, then the delay is happening on DELORES.
(B) On the remote server (DELORES in the example) I see about 30 DNS packets being sent to the primary DNS server. The "info" column in the Wireshark display shows about 3 different types, all with similar descriptions:
[1] Standard query A kerberos.example.com
[2] Standard query AAAA kerberos.example.com
[3] Standard query AAAA kerberos.example.com.bongo.co.uk (where bongo.co.uk is our domain name, changed here for security reasons).
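The same picture can be had without Wireshark by summarising tcpdump's text output. This is a rough sketch that assumes tcpdump prints each DNS question as a "TYPE?" token followed by the name (for example "A? kerberos.example.com."); the exact line format differs between tcpdump versions, and the function name and the eth0 interface are placeholders.

```shell
# Summarise DNS questions from tcpdump's text output on stdin.
# Finds the first "TYPE?" token on each line (e.g. "A?", "AAAA?") and the
# name after it, then prints "count TYPE name", most frequent first.
dns_query_summary() {
    awk '
        {
            for (i = 1; i < NF; i++)
                if ($i ~ /^[A-Z]+\?$/) {
                    print substr($i, 1, length($i) - 1), $(i + 1)
                    break
                }
        }
    ' | sort | uniq -c | sort -rn
}

# Example (needs root; -c 50 stops the capture after 50 packets):
#   tcpdump -n -i eth0 -c 50 port 53 | dns_query_summary
```

Start the capture on the FTP server, then log in from the client while it runs. Names that are repeated many times during the hang are the ones to look at.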
Now I don't know anything about kerberos, except that on the Linux machines it hasn't been intentionally set up. But the "kerberos.example.com" suggests an unconfigured kerberos setup (kerberos.example.com appears in the /etc/krb5.conf file which apparently is one of the places you configure kerberos).
I have stopped at this point as I don't want to go changing stuff to do with kerberos as I don't know what I'm doing, and I don't want to risk locking myself out of the machine by meddling with authentication from a position of ignorance.
Any ideas? Does kerberos need to be disabled in some way?
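A read-only check changes nothing and so carries no risk of a lockout: just list where the example.com placeholder still appears in the Kerberos config. This is a small sketch; the function name is made up, and the default path is the /etc/krb5.conf mentioned above.

```shell
# List lines (with line numbers) in a krb5.conf-style file that still
# mention the example.com placeholder. Exit status is non-zero when no
# line matches, as with plain grep.
krb5_placeholder_lines() {
    grep -n -i 'example\.com' "${1:-/etc/krb5.conf}"
}

# Example:
#   krb5_placeholder_lines                  # checks /etc/krb5.conf
#   krb5_placeholder_lines /path/to/copy    # or a copy of the file
```

Any hits suggest the file was never edited for a real realm, which fits the lookups for kerberos.example.com seen in the capture.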
Wow - I think it's finally solved!
I had set up all the machines to run gssftp via xinetd. I chose gssftp for no particular reason at the time. Apparently gssftp is "a kerberized xinetd-based FTP daemon which does not pass authentication information over the network". The big clue being the word "kerberized". I have now disabled gssftp on the 2 test servers, and have set up vsftpd to run as a service via xinetd instead of gssftp. This has solved the problem. Connecting by FTP is now fast all the time. So I just need to roll it out to the live machines now.
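Before rolling the change out, it can be worth checking which FTP service xinetd is actually set to run on each box. The sketch below assumes the usual layout of one file per service under /etc/xinetd.d with a "disable = yes" or "disable = no" line; the function name is invented, and the file names on your system may differ.

```shell
# Print the value of the "disable" attribute in an xinetd service file,
# or "unset" if the file has no such line. Commented-out lines are ignored
# because their key does not equal "disable".
xinetd_disable_value() {
    awk -F= '
        { key = $1; gsub(/[ \t]/, "", key) }
        key == "disable" { val = $2; gsub(/[ \t]/, "", val); found = val }
        END { print (found == "" ? "unset" : found) }
    ' "$1"
}

# Example (paths are assumptions based on the usual xinetd layout):
#   xinetd_disable_value /etc/xinetd.d/gssftp    # hoping for "yes"
#   xinetd_disable_value /etc/xinetd.d/vsftpd    # hoping for "no"
```

Remember that xinetd only picks up changes to these files after it has been reloaded or restarted.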
There's a noticeable difference in the messages that appear when I initiate the ftp connection from the client. Here's what I get when gssftp is running as the server:
# ftp fred
Connected to fred.bongo.co.uk.
220 fred.bongo.co.uk FTP server (Version 5.60) ready.
334 Using authentication type GSSAPI; ADAT must follow
GSSAPI accepted as authentication type
GSSAPI error major: Miscellaneous failure
GSSAPI error minor: No credentials cache found
GSSAPI error: initializing context
GSSAPI authentication failed
334 Using authentication type KERBEROS_V4; ADAT must follow
KERBEROS_V4 accepted as authentication type
Kerberos V4 krb_mk_req failed: You have no tickets cached
Name (fred:root):
And here's what I get when vsftpd is running as the server:
# ftp dolores
Connected to dolores.bongo.co.uk.
220 (vsFTPd 2.0.1)
530 Please login with USER and PASS.
530 Please login with USER and PASS.
KERBEROS_V4 rejected as an authentication type
Name (dolores:root):
So hopefully that's it. The step that led me to discover kerberos being involved was using tcpdump and wireshark to notice the packets being sent to a non-existent kerberos server throughout the delay period. Many thanks to wfh who provided the tcpdump suggestion that led to the breakthrough!
Ooops, I didn't mean to say packets were being sent to a non-existent kerberos server. I should have said packets were being sent to the primary DNS server and the contents of the packets quoted a non-existent DNS server.
Still, it was *YOU* who ran with the suggestion! Good detective work. Good analysis of the results!
Just comment out all the entries in /etc/resolv.conf and try again. Hope it works.
I got the same problem. After removing the resolv.conf entries, it works fine.