Linux - Software: This forum is for Software issues. Having a problem installing a new program? Want to know which application is best for the job? Post your question in this forum.
I'm having issues trying to download large files from a remote server. The downloads start hanging when they get to around 220K.
I had a Perl script using Net::FTP (which works fine on Windows), and figured there was some problem with Net::FTP, but I no longer think that's the issue, as I also tried FTP from the command line and get the same problem.
I then tried using curl, with its FTP option, to download one of the files, but same thing: it hangs at around 220K. Files smaller than that download with no problem at all, by any method (e.g. 150K files download fine every time).
Just for laughs I also set SELinux to permissive mode for the current session, (clutching at straws), but no difference. FYI, Perl is compiled with the large-file flag (can't recall the flag name off-hand, but I checked, and it is defined).
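For reference, this is roughly how I put SELinux into permissive mode for the session (needs root, and assumes an SELinux-enabled system):

```shell
# Switch SELinux to permissive for this boot only (needs root);
# 'setenforce 1' or a reboot restores enforcing mode.
setenforce 0
# Confirm the current mode.
getenforce
```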
So I'm left guessing that there is either some FTP restriction somewhere that I don't know of, or some system limit of some sort. I am baffled!
Any help sincerely appreciated, as I seem to have exhausted all the things I can think of on my own.
Last edited by cheddarcheese; 08-20-2011 at 08:09 PM.
Can you download a file from another server (say a distribution ISO) to confirm whether it's server-specific or a generic issue? When you do, please run the command through strace so we can see the last ten or so lines, which may include clues.
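For example, something along these lines (the mirror URL here is purely illustrative, not from the thread):

```shell
# Fetch a large file from a different server under strace;
# -f follows forks, -o writes the trace to a file.
strace -f -o curl.trace curl -O "http://mirror.example.com/pub/Fedora-DVD.iso"
# The last few lines of the trace are usually where the clue is.
tail -n 10 curl.trace
```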
Okay, I have provided the top, middle and tail of a trace below, from the curl command. I Ctrl-D'd it when the trace file started getting enormous (didn't take long), and when it appeared to no longer be doing anything. I also checked "ulimit -a" for file size limits, but it is set to unlimited. There also seems to be plenty of disk space. The only change I've made to the trace below is the site name/user-name/pwd:
Sure, I can do that, but the effect is more like being stuck in a loop than anything actually failing. That is, no matter how long I leave it, once the file gets to a certain size nothing more happens, whether I use command-line FTP, FTP from a Perl script (with a passive or active connection), or curl; it just appears to hang. Once it gets to around 220K, an ls -l from a different terminal shows that the size of the file no longer changes.
I have a good connection, and the first 220K takes very little time at all, so I think it's reasonable to assume the rest of the file would download just as quickly, but even waiting inordinately longer makes no difference. In the Perl script I had the fetch in an eval block that timed out after seven minutes, and the file size never changed after a certain point. That said, the final file size was never absolutely identical each time, but it was within a very few K.
Will try and do an FTP download from a completely different server later on today if I can.
Hmm. Interesting. You don't have any odd firewall rules loaded, right? Maybe start a packet capture before you try that site again, something like 'tcpdump -f -n -nn -N -p -ttt -vvv -i eth0 -w /path/to/file.pcap host data.somesite.com'. BTW, if you're going to strace again, using the "-v -s 10240 -o /path/to/logfile -e trace=!write" switches may provide more info, and the !write should keep it from logging all the writes. You might miss something interesting, but OTOH it should keep your log from growing huge.
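Cleaned up for copy-paste, with the ! quoted so the shell's history expansion doesn't mangle it (the interface name, host and file names are the same placeholders as above, and the curl URL is illustrative):

```shell
# Capture all traffic to/from the FTP host (run as root).
tcpdump -f -n -nn -N -p -ttt -vvv -i eth0 \
    -w /path/to/file.pcap host data.somesite.com

# In another terminal: trace the transfer, logging everything except write() calls.
strace -v -s 10240 -o /path/to/logfile -e 'trace=!write' \
    curl -O "ftp://data.somesite.com/somefile"
```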
BTW, LQ only accepts png, jpeg, jpg, gif, log, txt and pdf attachments, so you'd best obfuscate and attach your strace log as a plain text file and host or upload the pcap file elsewhere; else contact me by email to discuss dropping it off if it's too big.
Hmm, I am curious now, but a bit deflated: after trying to download some big distro files via FTP and having it work fine, I'm left thinking the problem must be something to do with the FTP server I'm downloading from. But I can't understand why it works perfectly on the Windows box, with the same IP/router etc., for the same server and the same files.
Just for interest, I left the Curl retrieval running (without tracing) to see how long it would take for something to happen. It eventually quit after about an hour+ with the following message:
curl: (56) Recv failure: Connection reset by peer
I can still do the tcpdump and strace again if you like, but would be interested, in the interim, if you can think of what might cause Linux and Windows to behave so differently.
Mind you, it occurs to me now that I also had this same/similar problem (ages ago), with the same type of downloads (from the same place), with 64-bit Vista (the same box I'm using now, before I Fedora-ized it), and to get FTP working for big files from this site I had to use FTP from a 32-bit XP box. I can't imagine why 64-bit would have anything to do with it, but maybe you or someone else knows much better than me.
Was doing some other research on that Curl error message, and came across something which mentioned tcp_window_scaling. However, /proc/sys/net/ipv4/tcp_window_scaling is already set to 1, which is apparently what I would want it to be. (if it's even relevant at all).
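For anyone wanting to check the same thing, the setting can be read like so (1 means window scaling is enabled, per RFC 1323):

```shell
# Read the TCP window scaling setting: 1 = enabled, 0 = disabled.
cat /proc/sys/net/ipv4/tcp_window_scaling
# Equivalent, via sysctl:
#   sysctl net.ipv4.tcp_window_scaling
```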
Thanks again for your time.
Some good news ... While I would still really like to know the technical cause of the problem, I seem to have found a workaround, by using Curl with the "--limit-rate" option set to 10K.
I can probably fiddle with the 10K to find an optimum rate, but I've now downloaded a couple of the huge files I was having a problem with (and I'm continuing to try the rest). It's presumably going to be slower than an unthrottled transfer would have been, but at least it means I can download everything I need to.
For anyone else who finds this post and wants to try curl with --limit-rate, below is how I used the command:
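A reconstruction along those lines; the host, credentials and file name here are placeholders, so substitute your own:

```shell
# --limit-rate 10K caps the transfer at roughly 10 KB/s;
# -O saves the file under its remote name.
curl --limit-rate 10K -O "ftp://username:password@ftp.example.com/pub/bigfile.zip"
```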
Thanks ever so much unSpawn for your suggestions - I learned several things courtesy of your ideas. (and if you do have any clue as to the actual technical reason then that would be really good to know).
Apparently the FTP server closed the connection (after trying to restart / resume?) and curl didn't handle it well (if you check the Red Hat / Fedora bug tracker you'll find some 2010 patches for libcurl to enable / fix retrying connections). Using wget with a bandwidth limit would actually have been my next suggestion. As for the technical side of things: if you had let tcpdump / wireshark run alongside your download, you'd have seen FTP messages that might be interesting. Anyway, good to see you found a usable workaround yourself.