Old 10-01-2008, 07:43 AM   #1
Sheridan
Member
 
Registered: Aug 2007
Location: Hungary
Distribution: Fedora, CentOS
Posts: 89

Rep: Reputation: 21
Effective FTP client for mass-download for Linux?


Hello folks,

I hope you can help me - I've been looking for a solution for a long time, but nothing I found has done me much good...

I have to transfer a _lot_ of files of various sizes from a remote server on the other side of the world (literally), which has limited bandwidth, an unreliable connection, etc., and the only possible method of file transfer is passive-mode FTP (nothing else can be used).

The server's root FTP folder holds about 14,000 subdirectories, each of which contains lots of small files, some big ones, and some that cannot be downloaded at all (permission issues).

The owner of the server provided access to our company through FTP only and is unwilling/unable to provide anything more useful.

I need an FTP client that meets at least the following requirements:

1) Is not bothered by an unreliable connection, a stupid server (one that drops the connection or logs you out seemingly at random), frozen transfers, etc.
2) Can download several (10+) files in parallel (for efficiency)
3) Handles passive mode
4) Can "auto-skip" files with screwed-up names ("?" in the filename, etc.), ignore permission errors, and so on

It can be either command-line or graphical.

What would you suggest? (I have tried wget, mc and the ftp CLI so far, but it was a very disappointing experience.)

Any comment is greatly appreciated.

Levente
 
Old 10-01-2008, 08:39 AM   #2
indienick
Senior Member
 
Registered: Dec 2005
Location: London, ON, Canada
Distribution: Arch, Ubuntu, Slackware, OpenBSD, FreeBSD
Posts: 1,853

Rep: Reputation: 65
Well, I use gFTP when I'm in an X session. Whenever I build a TGZ package from a SlackBuild, I back it up onto the FTP server at my website provider. I usually do this once every two weeks or so, and gFTP prompts me to overwrite or skip the transfer of a file if it already exists at the remote location (or the local location, depending on the direction of the transfer).

Thankfully, it presents the overwrite/skip choice as a single list of the files in question, instead of prompting interactively one by one.

If you're looking for something from the command line, I use (almost exclusively) LFTP, but I have never messed around with situations where "if the file exists on the far end, and it is of the same size and modification date, skip it".

I don't know about "stupid server" allowances or about automatic re-connection after a dropped connection with either client, though.
 
Old 10-01-2008, 08:57 AM   #3
ilikejam
Senior Member
 
Registered: Aug 2003
Location: Glasgow
Distribution: Fedora / Solaris
Posts: 3,109

Rep: Reputation: 96
Hi.

wget does everything you need except parallel downloads. It should retry after lost connections (up to 20 times by default), it escapes weird characters in filenames (so it will still download the files - you shouldn't get any breakage from illegal characters in a filename), and it does passive FTP.
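Something along these lines should cover the basics - a sketch only, with a hypothetical host, credentials and log path rather than anything from this thread:

Code:
# Sketch: mirror a remote FTP tree with passive mode, retries and a log file
# ftp.example.net, USER/PASS and the log path are placeholders
wget --mirror --passive-ftp --tries=20 \
     --ftp-user=USER --ftp-password=PASS \
     -o /home/me/wget.log \
     "ftp://ftp.example.net/bigdirectory/"
Passive FTP is wget's default for FTP anyway, so --passive-ftp just makes it explicit.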

Dave
 
Old 10-01-2008, 08:57 AM   #4
theYinYeti
Senior Member
 
Registered: Jul 2004
Location: France
Distribution: Arch Linux
Posts: 1,897

Rep: Reputation: 61
I would suggest “lftp”.
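For the use case above, a hedged sketch of an lftp mirror run (the host, credentials and directories are hypothetical placeholders, not from the thread):

Code:
# Sketch only - USER/PASS, ftp.example.net and the paths are placeholders
# Mirror the remote tree with 10 parallel transfers, resuming partial files
lftp -u USER,PASS \
     -e "set ftp:passive-mode on; set net:max-retries 20; \
         mirror --continue --parallel=10 /bigdirectory /home/me/bigdirectory; quit" \
     ftp.example.net
lftp's mirror should keep going past individual files it cannot read and report them at the end of the run.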

Yves.
 
Old 10-01-2008, 10:55 AM   #5
i92guboj
Gentoo support team
 
Registered: May 2008
Location: Lucena, Córdoba (Spain)
Distribution: Gentoo
Posts: 4,036

Rep: Reputation: 373
Quote:
Originally Posted by Sheridan
1) Is not bothered by an unreliable connection, a stupid server (one that drops the connection or logs you out seemingly at random), frozen transfers, etc.
Quote:
Originally Posted by wget man page
Wget has been designed for robustness over slow or unstable network connections; if a download fails due to a network problem, it will keep retrying until the whole file has been retrieved. If the server supports regetting, it will instruct the server to continue the download from where it left off.
Quote:
2) Can download several (10+) files in parallel (for efficiency)
This one is not supported. However, writing a wrapper for that purpose should be trivial, even in shell script. There is also a simple workaround: put all the download links in a text file, then split it into 4 files (for example - any number will do). Then run wget -i four times, in four xterms, once per file. You will get 4 concurrent wgets, each downloading its own list of URLs. Simple, clean, efficient (a rough sketch follows below).
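Here is a rough shell sketch of that workaround, using background jobs instead of separate xterms (urls.txt and the part_ prefix are hypothetical names - adjust the chunk count to taste):

Code:
# Sketch only - urls.txt is assumed to hold one URL per line
chunks=4
total=$(wc -l < urls.txt)
split -l $(( (total + chunks - 1) / chunks )) urls.txt part_   # -> part_aa, part_ab, ...
for f in part_*; do
    wget -i "$f" --tries=20 -o "$f.log" &   # one background wget per chunk
done
wait   # block until every chunk has finished downloading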

Quote:
3) Handles passive mode
The hard part is finding an FTP/download client that doesn't handle it. Of course, wget does, by default.

Quote:
4) Can "auto-skip" files with screwed-up names ("?" in the filename, etc.), ignore permission errors, and so on
Wget can use -i to read URLs from a file; if a given file can't be retrieved due to an error, the next one is downloaded instead. You can use --tries=number to retry a given number of times, or 0 for unlimited retries (the default is 20). Even with 0, it will still skip a file if the error is fatal, so you shouldn't have any problem there.

Quote:
What would you suggest? (I have tried wget, mc and the ftp CLI so far, but it was a very disappointing experience.)
Explain why. If the server is completely screwed, ALL clients will fail you, no matter how good and/or complete they are.

On the other hand, there's axel, though I found it to be a bit unstable in certain circumstances (but it can do threaded downloads).

If you prefer something graphical, there are kget and d4x; I have no idea how solid or good they are. I only use graphical tools when I have no other option, or when the command-line counterpart is insanely complicated, which is not the case here.

Last edited by i92guboj; 10-01-2008 at 10:57 AM.
 
Old 10-01-2008, 08:10 PM   #6
jlinkels
Senior Member
 
Registered: Oct 2003
Location: Bonaire
Distribution: Debian Lenny/Squeeze/Wheezy/Sid
Posts: 4,065

Rep: Reputation: 491
wget was the one first coming to my mind as well.

Are you *sure* you need parallel downloads? Parallel downloading is mostly relevant when a server limits the bandwidth per connection. However, I have the impression that you are dealing with a server that has low bandwidth anyway.

If so, and your client uses parallel downloads, you'll just share the available bandwidth across your downloads, slowing down each individual download while the total stays the same (e.g. a 400 kbit/s link split over 4 downloads gives each one roughly 100 kbit/s). This is not true, of course, if you have to share that bandwidth with others; then it pays off if you have, say, 4 parallel downloads and another client has only 1 - at least until he starts using a download manager as well and creates multiple streams himself.

jlinkels
 
Old 10-01-2008, 11:21 PM   #7
i92guboj
Gentoo support team
 
Registered: May 2008
Location: Lucena, Córdoba (Spain)
Distribution: Gentoo
Posts: 4,036

Rep: Reputation: 373
Quote:
Originally Posted by jlinkels
wget was the one first coming to my mind as well.

Are you *sure* you need parallel downloads? Parallel downloading is mostly relevant when a server limits the bandwidth per connection. However, I have the impression that you are dealing with a server that has low bandwidth anyway.

If so, and your client uses parallel downloads, you'll just share the available bandwidth across your downloads, slowing down each individual download while the total stays the same (e.g. a 400 kbit/s link split over 4 downloads gives each one roughly 100 kbit/s). This is not true, of course, if you have to share that bandwidth with others; then it pays off if you have, say, 4 parallel downloads and another client has only 1 - at least until he starts using a download manager as well and creates multiple streams himself.

jlinkels
Yes. What's more, more threads means more server load, and less bandwidth per thread can make things even worse, because starved connections have a much bigger chance of failing. As I said above, if it's a crappy server, you are not going to fix it with a download manager, unless it's a very specific issue like the one you describe (limited bandwidth per connection).

Wget and curl are very solid programs. You can find nicer ones, but hardly better ones for that task.
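For completeness, a single-file fetch with curl over passive FTP, with retries and resume, could look roughly like this (the host, credentials and filename are placeholders, not from the thread):

Code:
# Sketch only - host, USER/PASS and the file name are hypothetical
curl --ftp-pasv --retry 20 --retry-delay 5 -C - \
     -u USER:PASS \
     -O "ftp://ftp.example.net/bigdirectory/somefile.tar.gz"
Unlike wget's --mirror, plain curl fetches one URL (or URL glob) at a time, so it fits better inside a scripted loop.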
 
Old 10-02-2008, 04:49 AM   #8
Sheridan
Member
 
Registered: Aug 2007
Location: Hungary
Distribution: Fedora, CentOS
Posts: 89

Original Poster
Rep: Reputation: 21
Dear Folks,

Thank you so much for all the replies. You gave me some things to test, and for that I'm very grateful. I'll make sure to report back on which one works out best, but for now I think lftp will be the winner...

Anyway...

Some of you asked why I had problems with wget. Well, maybe I'm just green, but here's the problem:

I used the following syntax to mirror the directories onto the local server:

Code:
wget --mirror ftp://ftp.somesite.net/bigdirectory -o /home/me/xferlog
The result is that in the current directory I get some empty folders but no files, and wget exits after a few passes, with no apparent errors.

Levente
 
  

