LinuxQuestions.org
Old 10-26-2009, 10:38 AM   #1
ghoulsblade
LQ Newbie
 
Registered: Oct 2009
Posts: 6

Rep: Reputation: 0
ftp: need to transfer 2.5 terabytes (huge number of files), good ftp program?


Hi all,

i need to transfer a massive amount of data (2.5 terabytes, many files, full directory structure) to an embedded raid-box which has a minimal linux on it (some custom distro from western digital).

we tried rsync (version 2.6.7), but it crashes because the file list is too big for the available ram (fixed in later versions of rsync, but i don't know how to update it; the box isn't debian based and there are no compiler tools)

we tried nfs, but the max bandwidth we get is around 1 mb/sec (cpu bound?), so it'd take around 3 weeks this way

samba has problems with big files (and we have some 20gb files in there)

scp isn't installed, and i think it would probably also be cpu bound due to the encryption.

so the only option left seems to be ftp. we're currently trying ncftp with the command "put -R /path/to/data/", but it's been running for over an hour, eating up most of the ram and not using any bandwidth. i think it is still building a file list or something.
FTP already worked for a single 20gb file with acceptable bandwidth of about 12mb/sec.

Does anyone know a better console ftp program that can start transferring data right away, or at least display an estimated time for the copy preparation?
 
Old 10-26-2009, 10:52 AM   #2
indienick
Senior Member
 
Registered: Dec 2005
Location: London, ON, Canada
Distribution: Arch, Ubuntu, Slackware, OpenBSD, FreeBSD
Posts: 1,853

Rep: Reputation: 65
I have always been a fan of lftp.
If you are doing the transfer from an OpenBSD box, the OpenBSD-provided ftp client is brilliant as well.
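For reference, a recursive upload with lftp's mirror command might look like this; the host, credentials, and paths below are placeholders, and the parallel count is an arbitrary choice:

```shell
# Placeholders: user, password, raidbox.local, /path/to/data, /data
# "mirror -R" uploads a local tree recursively; --parallel moves several
# files at once, which can help when there are lots of small files.
lftp -u user,password ftp://raidbox.local -e "
  mirror -R --parallel=4 /path/to/data /data;
  quit
"
```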
 
Old 10-26-2009, 10:54 AM   #3
MS3FGX
Guru
 
Registered: Jan 2004
Location: NJ, USA
Distribution: Slackware, Debian
Posts: 5,852

Rep: Reputation: 351
This sounds like it will be terrible pretty much any way you do it. Transferring that much data to an embedded box like that is always going to be painful.

Have you looked into perhaps pulling the drives from the device and putting the data directly onto them? I don't know if a box like that would be doing hardware or software RAID, but either way it seems like it would be a lot easier and quicker to duplicate the configuration and put the data on it directly than pushing it over the network.

If that fails, perhaps you would be better off concatenating the files into a few large archives with tar, and then pulling them back apart on the receiving end. Transferring 2 dozen 100 GB .tar files doesn't sound like much fun either, but at least then you wouldn't have to deal with so many individual files being transferred.
 
Old 10-26-2009, 03:14 PM   #4
indienick
Senior Member
 
Registered: Dec 2005
Location: London, ON, Canada
Distribution: Arch, Ubuntu, Slackware, OpenBSD, FreeBSD
Posts: 1,853

Rep: Reputation: 65
Oooh - reading your response, MS3FGX, made me think of something I used to do when I had to transfer mass amounts of data over a network. I devised this because the network I was on was dying; the routers took a power surge from a lightning strike and were acting erratically.

Put all of the files you need into one giant tar archive. If you want to use compression here, that's up to you, but I really don't recommend it. Use "split" to break this massive tarchive into many smaller chunks. Send them over the network by your preferred method (this also re-opens the NFS and Samba possibilities, but I would still use FTP). Once received, rebuild the massive tarchive (using "cat") and un-tar it!

I say to avoid compression simply because you are transferring to an embedded box, and I am under the impression that it just doesn't have the processing power to handle a massive (2.5 TB) archive.

Read the manpages for split(1) and tar(1) to figure out how to use them best for your situation.
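The steps above can be sketched end-to-end. This uses tiny sample files in place of the real 2.5 TB tree, and a 100 MB chunk size as an arbitrary choice:

```shell
set -e
mkdir -p bigdata/sub restore
printf 'hello' > bigdata/a.txt
printf 'world' > bigdata/sub/b.txt

tar -cf bigdata.tar bigdata        # one big, uncompressed "tarchive"
split -b 100m bigdata.tar chunk_   # chunk_aa, chunk_ab, ... of 100 MB each

# ...ship the chunk_* files by FTP (or NFS/Samba), then on the far side:
cat chunk_* > rebuilt.tar
tar -xf rebuilt.tar -C restore     # restore/bigdata/ now matches the original
```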

Last edited by indienick; 10-26-2009 at 03:17 PM.
 
Old 10-27-2009, 04:15 AM   #5
ghoulsblade
LQ Newbie
 
Registered: Oct 2009
Posts: 6

Original Poster
Rep: Reputation: 0
thanks for the replies =)
we ended up making the 2.5tb available via an ftp server where they currently sit, and starting an ftp download using wget -r from the raidbox, which started the transfer immediately and gets good bandwidth, so it should be finished in around 3 days.
What we're also experimenting with is cross-compiling a newer rsync version for the architecture the raidbox has (arm9 i think), but we're not sure if that'll work out.
One problem with tar would be that there is not enough space left where the 2.5tb currently are. it might still be possible if files can be deleted on the fly as they are added to the archive, but that feels risky.
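On the "deleted on the fly" idea: GNU tar does have a --remove-files option that deletes each file right after archiving it, so disk usage stays roughly flat instead of doubling. Note the originals really are gone afterwards, so a failed transfer would be painful. A small sketch:

```shell
set -e
mkdir -p data out
printf 'abc' > data/one.txt
printf 'def' > data/two.txt

# GNU tar only: each file is deleted as soon as it has been archived,
# so the archive grows while the source tree shrinks
tar --remove-files -cf data.tar data

# data/one.txt and data/two.txt no longer exist; the archive holds them
tar -xf data.tar -C out
```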

Last edited by ghoulsblade; 10-27-2009 at 04:16 AM.
 
Old 10-27-2009, 04:21 AM   #6
jschiwal
Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 655
Your nfs bandwidth sounds very off. You might want to look into what the problem is there, because 1 mb/sec is absurd for NFS. It should be the fastest option over the network.
 
Old 10-27-2009, 07:31 AM   #7
ghoulsblade
LQ Newbie
 
Registered: Oct 2009
Posts: 6

Original Poster
Rep: Reputation: 0
My guess regarding nfs is that due to the large number of files, the small cpu of the raidbox is the bottleneck.
There are tons of very small files as well, not just big data files.
It's probably possible to tweak the nfs parameters so that the bandwidth becomes acceptable, but that would require in-depth knowledge.

The wget thing didn't work out. it finished after copying around 520gb, claiming to be complete, but that is only a fraction of the 2.5tb of data.

We now started midnight commander (mc) on the server, connected to the raidbox via ftp, and kicked off the upload with the copy command. It began transferring immediately and is now running at around 7-10 mb/sec. we'll see if that works out, looks good so far =)
 
Old 10-27-2009, 12:33 PM   #8
indienick
Senior Member
 
Registered: Dec 2005
Location: London, ON, Canada
Distribution: Arch, Ubuntu, Slackware, OpenBSD, FreeBSD
Posts: 1,853

Rep: Reputation: 65
Does your embedded Arm9 box have ssh installed, by any chance?

If so, I would suggest something like sshfs (a FUSE module), which lets you mount remote directories over SSH and treat them like a local filesystem! I use it all the time at work, and at home to connect to my work computer.
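A mount could look like the sketch below; the host name, user, and paths are made up, and sshfs must be installed on the sending side. One caveat: sshfs rides on the sftp subsystem of sshd, so if the box's ssh lacks sftp-server the mount will fail.

```shell
# All names here are placeholders; sshfs needs FUSE on the local machine
mkdir -p /mnt/raidbox
sshfs root@raidbox.local:/data /mnt/raidbox

# From here on it is an ordinary local copy:
cp -a /path/to/data/. /mnt/raidbox/

fusermount -u /mnt/raidbox    # unmount when the copy is done
```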
 
Old 10-29-2009, 04:59 AM   #9
ghoulsblade
LQ Newbie
 
Registered: Oct 2009
Posts: 6

Original Poster
Rep: Reputation: 0
hrm, ftp slowed down to a crawl as well when it hit the many small files -_-

maybe we'll try mounting the raid via ftpfs or nfs, and running rsync locally on the server where the data is.

ssh: yes, but not scp. we also tried scp from remote, but couldn't get it to work. will sshfs still work?
edit: just tried, and i get "remote host has disconnected" as the error message

Last edited by ghoulsblade; 10-29-2009 at 05:06 AM.
 