LinuxQuestions.org > Forums > Linux Forums > Linux - Server
Old 12-11-2007, 03:03 AM   #1
edenCC
Member
 
Registered: May 2006
Location: Gz,China
Distribution: RH,FB
Posts: 196
Blog Entries: 1

Rep: Reputation: 32
howto copy & backup LARGE amount of data


Hi, list,
Maybe it's a funny question, but we really need an efficient solution to these two problems:

1. How to copy several terabytes of files across a LAN. There are millions of files in total; most are smaller than 50KB, but some are as large as 50MB. We need to copy these files from one server to another.

2. How to build a reliable storage volume from several SCSI disks on a single Linux machine without any RAID card. We need to store several gigabytes of files locally.
 
Old 12-11-2007, 03:42 AM   #2
b0uncer
Guru
 
Registered: Aug 2003
Distribution: CentOS, OS X
Posts: 5,131

Rep: Reputation: Disabled
I'd go for rsync for the first one. I haven't had to copy millions of files myself, so I can't say exactly how efficient it is, but I'd guess it's no slower than the other obvious solutions. And it has an advantage: if you stop the transfer, it won't re-transfer files that are already there (it can check and copy only the changed files). The overall speed mostly depends on the network, I guess.

How about software RAID or LVM for the second? If you want something reliable, one option is to pay a professional to do it: they're responsible for what they do (and so usually try to do it well), an insurance company will probably require a professional if the data is important (and most commercial firms have an insurer involved), and that way you just pay the money and get the result, without the worries. The other way is not to pay: do it yourself, have it cheap, and hope that you know what you're doing - and hope that the solutions presented here actually work for your million-file storage machine, because there's no real guarantee.
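For the do-it-yourself route, the software-RAID setup is roughly this (an untested sketch: it needs root and the mdadm package, and the device names, mount point, and RAID level are examples to adjust for your own hardware):

```shell
# Build a RAID-5 md device from three SCSI disk partitions
# (/dev/sda1 etc. are example names - check yours first!).
mdadm --create /dev/md0 --level=5 --raid-devices=3 \
      /dev/sda1 /dev/sdb1 /dev/sdc1

# Put a filesystem on the array and mount it.
mkfs.ext3 /dev/md0
mount /dev/md0 /mnt/storage

# Record the array layout so it can be assembled at boot.
mdadm --detail --scan >> /etc/mdadm.conf
```

RAID-5 gives you redundancy against a single disk failure; plain LVM on its own only pools the disks and doesn't protect the data.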
 
Old 12-11-2007, 03:54 AM   #3
dyasny
Member
 
Registered: Dec 2007
Location: Canada
Distribution: RHEL,Fedora
Posts: 847

Rep: Reputation: 91
I'd use dd or even cpio: rsync copies files, while dd copies chunks of data at a lower level. I'm not sure about recovering an interrupted copy, though.
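A small local sketch of what dd's block-level copying looks like (untested; it duplicates an ordinary file a megabyte at a time, which is the same mechanism you'd use on a whole device, e.g. `dd if=/dev/sda1 ...` where the device name is a placeholder):

```shell
#!/bin/sh
# Duplicate a file in 1 MiB blocks with dd, then verify the copy.
set -e
src=$(mktemp)
dst=$(mktemp)
head -c 3145728 /dev/urandom > "$src"   # 3 MiB of test data
dd if="$src" of="$dst" bs=1M 2>/dev/null
cmp "$src" "$dst" && echo "copies match"
```

Because dd doesn't know about files, it copies free space too, so it only pays off when the volume is mostly full.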
 
Old 12-12-2007, 02:17 AM   #4
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.5, Centos 5.10
Posts: 16,311

Rep: Reputation: 2040
rsync also offers compression on the fly, which would lower the load on the LAN. You could also give it a low run priority (nice value).
This sort of question really depends on what you mean by "efficient", and on whether you need a once-only copy, just want to move the data once to another box, or will do repeated copies/backups.

Options include (but are not limited to):
1. To move the data once: transfer the disk pack(s) to the new box.
2. One-time copy: a) a low-priority rsync; b) mirror the disks, then move the mirrored set to the new box; c) use a dedicated cable link.
3. Repeated backups: use any of 2a/b/c to a dedicated backup system, then back up to tape/DVD etc.

Actually, once you've got a backup copy, the fact that rsync transfers only differences by default means you should be able to run it at a higher priority for subsequent backups.
 
Old 12-16-2007, 09:44 AM   #5
edenCC
Member
 
Registered: May 2006
Location: Gz,China
Distribution: RH,FB
Posts: 196
Blog Entries: 1

Original Poster
Rep: Reputation: 32
Quote:
Originally Posted by dyasny View Post
I'd use dd or even cpio: rsync copies files, while dd copies chunks of data at a lower level. I'm not sure about recovering an interrupted copy, though.
Yes, dd works at a lower level, while rsync is high-level,
so dd is useful for a fully used disk volume, as it's much faster than rsync. I noticed that when using dd to back up data to NFS, the average speed is about 40MB/s, while rsync manages about 14MB/s.
 
Old 12-16-2007, 10:02 AM   #6
edenCC
Member
 
Registered: May 2006
Location: Gz,China
Distribution: RH,FB
Posts: 196
Blog Entries: 1

Original Poster
Rep: Reputation: 32
Quote:
Originally Posted by b0uncer View Post
How about software RAID or LVM for the second? [...]
Yes, thanks, we're going to use software RAID or LVM.
 