Old 07-05-2010, 09:41 AM   #1
tincboy
Member
 
Registered: Apr 2010
Posts: 36

Rep: Reputation: 0
copy sparse files


I have a big sparse file (a .raw file of about 100 GB, of which only 1 GB is actually used).
I want to copy this file faster than a normal copy with the cp command.
Is anyone familiar with this concept?
 
Old 07-05-2010, 09:50 AM   #2
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
My first reaction: Why would you want or need such a file?

Regardless, I'd assume that any action to compress the file before copying would take more time than simply copying it. But if you need to copy it many times, then compressing it once could pay off.

You could also try "dd", but I have no idea if it would be faster. Maybe try it on a smaller file.
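A quick test along those lines might look like this (file names are made up; conv=sparse is a GNU dd extension, so check that your coreutils dd supports it):

Code:
# copy, seeking over all-zero 1 MiB output blocks instead of writing them
dd if=big.raw of=copy.raw bs=1M conv=sparse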
 
Old 07-05-2010, 09:51 AM   #3
onebuck
Moderator
 
Registered: Jan 2005
Location: Central Florida 20 minutes from Disney World
Distribution: Slackware®
Posts: 13,923
Blog Entries: 44

Rep: Reputation: 3158
Hi,

Quote:
Originally Posted by tincboy View Post
I have a big sparse file (a .raw file of about 100 GB, of which only 1 GB is actually used).
I want to copy this file faster than a normal copy with the cp command.
Is anyone familiar with this concept?
Have a look at 'sparse file: copying' (the copying section of the Wikipedia article on sparse files).
 
Old 07-05-2010, 10:04 AM   #4
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,256

Rep: Reputation: 2321
Quote:
Originally Posted by tincboy View Post
I have a big sparse file (a .raw file of about 100 GB, of which only 1 GB is actually used).
I want to copy this file faster than a normal copy with the cp command.
Is anyone familiar with this concept?
Yes, the concept is called impatience ;-).
Do you need the bloat - the extra 99 gig? If so, and you have another 30 or 40 gig free for a temporary file, why not 'gzip sparse_file'? If you don't need the crap, give details on the stuff you want to keep and the stuff you want to leave behind.
 
Old 07-05-2010, 10:26 AM   #5
vikas027
Senior Member
 
Registered: May 2007
Location: Sydney
Distribution: RHEL, CentOS, Ubuntu, Debian, OS X
Posts: 1,305

Rep: Reputation: 107
Use 'bzip2 -9 filename'; -9 gives maximum compression.

It may take considerable time compressing/uncompressing the file, though.

I would suggest bzip2ing it if you need to copy it again and again.
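A minimal sketch of that workflow, with a made-up file name (-k keeps the input file):

Code:
bzip2 -9 -k big.raw      # maximum compression; writes big.raw.bz2
bunzip2 -k big.raw.bz2   # restores big.raw on the other end

Note that decompression writes all the zeros back out, so the restored file is no longer sparse unless you recreate the holes (e.g. with cp --sparse=always).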
 
Old 07-05-2010, 11:42 AM   #6
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Rep: Reputation: 723
xz takes very, very, very long to compress, but it produces much smaller files and, ironically, decompresses really fast!
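For example, with a made-up file name (-k keeps the input file):

Code:
time xz -9 -k big.raw      # slow to compress; writes big.raw.xz
time unxz -k big.raw.xz    # comparatively fast to decompress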
 
Old 07-06-2010, 12:51 AM   #7
tincboy
Member
 
Registered: Apr 2010
Posts: 36

Original Poster
Rep: Reputation: 0
I have many of these files, and copying them is an everyday job of mine.
The most important factor for me is time;
I want to do it faster than the cp command.
 
Old 07-06-2010, 02:10 AM   #8
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,120

Rep: Reputation: 4120
You probably can't.
I just ran some tests, and "cp" appears to recognise sparse (input) files o.k. However, strace shows it issuing a read every 32k, with a corresponding seek on the output fd.
All that takes time, even if the file is completely empty (as in my test).

Update: got me wondering now - how much benefit is there in that? A file of the same size full of random data issues the same number of reads, and issues writes in place of seeks. It takes much longer of course, but if a sparse file is (actually) all zero bytes, why all the reads ...

I'll see if I can chase this up tomorrow.
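(If anyone wants to reproduce this kind of trace, something like the following works; file names are made up:)

Code:
# watch the individual reads and output-side seeks
strace -e trace=read,write,lseek cp big.raw copy.raw
# or just get a per-syscall count/time summary
strace -c cp big.raw copy.raw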

Last edited by syg00; 07-06-2010 at 02:41 AM.
 
Old 07-06-2010, 02:42 AM   #9
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,256

Rep: Reputation: 2321
Just for the post, try a race. I would suggest gzip -1, as you are not particularly pressed for space. You could also do a cron job to have the zipping done when you are at home :-D
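A race along those lines, with made-up file names:

Code:
time cp big.raw copy.raw
time gzip -1 -c big.raw > big.raw.gz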
 
Old 07-06-2010, 03:17 AM   #10
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578
Blog Entries: 31

Rep: Reputation: 1208
rsync's --sparse option makes it "handle sparse files efficiently" (so says the man page).
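For example (the destination path is hypothetical):

Code:
rsync --sparse big.raw /backup/big.raw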
 
Old 07-06-2010, 04:05 AM   #11
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,120

Rep: Reputation: 4120
I was thinking about this on the ride home. Looking at the manpage confirms that "cp" is only looking to see if a sparse output allocation is required.
I'll check rsync tomorrow.
 
Old 07-06-2010, 06:42 AM   #12
onebuck
Moderator
 
Registered: Jan 2005
Location: Central Florida 20 minutes from Disney World
Distribution: Slackware®
Posts: 13,923
Blog Entries: 44

Rep: Reputation: 3158
Hi,

Quote:
excerpt from 'sparse file: copying':
cp --sparse=always formerly-sparse-file recovered-sparse-file
It should be noted that some cp implementations do not support the --sparse option and will always expand sparse files, like FreeBSD's cp. A viable alternative on those systems is to use rsync with its own --sparse option[3] instead of cp.
'rsync --sparse' is a viable alternative to 'cp --sparse=always formerly-sparse-file recovered-sparse-file'.
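Side by side, with placeholder file names:

Code:
cp --sparse=always big.raw copy.raw    # GNU coreutils cp
rsync --sparse big.raw copy.raw        # portable alternative, e.g. on FreeBSD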
 
Old 07-06-2010, 07:40 PM   #13
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,120

Rep: Reputation: 4120
O.K., some more testing showed the above "cost" for cp is all in the set-up of a new file. Repeated copies into the (pre-allocated) destination file showed minimal reads and writes.
Far better than rsync (-b -S), in fact. Both cp and rsync created a sparse output, but rsync continued to read and write the entire file when only a couple of sectors out of 1 GB had non-zero data; cp was much more efficient.
Similar results for a 5 MB input.

My test, my data, my machine, YMMV, <blah>, <blah>, <blah> ...
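(To repeat a comparison like this on your own data, counting syscalls with strace; file names are made up:)

Code:
strace -c cp --sparse=always big.raw copy-cp.raw
strace -c -f rsync --sparse big.raw copy-rsync.raw   # -f follows rsync's forked processes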
 
Old 07-07-2010, 01:24 AM   #14
tincboy
Member
 
Registered: Apr 2010
Posts: 36

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by syg00 View Post
O.K., some more testing showed the above "cost" for cp is all in the set-up of a new file. Repeated copies into the (pre-allocated) destination file showed minimal reads and writes.
Far better than rsync (-b -S), in fact. Both cp and rsync created a sparse output, but rsync continued to read and write the entire file when only a couple of sectors out of 1 GB had non-zero data; cp was much more efficient.
Similar results for a 5 MB input.

My test, my data, my machine, YMMV, <blah>, <blah>, <blah> ...
So do you think normal use of cp is the best choice?
 
Old 07-07-2010, 01:47 AM   #15
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,120

Rep: Reputation: 4120
Yes - especially if you can re-use the output files each day (after the first, obviously). That is, don't delete the (output) files each day; overwrite them.
The "-b" on the rsync was *bad* - but even with "-t" (or -a), "cp" was still marginally faster, which surprised me, I must admit.
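A minimal sketch of that daily job, assuming the images live in /data and get copied over yesterday's output in /backup (both paths are hypothetical):

Code:
#!/bin/sh
# Overwrite last night's copies in place rather than deleting them first.
for f in /data/*.raw; do
    cp --sparse=always "$f" "/backup/$(basename "$f")"
done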
 
  

