Old 01-17-2012, 09:59 AM   #1
borepstein
LQ Newbie
 
Registered: Jun 2010
Location: Boston, MA, USA
Distribution: OpenSUSE, Ubuntu, Centos
Posts: 9

Rep: Reputation: 2
initializing a large file


Hello fellow linuxoids,

I am trying to initialize a large file (5 TB) that will hold a file system for an off-site backup. Be that as it may, the task is simply to create a large file; its content is irrelevant.

I did not know any better than to use the following:

dd if=/dev/zero of=<big_file> bs=1M count=5242800

This seems to be doing the job, but it proceeds at a meager speed of roughly 5.5 MB/s, has been running for 4 days, and so far the file is only 1.7 TB. So my question is: is there a way to accomplish the same thing faster?

Thanks.

Boris.
 
Old 01-17-2012, 10:16 AM   #2
Rebellion
LQ Newbie
 
Registered: Aug 2008
Posts: 1

Rep: Reputation: 0
It depends on many factors. What speed is your hard drive? Where are you creating this file, locally? What is the filesystem?
Also, you may find that dd's speed is affected by the bs (block size) setting.
A faster computer also helps.
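For instance, a quick way to gauge how bs affects throughput on your setup is to time a small test write at a couple of block sizes before committing to the full 5 TB run (a minimal sketch; the path and sizes are just placeholders):
Code:
# time the same 1 GiB write with two different block sizes
time dd if=/dev/zero of=/path/to/testfile bs=64k count=16384 conv=fdatasync
time dd if=/dev/zero of=/path/to/testfile bs=16M count=64 conv=fdatasync
rm -f /path/to/testfile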


Quote:
Originally Posted by borepstein View Post
Hello fellow linuxoids,

I am trying to initialize a large file (5 TB) that will hold a file system for an off-site backup. Be that as it may, the task is simply to create a large file; its content is irrelevant.

I did not know any better than to use the following:

dd if=/dev/zero of=<big_file> bs=1M count=5242800

This seems to be doing the job, but it proceeds at a meager speed of roughly 5.5 MB/s, has been running for 4 days, and so far the file is only 1.7 TB. So my question is: is there a way to accomplish the same thing faster?

Thanks.

Boris.
 
Old 01-17-2012, 11:19 AM   #3
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Servers: Debian Squeeze and Wheezy. Desktop: Slackware64 14.0. Netbook: Slackware 13.37
Posts: 8,563
Blog Entries: 29

Rep: Reputation: 1179
Here's a reportedly very fast solution using Java (from http://www.velocityreviews.com/forum...mpty-file.html)
Code:
RandomAccessFile bo2 = new RandomAccessFile(PFSSettingConstants.filename, "rw");
bo2.seek(1024*1024*1024);
bo2.write(0);
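For reference, an equivalent shell one-liner (a sketch, assuming GNU coreutils is available; the filename and size are placeholders):
Code:
# creates/extends the file to 5 TB as a sparse file, allocating almost no disk space
truncate -s 5T bigfile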
 
Old 01-17-2012, 11:27 AM   #4
sundialsvcs
Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 5,455

Rep: Reputation: 1172
I suggest breaking the file into multiple files of more reasonable size. What you are contemplating is a difficult "edge case" for any file system to deal with, and you should design your strategy so that this edge case is avoided.
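For example (a minimal sketch; the names and sizes are hypothetical), ten 500 GiB pieces could be created in a loop instead of one 5 TB file:
Code:
# ten 500 GiB pieces instead of a single 5 TB file
for i in $(seq -w 0 9); do
    dd if=/dev/zero of=backup-part-$i bs=1M count=512000
done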

Last edited by sundialsvcs; 01-17-2012 at 11:28 AM.
 
Old 01-18-2012, 01:19 PM   #5
borepstein
LQ Newbie
 
Registered: Jun 2010
Location: Boston, MA, USA
Distribution: OpenSUSE, Ubuntu, Centos
Posts: 9

Original Poster
Rep: Reputation: 2
Thanks, it is very fast indeed :)

Quote:
Originally Posted by catkin View Post
Here's a reportedly very fast solution using Java (from http://www.velocityreviews.com/forum...mpty-file.html)
Code:
RandomAccessFile bo2 = new RandomAccessFile(PFSSettingConstants.filename, "rw");
bo2.seek(1024*1024*1024);
bo2.write(0);
I wrote a little Java program that does exactly what's detailed in the recommendation. It took 4.6 seconds to initialize a 5 TB file. Strangely, it appears to have only altered the metadata (inode) in the file system, as the df output for the filesystem has not changed.
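One way to see this for yourself (a small sketch; the filename is a placeholder) is to compare the file's apparent size with the disk space it actually occupies:
Code:
ls -lh bigfile                   # apparent size: the full 5 TB
du -h bigfile                    # allocated blocks: next to nothing for a sparse file
du -h --apparent-size bigfile    # matches what ls reports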

Boris.
 
Old 01-18-2012, 03:14 PM   #6
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 943
Quote:
Originally Posted by catkin View Post
Code:
RandomAccessFile bo2 = new RandomAccessFile(PFSSettingConstants.filename, "rw");
bo2.seek(1024*1024*1024);
bo2.write(0);
This creates a sparse file, specifically one with no allocated data blocks: the size is recorded in the file's metadata, but no disk space is actually used, which is why df does not change.

You can do the same using dd:
Code:
dd if=/dev/zero of=filename bs=1 count=1 seek=size-in-bytes-less-one
On my own machine, creating a five-terabyte sparse file takes all of two milliseconds,
Code:
time dd if=/dev/zero of=filename bs=1 count=1 seek=$[5*1024*1024*1024*1024-1]
1+0 records in
1+0 records out
1 byte (1 B) copied, 2,4395e-05 s, 41,0 kB/s
real	0m0.002s
user	0m0.000s
sys	0m0.000s

ls -l filename
-rw-rw-r-- 1 user group 5497558138880 2012-01-18 21:47 filename
Since some filesystems do compact runs of zeros, I like to create dummy files from a chunk of random data, cat'ing enough copies together with a few random bytes mixed in:
Code:
rm -f very-big-file

size=$[5*1024*1024*1024*1024]

# one chunk of random data, re-used for every append
dd if=/dev/urandom of=chunk count=1 bs=1039799
have=0
while [ $[have+1039817] -le $size ]; do
    cat chunk >> very-big-file
    dd if=/dev/urandom of=very-big-file count=1 bs=18 oflag=append conv=notrunc
    have=$[have+1039817]
    printf '\r%lu of %lu MiB ' $[have/1048576] $[size/1048576] >&2
done
if [ $have -lt $size ]; then
    # top up with a final partial chunk so the file reaches the requested size
    dd if=chunk of=very-big-file count=1 bs=$[size-have] oflag=append conv=notrunc
fi
rm -f chunk
echo >&2
Do note that this file really does occupy the full five terabytes (plus a megabyte or so for the temporary random chunk). It should take about ten hours using a three-disk RAID-0 array (a conservatively sustainable 150 megabytes per second).
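As a rough sanity check of that estimate (assuming roughly 150 MiB/s sustained):
Code:
# 5 TiB is 5*1024*1024 MiB; at ~150 MiB/s sustained:
echo $[5*1024*1024/150] seconds    # 34952 seconds, i.e. roughly 9.7 hours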

If the storage device is in actual use at the same time, I recommend ionice'ing the job to a lower priority. Run
Code:
exec ionice -c 2 -n 7 bash
or
Code:
exec ionice -c 3 -n 0 bash
to reduce the I/O priority of everything you run in that shell to lowest-priority best-effort, or to the idle class (I/O only when the disk is otherwise idle), respectively, before using something like the above to create the random-stuff file.
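The same idea can also be applied to just one command instead of the whole shell (a small sketch; make-very-big-file.sh is a hypothetical name for the script above saved to a file):
Code:
# run only the file-creation script in the idle I/O class
ionice -c 3 bash ./make-very-big-file.sh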

Last edited by Nominal Animal; 01-18-2012 at 03:16 PM.
 
Old 01-18-2012, 09:50 PM   #7
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Servers: Debian Squeeze and Wheezy. Desktop: Slackware64 14.0. Netbook: Slackware 13.37
Posts: 8,563
Blog Entries: 29

Rep: Reputation: 1179
Quote:
Originally Posted by Nominal Animal View Post
This creates a sparse file, specifically one with no data.
Thanks for pointing it out, Nominal Animal.
 
  

