LinuxQuestions.org
Linux - General This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.

Old 04-27-2012, 09:00 AM   #1
wpns
LQ Newbie
 
Registered: Nov 2010
Posts: 7

Rep: Reputation: 0
Disk I/O cache tuning? Shuffling 5TB of small files around, machine keeps stalling.


So I'm trying to rearrange a large number of small files (webcam frames).

Each file is around 200 KB.
I've filled a 5 TB array, so maybe 25 million files, plus or minus.

I'm running rsync to move the files from one directory to another on the same array.

rsync --archive --remove-sent-files --progress --human-readable --verbose /mnt/Jupiter/[...]/ /mnt/Jupiter/[...]

Every 15-30 seconds the copy process (and any other I/O to the files on that array) stalls for 45 seconds, perhaps while the OS flushes its disk caches?

Is there a way to optimize this process, or am I just stuck with lots of file I/O? I'm thinking it's filling the disk cache with files it doesn't need to keep, and there ought to be a way to tune that...

Or is there a better way to move the files? mv doesn't work because there are too many files in a directory...

Thanks!
 
Old 04-27-2012, 09:27 AM   #2
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,659
Blog Entries: 4

Rep: Reputation: 3940
You're, ummmm, "moving terabytes of files around," and "you've filled the array." Any computerized process slows down greatly when it starts to run out of storage space. Don't assume that the issue is "flushing its disk caches." Instead, measure it and find out what the cause actually is.
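One way to "find out what it is" could look like the sketch below. It only reads numbers the kernel and df already expose: how much data is sitting in the page cache waiting to be written (the Dirty and Writeback lines of /proc/meminfo) and how full the filesystem is. The function names dirty_kb and fs_usage are made up for this example, and whether these two numbers explain the stalls on this particular array is an assumption to be checked, not a diagnosis.

```shell
#!/bin/sh
# Print the amount of page-cache data waiting to be written out, in kB.
# Reads /proc/meminfo by default; another file may be given as $1.
dirty_kb() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2, "kB" }' \
        "${1:-/proc/meminfo}"
}

# Print how full the filesystem holding path $1 is.
# -P forces one line per filesystem, -k reports 1024-byte blocks.
fs_usage() {
    df -Pk "$1" | awk 'NR == 2 { print $5 " used, " $4 " KiB free" }'
}
```

Running `dirty_kb` every few seconds while the copy is going (for instance `while sleep 5; do dirty_kb; done`) should show whether the Dirty figure climbs and then drains around each stall. `fs_usage` on the array's mount point shows how close to full it really is.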

You need more space. Double that allocation and spread the files out across it. You probably also need a better algorithm than rsync if you know that the files aren't usually changing; or, make sure rsync is deciding what to copy from file sizes and time-stamps alone (its default quick check) rather than reading file contents. Maintain a separate catalog database somewhere (SQLite?) to tell you what needs to be copied. This isn't a "generic" activity: you know an awful lot about it, and you will benefit by applying that application-specific (human) knowledge and familiarity in a fairly customized solution.
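As one possible "better algorithm" for this specific job, here is a minimal sketch that sidesteps both rsync and the too-many-arguments problem with plain mv. It assumes GNU find and GNU mv (for the -t option), and that source and destination really are on the same filesystem, in which case each move is a rename and no file data is copied. The function name move_all is hypothetical, and it only handles regular files directly inside the source directory.

```shell
#!/bin/sh
# Move every regular file directly inside $1 into $2.
# find hands the names to mv in batches that fit the argument-length limit,
# so a directory with millions of entries does not overflow the command line.
# Assumes GNU find/mv and that both directories share one filesystem.
move_all() {
    src=$1
    dest=$2
    mkdir -p "$dest" &&
    find "$src" -maxdepth 1 -type f -exec mv -t "$dest" -- {} +
}
```

It would be called as `move_all /path/to/source /path/to/destination`. If the two directories turn out to be on different filesystems, mv falls back to copy-and-delete and the I/O load comes back.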

It is very helpful to use the nice command to push down the execution priority of that process, which you know to be "absolutely I/O-bound" anyhow. This will somewhat reduce the impact of this process on other activities. It'll spend nearly all of its time waiting for I/O, and, when it does get ready to execute again, it can well afford to be the low man on the totem pole.
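A minimal sketch of the nice suggestion, wrapped in a hypothetical helper called run_low_priority. nice only lowers CPU priority, so the sketch also uses ionice (from util-linux) when it is installed and usable; that part is an addition, not something the post above mentions, and how much the idle I/O class helps depends on the I/O scheduler in use.

```shell
#!/bin/sh
# Run a command at the lowest CPU priority (nice 19).
# If ionice is available and permitted, also place the command in the
# idle I/O scheduling class (-c 3); otherwise fall back to nice alone.
run_low_priority() {
    if command -v ionice >/dev/null 2>&1 && ionice -c 3 true 2>/dev/null; then
        ionice -c 3 nice -n 19 "$@"
    else
        nice -n 19 "$@"
    fi
}
```

The copy would then be started as `run_low_priority rsync ...` with the same options and paths as before.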

Last edited by sundialsvcs; 04-27-2012 at 09:28 AM.
 
  


All times are GMT -5.