Copying a huge number of small files on a live system
hi,
I have the following scenario:
<start>
one live fileserver, which serves millions of small jpg files (photos) with lighty for a big website.
like 400GB of .jpg files in different folders.
and one backup server with a 500GB hdd on the same network, with an nfs share mounted from the live server.
I have to copy the whole content from live to backup, without the possibility of taking the live server offline.
</start>
When I start rsync on the backup, it takes ages until it starts to copy, the cpu load goes to 100%, and the site goes offline.
When I start cp /live /backup it's the same - offline.
The best would be to copy everything right away, but only in batches of about 20 files at a time.
I would certainly go with that suggestion... I do this myself; what I run is:
nice -n 19 rsync .....
ps - nice values range from -20 (highest priority) to 19 (lowest priority). You must be root to raise a process's priority, but any user can lower the priority of a command.
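For example, a full command might look like this (just a sketch - the /live and /backup paths are the ones from the post above, and the rsync flags are my guess, not necessarily what you need):

# run the copy at the lowest cpu priority; -a recurses and preserves
# permissions, ownership and timestamps
nice -n 19 rsync -a /live/ /backup/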
Just a note, there isn't really anything wrong with a high cpu load. If the nice value of a process is 19 and it is taking up 100% of the cpu, that just means nothing else important needs the processor. Instead of worrying about the cpu load with the copy at nice=19, what is the actual responsiveness of the system like?
HTH
Forrest
p.s. What is putting the files on the server? Can it also put them in the correct backup location?
I already gave nice a try, but the cpu still climbs to 100%...
well, the server always has a lot of work to do...
I think it would be best to copy the files in parts... but how can I do that?
regards
The bottleneck is probably not the CPU. It is more likely the amount of disk I/O, or the disk cache being flooded by rsync or cp. You might be able to reduce the amount of I/O by only copying files that have changed. Use the same /backup directory or directories over and over again, then use rsync -u to only copy the files that have changed. Similarly, if you use:
cp -ru /live /backup
then cp will only copy the files that have changed since the last backup.
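If the real limit is disk and network I/O rather than cpu, you can also throttle those directly. A sketch, assuming the same paths as above; the --bwlimit figure (in KB/s) is an arbitrary example to tune, and ionice's idle class only takes effect with an I/O scheduler that supports it (CFQ):

# idle I/O class: the copy only gets disk time when nothing else wants it;
# --bwlimit caps rsync's transfer rate so the nfs link is not saturated
ionice -c3 nice -n 19 rsync -au --bwlimit=5000 /live/ /backup/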
This is more of a question really, but would it be better to tar up the files before you copy them across the network? It seems to me that would save a lot of requests from cp through nfs, which might be the bottleneck. As I said, I'm not sure. I suppose you'd also need a way to create the backup archive (maybe a cron job on the server?).
Comments appreciated.
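To avoid needing space for an intermediate archive, tar could also be streamed straight over ssh instead of going through nfs. A sketch, run on the live server, assuming ssh access to the backup machine ("backupserver" is a placeholder hostname):

# stream the tree as a single tar archive over ssh and unpack it on the
# other side; no temporary archive file is written on either machine
tar -C /live -cf - . | ssh backupserver "tar -C /backup -xf -"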
I could give tar a try, but I don't have enough space left, and I think it would make the server unavailable until it's finished?
Jailbait: you are right, it's the I/O wait that keeps the server down, not the CPU load!
Forrest: users do that (so apache with php); it's a big community site where people can upload images - and they do it a lot!
I need a "backup basis" first, with the up-to-date state.
After that it won't be a problem to do a cp -ru /live /backup, since it will only take a few minutes.
How about a for or a while loop with a file limit? Is that possible?
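Something like that is possible with find. A minimal sketch, assuming the /live and /backup paths from this thread; the batch size of 20 and the 2-second pause are placeholder values to tune:

#!/bin/bash
# copy /live to /backup in small batches, sleeping between batches so
# normal traffic can get at the disk; cp -u skips files already copied
src=/live
dst=/backup
batch=20    # files per batch (placeholder)
pause=2     # seconds between batches (placeholder)
count=0
find "$src" -type f | while read -r f; do
    target="$dst${f#$src}"
    mkdir -p "$(dirname "$target")"
    cp -u "$f" "$target"
    count=$((count + 1))
    if [ $((count % batch)) -eq 0 ]; then
        sleep "$pause"
    fi
done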