Linux - General
This Linux forum is for general Linux questions and discussion.
If it is Linux-related and doesn't seem to fit in any other forum, then this is the place.
I have a directory that has over fifteen thousand small files in it. Tens or hundreds of files are created in this directory every second. I am attempting to create a tarball of this directory and am encountering an issue when doing so. Using the following command, I am able to successfully create a tarball of the directory when static:
When I run this command while files are actively being created in the directory, I only seem to capture the last several hundred files or so (obvious from the creation dates and times of the files in the extracted archive). When I watch the tarball's size as it is being created (using "watch --interval=1 ls -al" in the archive directory), I see the archive file repeatedly grow and shrink, sometimes even dropping to zero.
I'm sure this has something to do with the way that xargs is interpreting find's output given the files that are being created constantly, but I can't put my finger on the exact issue here, or how to fix it. If anyone has a suggestion or a resolution I would love to hear it!
Write your script or command so that it only picks up files older than a certain age, so it doesn't grab any files that are still being written to. Alternatively, write a script that runs lsof on the directory and excludes any files it reports as open.
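A minimal sketch of the age-filter idea, assuming GNU find, tar, and touch. The directory and archive names and the one-minute cutoff are made up for the demo. Note that -mmin tests modification time, which is the closest thing find offers to a creation time. The list is handed to tar on stdin with -T -, so tar runs only once.

```shell
#!/bin/sh
# Hypothetical demo paths; substitute the real busy directory and archive location.
work=$(mktemp -d)
src="$work/spool"
archive="$work/spool.tar.gz"
mkdir "$src"

# Simulate two settled files and one file that was only just created.
touch -d '5 minutes ago' "$src/old1" "$src/old2"
touch "$src/new1"

# -mmin +1 selects files whose last modification is more than about a minute old.
# -print0 with --null keeps unusual file names intact, and -T - makes tar read
# the complete list from stdin in a single invocation.
(cd "$src" && find . -type f -mmin +1 -print0 | tar --null -T - -czf "$archive")

# List what ended up in the archive.
tar -tzf "$archive"
```

The cutoff only needs to be longer than the time a single file takes to be written; whether one minute is enough for a given workload is an assumption to check.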
Quote:
Originally Posted by abegetchell
I'm sure this has something to do with the way that xargs is interpreting find's output given the files that are being created constantly, but I can't put my finger on the exact issue here, or how to fix it. If anyone has a suggestion or a resolution I would love to hear it!
The right way to do that is to use filesystem snapshots.
I don't know exactly which Linux file systems have that feature reliably available, though; on Solaris, both UFS and ZFS support it.
Quote:
Write your script or command so that it only picks up files older than a certain age, so it doesn't grab any files that are still being written to. Alternatively, write a script that runs lsof on the directory and excludes any files it reports as open.
This command should find all files that were created between one and ninety minutes ago. While there should be no files that were created more than sixty minutes ago in this specific directory (they're archived hourly), I made it ninety minutes for a margin of safety.
I'll post after the next hourly job if this works. Thanks for the idea!
Quote:
This command should find all files that were created between one and ninety minutes ago. While there should be no files that were created more than sixty minutes ago in this specific directory (they're archived hourly), I made it ninety minutes for a margin of safety.
I'll post after the next hourly job if this works. Thanks for the idea!
The above did not work. The results were the same as in the initial post: only the last few minutes' worth of files were added to the tarball.
Last edited by abegetchell; 09-21-2006 at 01:53 PM.
Not quite sure why you require xargs here. Can't you put find in backticks as the final argument to tar? Doesn't your way create a new tarball iteratively, thus explaining why its size varies up and down as things proceed? Just speculating here, because I've never used xargs before, and I only think I know what it says in the man page.
find is still finding things and continuously piping them to tar as it finds them, which seems a little unnecessary. Perhaps pipe the find output to a file and wait until it's all done, then give the list to tar?
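The two-step approach could look roughly like this. It is only a sketch, assuming GNU tar and made-up paths under a temporary directory: find writes the complete list to a file first, and tar then reads that list with -T in one invocation, so files that appear afterwards are simply left for the next run.

```shell
#!/bin/sh
# Hypothetical demo paths; substitute the real directory, list file, and archive.
work=$(mktemp -d)
src="$work/spool"
list="$work/filelist.txt"
archive="$work/spool.tar.gz"
mkdir "$src"

# Create 500 small files to stand in for the busy directory.
i=1
while [ "$i" -le 500 ]; do
    : > "$src/file$i"
    i=$((i + 1))
done

# Step 1: let find finish and record the complete list before tar starts.
(cd "$src" && find . -type f > "$list")

# A file created after the list was written is not part of this run.
touch "$src/latecomer"

# Step 2: one tar invocation that reads the names from the list.
# -C makes tar resolve the relative names inside the source directory.
tar -C "$src" -czf "$archive" -T "$list"

archived=$(tar -tzf "$archive" | wc -l)
echo "archived $archived files"
```

A newline-separated list breaks on file names that contain newlines; if that is a concern, find's -print0 together with tar's --null option is the usual workaround.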
Quote:
Not quite sure why you require xargs here. Can't you put find in backticks as the final argument to tar? Doesn't your way create a new tarball iteratively, thus explaining why its size varies up and down as things proceed? Just speculating here, because I've never used xargs before, and I only think I know what it says in the man page.
--- rod.
Well, xargs is required to get around the "argument list too long" issue. A great description of that problem, and an example of why and how I'm using xargs, can be found here:
Quote:
find is still finding things and continuously piping them to tar as it finds them, which seems a little unnecessary. Perhaps pipe the find output to a file and wait until it's all done, then give the list to tar?
Yah, but...
Doesn't xargs invoke tar multiple times, and on each iteration, tar creates a new tarball, replacing any pre-existing one? The term "continuous" here seems to stretch the meaning, to me. The solution you point out later looks like the definitive solution.
Perhaps if the original xargs method used tar with the '-A' (append) option, rather than '-c' (create), the xargs solution would work.
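The overwrite effect is easy to reproduce on a small scale. The sketch below assumes GNU tar and uses made-up file names; xargs -n 2 forces a new tar process for every two names, imitating what xargs does on its own once the list outgrows the command-line limit. Note that in GNU tar the operation that adds files to an existing archive is -r (--append), whereas -A (--catenate) joins whole tar archives together.

```shell
#!/bin/sh
# Hypothetical demo directory with six small files.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir spool
for n in 1 2 3 4 5 6; do
    : > "spool/f$n"
done

# With -c every batch starts a fresh archive, so only the last batch survives.
find spool -type f | sort | xargs -n 2 tar -cf overwritten.tar
overwritten_count=$(tar -tf overwritten.tar | wc -l)

# With -r every batch is appended to the same (uncompressed) archive.
find spool -type f | sort | xargs -n 2 tar -rf appended.tar
appended_count=$(tar -tf appended.tar | wc -l)

echo "create: $overwritten_count entries, append: $appended_count entries"
```

Appending does not work on compressed archives, so with this approach the tarball would have to be compressed in a separate step (for example with gzip) once xargs has finished.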
Quote:
The right way to do that is to use filesystem snapshots.
I don't know exactly which Linux file systems have that feature reliably available, though; on Solaris, both UFS and ZFS support it.
If the OP is using LVM, then it supports snapshots (LVM2). Otherwise, unionfs can be installed and used on top of whatever underlying filesystem is there to create your snapshots.
Quote:
Doesn't xargs invoke tar multiple times, and on each iteration, tar creates a new tarball, replacing any pre-existing one? The term "continuous" here seems to stretch the meaning, to me. The solution you point out later looks like the definitive solution.
Perhaps if the original xargs method used tar with the '-A' (append) option, rather than '-c' (create), the xargs solution would work.
--- rod.
I tried the -A method, but given that this is a new tarball, that method wouldn't work. I suppose I could "pre-create" a tarball and then add files to it, but I am first going to try the method that puffinman suggests above. Getting ready to implement it now.
Quote:
If the OP is using LVM, then it supports snapshots (LVM2). Otherwise, unionfs can be installed and used on top of whatever underlying filesystem is there to create your snapshots.
LVM? LVM?! We ain't got no stinkin' LVM!
I can't mess around with this system too much with regard to major system changes, as it is a very, very busy production system. I haven't researched unionfs at all, but I imagine implementing it is not a trivial task.