LinuxQuestions.org
Old 06-08-2016, 06:02 PM   #1
SaintDanBert
Senior Member
 
Registered: Jan 2009
Location: "North Shore" Louisiana USA
Distribution: Mint-20.1 with Cinnamon
Posts: 1,771
Blog Entries: 3

Rep: Reputation: 108
seeking safe way to run long (clock time) 'tar' command


How can I run a long-running tar operation so that it can "resume" after an interruption without a complete restart of the tar command?

NOTE -- Please focus on suggestions that rely on a tar-archive as the result. I know there are other options, but tar-archives are an integral part of our operation.

I know that there is tar --append, which will let me put more files at the end of an existing archive. Also, there is tar --concatenate, which will put one or more existing source tar-archives on the end of a target tar-archive.

What I'm really having trouble with is how to know what I've already processed and which files I have yet to process.

When I connect a "staff" or "family" laptop to my home-office network, my file server grabs a tar-archive snapshot of that laptop's files. These snapshots often have long run times for a variety of reasons -- exclusions, quantity and size of new or changed files, number of snapshots running at the same time, etc. A different group of reasons results in one or more tar operations getting interrupted before they are completed.

I hate to use a win-dose example, but we used to be able to do something like this:
  • set the ARCHIVE bit on a group of files
  • xcopy /M {source} {destination} where {source} has the bit set
  • during the xcopy /M, each successful action cleared the ARCHIVE bit.
  • if the xcopy /M was interrupted or failed, a simple command repeat would only process files that still have the bit set
{giggle} one frequent use of this was to copy files onto diskette or cartridge media that were often quite small. You would xcopy /M repeatedly. It would fail when your {destination} media was full, then you'd repeat until you had a pile of media holding all of your {source} files.{giggle -still}

Thanks in advance,
~~~ 0;-Dan
 
Old 06-08-2016, 08:20 PM   #2
jefro
Moderator
 
Registered: Mar 2008
Posts: 21,980

Rep: Reputation: 3624
I get the feeling that you may be able to pipe some transport that has an ability to resume into tar.
Might look at rsync, scp, wget, maybe even some ftp or even http.
Haven't done that myself.

In the end you don't really care how the files got there; you are simply saying that you eventually want a tar or compressed tar file, correct?
 
Old 06-10-2016, 10:29 AM   #3
SaintDanBert
Senior Member
 
Registered: Jan 2009
Location: "North Shore" Louisiana USA
Distribution: Mint-20.1 with Cinnamon
Posts: 1,771

Original Poster
Blog Entries: 3

Rep: Reputation: 108
Quote:
Originally Posted by jefro
I get the feeling that you may be able to pipe some transport that has an ability to resume into tar.
Might look at rsync, scp, wget, maybe even some ftp or even http.
Haven't done that myself.

In the end you don't really care how the files got there; you are simply saying that you eventually want a tar or compressed tar file, correct?
You are correct with one additional detail. I don't want (can't use) an implementation that relies on creation of one large archive that gets split after the fact. I know that I can (*) tar this folder, (*) tar that folder, (*) ..., (*) tar last folder to get a bunch of chunks. I would be able to restart the chunk that failed and continue with the remaining chunks. Such an implementation requires that each folder have similar content profiles. That is not my situation.

I'm hoping to find some technique that will checkpoint the operation in progress and enable me to resume if the operation gets interrupted or fails.
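
One way to get a checkpoint without relying on folders having similar content is to plan batches by cumulative file size and treat each finished batch archive as the checkpoint. The following is only a sketch under stated assumptions: the function name chunked_tar, the batch-NNNN naming, and the plan.done marker are all made up for illustration; the output directory must live outside the source tree; file names must not contain newlines; and tar must accept -T (GNU tar and bsdtar do). Each archive is written to a .part file and renamed only after tar succeeds, so an interrupted run never leaves a half-written file under the final name.

```sh
#!/bin/sh
# chunked_tar SRC_DIR OUT_DIR MAX_BYTES   (hypothetical helper, not a tar feature)
# Packs SRC_DIR into several .tar.gz files of roughly MAX_BYTES of input each.
# Rerunning after an interruption skips every batch that already finished.
chunked_tar() (
    src=$1; out=$2; max_bytes=$3
    mkdir -p "$out" || exit 1
    out=$(cd "$out" && pwd) || exit 1   # absolute path, because we cd below
    cd "$src" || exit 1

    # Plan the batches once; later runs reuse the same plan so that
    # batch boundaries do not move between runs.
    if [ ! -e "$out/plan.done" ]; then
        rm -f "$out"/batch-*.list
        n=1; total=0
        find . -type f | sort | while IFS= read -r f; do
            size=$(wc -c < "$f")
            # Start a new batch when this file would push us over the limit.
            if [ "$total" -gt 0 ] && [ $((total + size)) -gt "$max_bytes" ]; then
                n=$((n + 1)); total=0
            fi
            printf '%s\n' "$f" >> "$out/batch-$(printf '%04d' "$n").list"
            total=$((total + size))
        done
        touch "$out/plan.done"
    fi

    for list in "$out"/batch-*.list; do
        [ -e "$list" ] || continue             # no batches planned at all
        archive="${list%.list}.tar.gz"
        [ -e "$archive" ] && continue          # finished in an earlier run
        tar -czf "$archive.part" -T "$list" || exit 1
        mv "$archive.part" "$archive"          # rename marks the checkpoint
    done
)
```

A call such as chunked_tar /srv/laptops/dan /srv/snapshots/dan 1000000000 (both paths hypothetical) can simply be repeated until it returns success. A single file larger than the limit still gets a batch of its own, and files that change or vanish after planning will make tar fail for that batch, so remove the output directory to start a fresh snapshot.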

~~~ 0;-Dan
 
Old 06-10-2016, 11:14 AM   #4
Habitual
LQ Veteran
 
Registered: Jan 2011
Location: Abingdon, VA
Distribution: Catalina
Posts: 9,374
Blog Entries: 37

Rep: Reputation: Disabled
Quote:
Originally Posted by SaintDanBert
I'm hoping to find some technique that will checkpoint the operation in progress and enable me to resume if the operation gets interrupted or fails.

~~~ 0;-Dan
Smacks of rsync.
 
Old 06-16-2016, 02:26 PM   #5
SaintDanBert
Senior Member
 
Registered: Jan 2009
Location: "North Shore" Louisiana USA
Distribution: Mint-20.1 with Cinnamon
Posts: 1,771

Original Poster
Blog Entries: 3

Rep: Reputation: 108
Quote:
Originally Posted by Habitual
Smacks of rsync.
It's the interrupted creation of a compressed archive that is causing me grief.
  • I can tell tar to make chunks by declaring a tape size with tar --multi-volume --tape-length NNN ....
  • I can use all sorts of commands to feed lists of files to the tar command.
Here are the troubles:
  1. For any given list of files, how do you know what has already been processed before the interruption?
    (The old ms-dos command, xcopy, used file attributes to "mark files I've worked".)
  2. If tar gets interrupted, you are most likely left with a corrupt archive and must start over.
  3. There is not enough space to make a massive tar-ball and then split the large file.
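
For trouble 1, one possible answer is to let the finished archives themselves serve as the record of what has been processed. The sketch below is an illustration only: remaining_files is a made-up name, and it assumes the archives were created relative to the source directory (member names like ./path/file, as produced by tar -C SRC with names from find .) and that file names contain no newlines or other characters that tar quotes in listings. Any archive that tar cannot read to the end, such as one cut short by an interruption (trouble 2), is ignored, so its files show up as still to do.

```sh
#!/bin/sh
# remaining_files SRC_DIR ARCHIVE...   (hypothetical helper)
# Prints the files under SRC_DIR that are not yet in any readable archive.
remaining_files() (
    src=$1; shift
    done_list=$(mktemp) || exit 1
    all_list=$(mktemp) || exit 1
    trap 'rm -f "$done_list" "$all_list"' EXIT

    # Collect member names from every archive that lists cleanly.
    for archive in "$@"; do
        if members=$(tar -tzf "$archive" 2>/dev/null); then
            printf '%s\n' "$members"
        else
            echo "skipping unreadable archive: $archive" >&2
        fi
    done | sort -u > "$done_list"

    # Everything that currently exists in the source tree.
    (cd "$src" && find . -type f | sort) > "$all_list"

    # Lines only in the first list: files still waiting to be archived.
    comm -23 "$all_list" "$done_list"
)
```

The output can be saved to a file and handed to tar with -T to build the next chunk. Note that this compares names only; a file that changed after it was archived is not reported.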

ASIDE
A previous post mentioned one ancient situation when this sort of processing was important.
I'm sure there are other, more current, situations when this sort of data collection chunking applies.
If readers know of chunking applications, please add your tuppence.

Thanks in advance,
~~~ 0;-Dan
 
Old 06-16-2016, 03:40 PM   #6
Habitual
LQ Veteran
 
Registered: Jan 2011
Location: Abingdon, VA
Distribution: Catalina
Posts: 9,374
Blog Entries: 37

Rep: Reputation: Disabled
http://unix.stackexchange.com/questi...ich-was-killed
May offer one solution.

You're out of disk space, or not enough free space to produce a working archive?
Ask for help and post NO DETAILS?

Last edited by Habitual; 06-16-2016 at 04:04 PM.
 
Old 06-29-2016, 04:24 AM   #7
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,359

Rep: Reputation: 2751
Definitely rsync:

1. Checkpointing: basically, it compares the src and target file lists (by size and modification time by default, or by checksum if you pass --checksum) and figures out which files have changed since the last run (this includes identifying new files, which are deemed to have changed 100%)

2. Deltas: by default it only transmits differences, so for previously known files, it only sends the differences, which reduces bandwidth and speeds up transmission time. Obviously new files are sent completely.

3. Security: you can tell rsync to use ssh as the transport protocol

4. If you really want tar files (pref tar.gz to save size) at the end of the process, you can do that before or after transmission.
By the sounds of the limits on the sending system, I'd do it afterwards on the target end.

It has a fair number of options; some people even use it for local copying
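
A minimal sketch of point 4, done on the target end: mirror first with rsync, which can be rerun after any interruption, then build the tar.gz from the local mirror. The function name mirror_then_tar and all paths are assumptions for illustration, not part of rsync or tar. It uses only the rsync options -a, --partial (keep partially transferred files so a rerun can continue them) and --delete (drop files from the mirror that no longer exist on the source). It also assumes the server has room for the mirror plus one archive, and that the archive is written outside the mirror directory.

```sh
#!/bin/sh
# mirror_then_tar ORIGIN MIRROR_DIR ARCHIVE   (hypothetical helper)
# ORIGIN is anything rsync accepts, e.g. a local path or user@host:/path/
mirror_then_tar() (
    origin=$1; mirror=$2; archive=$3
    mkdir -p "$mirror" || exit 1

    # Safe to repeat: only new or changed files are transferred again.
    rsync -a --partial --delete "$origin" "$mirror/" || exit 1

    # Build the archive under a temporary name, then rename it, so an
    # interrupted tar never leaves a truncated file under the final name.
    tar -czf "$archive.part" -C "$mirror" . || exit 1
    mv "$archive.part" "$archive"
)
```

For example, mirror_then_tar dan@laptop:/home/dan/ /srv/mirror/dan /srv/snapshots/dan.tar.gz (hypothetical host and paths) copies over ssh; the trailing slash on the origin makes rsync copy the directory's contents rather than the directory itself.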
 
  

