LinuxQuestions.org
Old 12-15-2016, 01:55 PM   #1
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,871
Blog Entries: 1

Rep: Reputation: 1871
Append files to compressed .tar.gz archive


Well, it doesn't work, because it is compressed.
Code:
$ tar czf test.tgz 1.txt 2.txt
$ tar rzf test.tgz 3.txt 4.txt
tar: Cannot update compressed archives
This doesn't work either:
Code:
$ tar czvf test.tgz 1.txt 2.txt
$ tar czvf - 3.txt 4.txt >>test.tgz
$ tar tzf test.tgz # shows only 1.txt and 2.txt
The reason is the end-of-archive marker written by the first tar (meaning: 1024 bytes of binary zeros [edit: it is two 512-byte blocks, and tar then pads the output with zeros up to a full record, so the trailing run of zeros is usually longer]).

So the next question is: how to prevent tar from writing such a marker.
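
A quick way to check how many zero bytes a given tar writes at the end is to compare output sizes. This is only a sketch: it assumes GNU tar and coreutils, and the file name and temporary directory are placeholders.

```shell
# Sketch: measure the trailing zeros tar writes. Assumes GNU tar + coreutils.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'one\n' >1.txt

# Size with the default blocking factor (output is padded to a full record).
default_size=$(tar cf - 1.txt | wc -c | tr -d ' ')

# Size with -b 1: one 512-byte header, one data block, two zero blocks.
small_size=$(tar -b 1 -cf - 1.txt | wc -c | tr -d ' ')

# Count non-zero bytes in the last 1024 bytes of the -b 1 stream.
nonzero_tail=$(tar -b 1 -cf - 1.txt | tail -c 1024 | tr -d '\0' | wc -c | tr -d ' ')

echo "default: $default_size bytes, -b 1: $small_size bytes, non-zero bytes in tail: $nonzero_tail"
```

If the default size is much larger than the -b 1 size, the difference is record padding, not file data.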

I am going to check the source of tar, then keep blogging.

Comments are welcome.

PS: a gunzip + append + gzip sequence isn't a solution; it would be a waste of resources.

Edit: astrogeek suggested using option -i at untar-time. (Thank you, astrogeek.) It does work:
Code:
$ tar -i -tzf test.tgz # shows all four files
The problem is that the unlucky guy who has to use these files in the future might be someone else, when I'm long dead. So it would be better if he didn't need any extra options at untar time.

Edit: it is create.c:write_eot in the tar source. It is called unconditionally. (At least in non-incremental mode. I don't know what incremental mode is, and don't wish to know.) The next idea is creating some tar replacement (actually, I did that on the BS2000 platform some decades ago), or using dd(1) to remove the last 1024 bytes of the archive (the length is not guaranteed to be 1024; it might be less or more...)

Edit: now this "works":
Code:
$ tar cf - 1.txt 2.txt | head -c -8192 | gzip >test.tgz
$ tar cf - 3.txt 4.txt | head -c -8192 | gzip >>test.tgz
Okay, but where does this 8192 come from? How platform-independent is it? Clearly it isn't a solution, it is a hack.

Edit: Rather:
Code:
$ tar -b 1 -cf - 1.txt 2.txt | head -c -1024 | gzip >test.tgz
$ tar -b 1 -cf - 3.txt 4.txt | head -c -1024 | gzip >>test.tgz
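
The two commands above could be wrapped in a small helper for a daily script. This is a sketch, not a tested production script: it assumes GNU tar, GNU head (for the negative -c count) and gzip, and append_to_tgz is a made-up name.

```shell
# Sketch of an append helper built on the -b 1 / head pipeline.
# append_to_tgz is a hypothetical name, not part of tar.
append_to_tgz() {
    archive=$1
    shift
    # -b 1 makes the record a single 512-byte block, so the stream ends
    # with exactly the 1024-byte end-of-archive marker, which head strips.
    tar -b 1 -cf - "$@" | head -c -1024 | gzip >>"$archive"
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1
for n in 1 2 3 4; do printf 'file %s\n' "$n" >"$n.txt"; done

append_to_tgz test.tgz 1.txt 2.txt
append_to_tgz test.tgz 3.txt 4.txt

# Plain listing, without -i.
listing=$(tar tzf test.tgz)
echo "$listing"
```

The helper does not check whether tar itself failed, so a real script would want to verify the listing before deleting the source files.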

Last edited by NevemTeve; 12-16-2016 at 08:13 AM. Reason: more examples; incorporating suggestions
 
Old 12-15-2016, 02:01 PM   #2
Sefyir
Member
 
Registered: Mar 2015
Distribution: Linux Mint
Posts: 634

Rep: Reputation: 316
Did you try -r?

Code:
     -r, --append
           append files to the end of an archive
EDIT: Never mind, that is what doesn't work on compressed archives.

I don't know about wasting resources, but it looks like you may be able to parallelize it

http://askubuntu.com/a/578957

Last edited by Sefyir; 12-15-2016 at 02:07 PM.
 
1 members found this post helpful.
Old 12-15-2016, 02:16 PM   #3
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,871

Original Poster
Blog Entries: 1

Rep: Reputation: 1871
Hi, thank you for your answer. I think I should edit the Original Post to make it clearer.
 
Old 12-15-2016, 02:45 PM   #4
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,269
Blog Entries: 24

Rep: Reputation: 4196
The -A and -r options to tar only work with non-compressed archives.

Quote:
Originally Posted by NevemTeve
Also this doesn't work:
Code:
tar czvf test.tgz 1.txt 2.txt
tar czvf - 3.txt 4.txt >>test.tgz
The reason is the end-of-archive marker written by the first tar (meaning: 1024 byte binary zero).
Actually, you can use the above method to append to a compressed archive, but you must then untar using the -i option to ignore the end-of-archive markers.

The info page for tar includes some good notes on -A, -r and -i not found in the man page.

I see no way to suppress the end-of-archive markers during the creation or concatenation of compressed archives using only tar options.

Last edited by astrogeek; 12-15-2016 at 02:49 PM. Reason: better wording in last sentence
 
1 members found this post helpful.
Old 12-15-2016, 10:57 PM   #5
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,871

Original Poster
Blog Entries: 1

Rep: Reputation: 1871
Thanks, I've incorporated this option into the Original Post.
 
Old 12-16-2016, 01:10 AM   #6
Sefyir
Member
 
Registered: Mar 2015
Distribution: Linux Mint
Posts: 634

Rep: Reputation: 316
This strikes me as an XY problem.

Perhaps explain what you are trying to accomplish?
If you're taking frequent backups and want to continually update backup files, incremental backup may well be what you're looking for.
 
1 members found this post helpful.
Old 12-16-2016, 01:58 AM   #7
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,871

Original Poster
Blog Entries: 1

Rep: Reputation: 1871
No, these aren't the same files again and again; these files are generated at a rate of a few hundred per day, and I want to archive them [edit: after being processed] in monthly files (say save.2016.12.tar.xz); the archive script will run daily and will delete the archived files.

Last edited by NevemTeve; 12-16-2016 at 02:11 AM.
 
Old 12-16-2016, 07:19 AM   #8
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian, Arch
Posts: 3,784

Rep: Reputation: 2083
Why not make a non-compressed tarball of compressed files?
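
A minimal sketch of that idea, assuming gzip for the per-file compression and placeholder file names:

```shell
# Sketch: compress each file on its own, then keep a plain (uncompressed) tar.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for n in 1 2 3 4; do printf 'file %s\n' "$n" >"$n.txt"; done

# First run: compress the files, then create the archive.
gzip 1.txt 2.txt
tar cf test.tar 1.txt.gz 2.txt.gz

# Later run: -r works because the archive itself is not compressed.
gzip 3.txt 4.txt
tar rf test.tar 3.txt.gz 4.txt.gz

listing=$(tar tf test.tar)
echo "$listing"
```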
 
1 members found this post helpful.
Old 12-16-2016, 07:50 AM   #9
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,871

Original Poster
Blog Entries: 1

Rep: Reputation: 1871
Well, that would certainly be possible, but less efficient compression-wise: tar does have some overhead (the minimum size is 1024 bytes for every non-empty file).
 
Old 12-16-2016, 07:26 PM   #10
Sefyir
Member
 
Registered: Mar 2015
Distribution: Linux Mint
Posts: 634

Rep: Reputation: 316
Quote:
Originally Posted by NevemTeve
No, these aren't the same files again and again; these files are generated at a speed of few hundreds per day; and I want to archive them [edit: after being processed] in monthly files (say save.2016.12.tar.xz); the archive-script will run daily and it will delete the archived files.
So you're creating several hundred unique files a day and you're trying to back them up to a single tarball representing that month's files? I assume you keep the previous months (save.2016.11.tar.xz) as well?
Then keep the current month's tarball uncompressed and update it daily, with all the others compressed (so 09, 10, 11 compressed and 12 uncompressed). When the next month starts, compress the old one and create a new one.

That should be a good balance of performance, compression and ease of implementation.
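
A rough sketch of that scheme follows. It assumes GNU tar; the archive name mirrors the save.2016.12 naming from the thread, add_to_month is a made-up helper name, and gzip stands in for xz only to keep the sketch dependency-free.

```shell
# Sketch: keep the current month uncompressed, compress once at rollover.
# add_to_month is a hypothetical helper name.
add_to_month() {
    archive=$1
    shift
    if [ -e "$archive" ]; then
        tar rf "$archive" "$@"   # append to the existing plain tar
    else
        tar cf "$archive" "$@"   # first run of the month
    fi
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1
for n in 1 2 3; do printf 'file %s\n' "$n" >"$n.txt"; done

current=save.2016.12.tar
add_to_month "$current" 1.txt 2.txt   # day 1
add_to_month "$current" 3.txt         # day 2

# Month rollover: compress the finished archive once.
gzip "$current"

listing=$(tar tzf "$current.gz")
echo "$listing"
```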

On a side note:
I'd be cautious about backing up this way. I don't know how well tar responds to errors. If an error corrupts the tarball during an update, you'll lose the whole month of data.
I might suggest rsyncing data from the "live" directory to a "backup" directory, continually adding files to it until the end of the month, tarballing + compressing it, deleting the extra files and thus reset for the next month.

Last edited by Sefyir; 12-16-2016 at 07:28 PM.
 
1 members found this post helpful.
Old 01-01-2017, 01:33 AM   #11
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 4,871

Original Poster
Blog Entries: 1

Rep: Reputation: 1871
Now that the file for December is finished, I reorganized it to see how much space I lose due to the daily append. (Note: actually it is 'xz', not 'gzip'.)

Code:
xzcat save.201612_orig.tar.xz | xz >save.201612_reorg.tar.xz 

-rw-rw-r-- 1 projects grp 359528 Dec 31 04:20 save.201612_orig.tar.xz
-rw-rw-r-- 1 projects grp 305820 Jan  1 08:15 save.201612_reorg.tar.xz
Guess it is acceptable.
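
For reference, the overhead implied by the two sizes listed above can be worked out like this (a sketch; the variable names are only illustrative):

```shell
# Sizes in bytes, copied from the ls output above.
orig=359528    # archive built from daily appended xz streams
reorg=305820   # same data recompressed as a single xz stream

# Extra space relative to the single-stream archive, as a percentage.
overhead=$(awk -v o="$orig" -v r="$reorg" 'BEGIN { printf "%.1f", (o - r) * 100 / r }')
echo "daily appends cost $((orig - reorg)) bytes, about ${overhead}% over a single xz stream"
```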
 