LinuxQuestions.org
Old 06-24-2008, 11:39 AM   #1
rubadub
Member
 
Registered: Jun 2004
Posts: 236

Rep: Reputation: 33
tar issues


Hi,
I'm investigating tar archives. I'm running Fedora 8 and archived two files, one of 28 bytes and the other of 136 bytes, and the archive turns out to be 10240 bytes. This seems strange?

From what I know, the structure should be a 512-byte header for the first file, then 512 bytes of data (28 bytes + padding), then another 512-byte header for the second file, then another 512 bytes of data (136 bytes + padding). I believe there should then be two blocks of zeros at 512 bytes each. Therefore, in total it should be 3072 bytes...
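As a quick check of that arithmetic, here is a minimal Python sketch. The helper names `padded` and `expected_size` are made up for illustration, and it assumes one 512-byte header per member (true for short names in ustar) and ignores any record-level padding tar may add at the end.

```python
BLOCK = 512  # tar block size in bytes


def padded(size: int) -> int:
    """Round a file size up to a whole number of 512-byte blocks."""
    return -(-size // BLOCK) * BLOCK


def expected_size(file_sizes):
    """One header block per file, padded data, then two zero blocks."""
    members = sum(BLOCK + padded(s) for s in file_sizes)
    return members + 2 * BLOCK


print(expected_size([28, 136]))  # 3072
```

This only reproduces the 3072 figure from the reasoning above; it does not explain the 10240 bytes actually observed.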

What is this other data?

When I 'cat' the archive (because both stored files are txt) it dumps as described at 3072 bytes, so I'm very bemused.

NB: yes it is 'ustar' format, but other than the extra header info there shouldn't be a difference should there?

Any insight please?
 
Old 06-25-2008, 12:27 AM   #2
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,349

Rep: Reputation: 2750
Well, if the default block size on your disk is 4k (4096), which I believe is usual, then each file is 4096 bytes, so 2 files = 8192
+ (2 * 512 bytes per file hdr) +( 2 * 512 bytes of zeros as you describe...)

  8192
  1024
+ 1024
------
 10240
 
Old 06-25-2008, 05:55 AM   #3
rubadub
Member
 
Registered: Jun 2004
Posts: 236

Original Poster
Rep: Reputation: 33
That starts to make sense then, but why don't the header and zeros require a full block? And overall it is 2.5 blocks in size (if at 4096)!

According to the tar manual, blocks in tar files are written at 512 bytes as standard, for portability (unless specifically stated otherwise).

I've just (semi) manually dumped each section of the tar file (using fseek):

start   what
-----------------------------
    0:  hdr
  512:  file (34)
 1024:  hdr
 1536:  file (136)

This dumped as expected, but what about the rest?

2048 + 1024 = 3072 in total

Surely this would fit within a single 4096 block, so why 2.5?
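For anyone wanting to reproduce the offset dump without fseek, here is a rough Python sketch using the standard `tarfile` module. It builds a small ustar archive in memory instead of reading the real one, so the file names and contents (`a.txt`, `b.txt`) are placeholders, and Python's writer is assumed to behave like GNU tar with its default settings, which may not hold for every tar build.

```python
import io
import tarfile


def build_archive(files):
    """Write the given {name: bytes} mapping into an in-memory ustar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def member_offsets(archive):
    """Return (name, header offset, data offset, size) for each member."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
        return [(m.name, m.offset, m.offset_data, m.size) for m in tar.getmembers()]


archive = build_archive({"a.txt": b"x" * 28, "b.txt": b"y" * 136})
for name, hdr, data, size in member_offsets(archive):
    print(f"{hdr:5d}: hdr   {name}")
    print(f"{data:5d}: file  ({size})")
print("total archive size:", len(archive))
```

The header and data offsets come out at 0/512 and 1024/1536, the same layout as the manual dump, while the total length is larger than the 3072 bytes those sections account for.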






NOTE: (paranoia sets in...)
# rkhunter --check
This indicates no problems...
...
# chkrootkit
Quote:
Checking `lkm'... find: WARNING: Hard link count is wrong for /proc/1: this may be a bug in your filesystem driver. Automatically turning on find's -noleaf option. Earlier results may have failed to include directories that should have been searched.
Even though threads like this one state that it is nothing to worry about, could it not be some kind of DKOM? Not that I'm an expert (obviously)...
 
Old 06-25-2008, 06:01 PM   #4
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,349

Rep: Reputation: 2750
It also depends on how you measure a file. Some tools measure the 'size taken by a file' (in blocks), some measure the actual data size, e.g. 136 bytes...
From your description, tar pads each file's data out to 512 bytes; those are 512 actual data bytes in the archive, nothing to do with disk-block size.
I haven't read the tar page.
Try man du, man df for more info.
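The difference between the two measurements can be seen with a small Python sketch. It assumes a Unix-like system where `os.stat` exposes `st_blocks` in 512-byte units; the helper name `sizes` and the temporary file are purely illustrative, and the allocated figure depends on the filesystem.

```python
import os
import tempfile


def sizes(path):
    """Return (apparent size in bytes, bytes allocated on disk)."""
    st = os.stat(path)
    apparent = st.st_size           # what 'ls -l' reports
    allocated = st.st_blocks * 512  # roughly what 'du' reports (assumes 512-byte units)
    return apparent, allocated


# Write a 136-byte scratch file and compare the two figures.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"z" * 136)
    path = f.name
try:
    apparent, allocated = sizes(path)
    print("apparent:", apparent, "allocated:", allocated)
finally:
    os.unlink(path)
```

On many filesystems the allocated figure for such a small file is a whole disk block, but that is a property of how the file is stored, not of its contents.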
 
Old 06-27-2008, 05:11 PM   #5
rubadub
Member
 
Registered: Jun 2004
Posts: 236

Original Poster
Rep: Reputation: 33
OK, after some more tests with same results:

Total: 10240 bytes

4 blocks with content (2048 bytes)
16 blocks full of zeros (8192 bytes)
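A count like this can be reproduced with a short Python sketch. It is only an approximation of the test: the archive is built in memory with the standard `tarfile` module (placeholder names `a.txt` and `b.txt`), on the assumption that its default output matches what tar produced here, and `count_blocks` is a made-up helper.

```python
import io
import tarfile


def count_blocks(archive: bytes, block: int = 512):
    """Return (blocks with content, blocks that are entirely zero)."""
    zero = bytes(block)
    blocks = [archive[i:i + block] for i in range(0, len(archive), block)]
    zeros = sum(1 for b in blocks if b == zero)
    return len(blocks) - zeros, zeros


# Build a two-member ustar archive in memory (stand-in for the real file).
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    for name, data in (("a.txt", b"x" * 28), ("b.txt", b"y" * 136)):
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

content, zeros = count_blocks(buf.getvalue())
print(content, "blocks with content,", zeros, "blocks full of zeros")
```

To run it against a real archive, the bytes read from the file could be passed to `count_blocks` instead of the in-memory buffer.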


I went looking for more info, spent a while reading the 'info' pages, and then found a copy online (http://osr507doc.sco.com/cgi-bin/inf...gz)Top&lang=en); this seems to explain it best:
Quote:
The data in an archive is grouped into blocks, which are 512 bytes.
Blocks are read and written in whole number multiples called "records".
The number of blocks in a record (ie. the size of a record in units of
512 bytes) is called the "blocking factor". The
`--blocking-factor=512-SIZE' (`-b 512-SIZE') option specifies the
blocking factor of an archive. The default blocking factor is
typically 20 (ie. 10240 bytes), but can be specified at installation.
To find out the blocking factor of an existing archive, use `tar --list
--file=ARCHIVE-NAME'. This may not work on some devices.
Thanks for help and thoughts...
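The quoted explanation can be turned into a small calculation. This Python sketch assumes the layout discussed in this thread (one 512-byte header per file, data padded to 512 bytes, two terminating zero blocks) and then rounds the total up to a whole record; `archive_size` is a hypothetical helper, not part of tar.

```python
BLOCK = 512  # bytes per tar block


def archive_size(file_sizes, blocking_factor=20):
    """Archive size after padding the last record to blocking_factor blocks."""
    record = blocking_factor * BLOCK
    content = sum(BLOCK + -(-s // BLOCK) * BLOCK for s in file_sizes)
    unpadded = content + 2 * BLOCK          # add the two end-of-archive blocks
    return -(-unpadded // record) * record  # round up to a whole record


size = archive_size([28, 136])
print(size, "bytes =", size // BLOCK, "blocks")  # 10240 bytes = 20 blocks
print(archive_size([28, 136], blocking_factor=1))  # 3072
```

With the default factor of 20 the 3072 bytes of real structure are padded out to one full 10240-byte record, which accounts for the 16 zero blocks; a blocking factor of 1 would give the 3072 bytes originally expected.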
 
  

