LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Programming (https://www.linuxquestions.org/questions/programming-9/)
-   -   tar issues (https://www.linuxquestions.org/questions/programming-9/tar-issues-651315/)

rubadub 06-24-2008 11:39 AM

tar issues
 
Hi,
I'm investigating tar archives. I'm running Fed8, and I archived two files, one 28 bytes and the other 136 bytes; the archive turns out to be 10240 bytes. This seems strange?

From what I know, the structure should be a 512-byte header for the first file, then 512 bytes of data (28 bytes plus padding), then another 512-byte header for the second file, then another 512 bytes of data (136 bytes plus padding). And then I believe there should be two blocks of zeros at 512 bytes each. In total that should be 3072 bytes...
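
That per-member layout (512-byte header, data rounded up to 512 bytes, then two zero blocks) can be reproduced with Python's tarfile module; the names a.txt and b.txt here are stand-ins for the two files described:

```python
import io
import tarfile

def add_member(tf, name, size):
    """Add a member of the given size, filled with 'x' bytes."""
    info = tarfile.TarInfo(name)
    info.size = size
    tf.addfile(info, io.BytesIO(b"x" * size))

buf = io.BytesIO()
# ustar format, as in the thread
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
    add_member(tf, "a.txt", 28)    # 512-byte hdr + 512 bytes padded data
    add_member(tf, "b.txt", 136)   # 512-byte hdr + 512 bytes padded data

archive = buf.getvalue()
print(len(archive))                # 10240
print(all(b == 0 for b in archive[2048:]))  # True: past 2048 it's all zeros
```

So the meaningful content really does stop well short of 10240 bytes (2048 bytes of headers and data, followed by the zero end-of-archive marker); the rest is padding.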

What is this other data?

When I 'cat' the archive (both stored files are text), the dump ends at 3072 bytes, as described above, so I'm very bemused.

NB: yes, it is 'ustar' format, but other than the extra header info there shouldn't be a difference, should there?

Any insight please?

chrism01 06-25-2008 12:27 AM

Well, if the default block size on your disk is 4k (4096), which I believe is usual, then each file takes 4096 bytes, so 2 files = 8192,
+ (2 * 512 bytes per file header) + (2 * 512 bytes of zeros as you describe):

8192
1024
1024 +
-----
10240

rubadub 06-25-2008 05:55 AM

That starts to make sense, but why don't the header and the zeros require a full block each? And overall the archive is 2.5 blocks in size (if blocks are 4096)!

According to the tar manual, blocks are written at 512 bytes as standard in tar files, for portability (unless specifically stated otherwise).

I've just (semi) manually dumped each section of the tar file (using fseek):

start  what
-----------------
0:     hdr
512:   file (34)
1024:  hdr
1536:  file (136)

This dumped as expected, but what about the rest?

2048 + 1024 = 3072 in total

Surely this would fit within a single 4096 block, so why 2.5?
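
That walk can be automated. One thing worth noting: the size field in a tar header is stored as octal text, which is likely why the 28-byte file shows up as '(34)' in the dump (034 octal = 28 decimal). A sketch, again using a.txt and b.txt as stand-in names:

```python
import io
import tarfile

# Build a two-member ustar archive in memory (a.txt/b.txt are stand-ins).
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
    for name, size in (("a.txt", 28), ("b.txt", 136)):
        info = tarfile.TarInfo(name)
        info.size = size
        tf.addfile(info, io.BytesIO(b"x" * size))
raw = buf.getvalue()

# Walk the archive 512 bytes at a time, like the fseek dump above.
offset = 0
while raw[offset:offset + 512] != b"\0" * 512:
    header = raw[offset:offset + 512]
    name = header[0:100].rstrip(b"\0").decode()
    size = int(header[124:136].rstrip(b" \0"), 8)  # size field is octal text
    print(f"{offset}: hdr {name} ({size})")
    offset += 512 + -(-size // 512) * 512          # header + data padded to 512
print(f"content ends at {offset} of {len(raw)} bytes")
```

This prints headers at offsets 0 and 1024, with content ending at 2048; everything from there to the end of the archive is zeros.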

NOTE: (paranoia sets in...)
# rkhunter --check
This indicates no problems...
...
# chkrootkit
Quote:

Checking `lkm'... find: WARNING: Hard link count is wrong for /proc/1: this may be a bug in your filesystem driver. Automatically turning on find's -noleaf option. Earlier results may have failed to include directories that should have been searched.
Even though threads like this state that it is nothing to worry about, could it not be some kind of DKOM? Not that I'm an expert (obviously)...

chrism01 06-25-2008 06:01 PM

It also depends on how you measure a file. Some tools measure the 'size taken by a file' (in blocks); some measure the actual data size, e.g. 136 bytes...
From your description, tar adds 512 bytes for padding; those are 512 actual data bytes, nothing to do with disk-block size.
I haven't read the tar page.
Try man du, man df for more info.

rubadub 06-27-2008 05:11 PM

OK, after some more tests with the same results:

Total: 10240 bytes

4 blocks with content (2048 bytes)
16 blocks full of zeros (8192 bytes)


I went looking for more info, spent a while reading the 'info' pages, and then found a copy online (http://osr507doc.sco.com/cgi-bin/inf...gz)Top&lang=en), and this seems to explain it best:
Quote:

The data in an archive is grouped into blocks, which are 512 bytes.
Blocks are read and written in whole number multiples called "records".
The number of blocks in a record (ie. the size of a record in units of
512 bytes) is called the "blocking factor". The
`--blocking-factor=512-SIZE' (`-b 512-SIZE') option specifies the
blocking factor of an archive. The default blocking factor is
typically 20 (ie. 10240 bytes), but can be specified at installation.
To find out the blocking factor of an existing archive, use `tar --list
--file=ARCHIVE-NAME'. This may not work on some devices.
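
The arithmetic in that quoted paragraph can be written out as a quick sketch; archive_size is a hypothetical helper, not anything tar itself provides:

```python
def archive_size(file_sizes, blocking_factor=20):
    """Tar archive size: per member, a 512-byte header plus data padded
    to a multiple of 512; then two zero blocks as the end-of-archive
    marker; all rounded up to a whole record (blocking_factor * 512)."""
    content = sum(512 + -(-size // 512) * 512 for size in file_sizes)
    content += 2 * 512                       # end-of-archive marker
    record = blocking_factor * 512
    return -(-content // record) * record    # round up to a whole record

print(archive_size([28, 136]))                     # default factor 20 -> 10240
print(archive_size([28, 136], blocking_factor=1))  # -> 3072
```

With the default blocking factor of 20, the two tiny files land in a single 10240-byte record, matching the observed size; a blocking factor of 1 would give the 3072 bytes computed in the first post.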
Thanks for the help and thoughts...

