LinuxQuestions.org

albert316 11-14-2011 10:55 PM

why `tr "\000" "\125" < /dev/zero | dd bs=1K count=1 of=01data` creates wrong size?
 
I tried the command "tr "\000" "\125" < /dev/zero | dd bs=1M count=1 of=01data", but the size of 01data is not 1 MB...

Code:

tr "\000" "\125" < /dev/zero | dd bs=1M count=1 of=01data
0+1 records in
0+1 records out
856064 bytes (856 kB) copied, 0.0115292 s, 74.3 MB/s

How can I create such a file with a size of exactly 1 MB?

Thanks.

albert316 11-15-2011 12:12 AM

Quote:

Originally Posted by albert316 (Post 4524185)
I tried the command "tr "\000" "\125" < /dev/zero | dd bs=1M count=1 of=01data", but the size of 01data is not 1 MB...

Code:

tr "\000" "\125" < /dev/zero | dd bs=1M count=1 of=01data
0+1 records in
0+1 records out
856064 bytes (856 kB) copied, 0.0115292 s, 74.3 MB/s

How can I create such a file with a size of exactly 1 MB?

Thanks.

using tr "\000" "\125" < /dev/zero | dd bs=1K count=1024 of=01data

colucix 11-15-2011 02:38 AM

It should do exactly what you're trying to do. The difference could be in the units of bs. K means 1024 bytes, whereas kB means exactly 1000 bytes. Take a look at the man page of dd:
Quote:

BLOCKS and BYTES may be followed by the following multiplicative suffixes: c =1, w =2, b =512, kB =1000, K =1024, MB =1000*1000, M =1024*1024, xM =M, GB =1000*1000*1000, G =1024*1024*1024, and so on for T, P, E, Z, Y.
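
To make the suffix difference in that excerpt concrete, here is a minimal sketch, assuming GNU dd (the file names and temporary directory are only for illustration). It reads from /dev/zero directly rather than through a pipe, so each read fills the whole block:

```shell
# Sketch, assuming GNU dd: compare the K (1024) and kB (1000) suffixes.
tmp=$(mktemp -d)

# bs=1K is one block of 1024 bytes.
dd if=/dev/zero of="$tmp/k" bs=1K count=1 2>/dev/null
# bs=1kB is one block of 1000 bytes.
dd if=/dev/zero of="$tmp/kb" bs=1kB count=1 2>/dev/null

k_size=$(wc -c < "$tmp/k")
kb_size=$(wc -c < "$tmp/kb")
echo "bs=1K  -> $k_size bytes"
echo "bs=1kB -> $kb_size bytes"

rm -r "$tmp"
```

The two files should come out at 1024 and 1000 bytes respectively, which is a much smaller difference than the shortfall reported in the first post.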

albert316 11-15-2011 07:42 AM

Quote:

Originally Posted by colucix (Post 4524310)
It should do exactly what you're trying to do. The difference could be in the units of bs. K means 1024 bytes, whereas kB means exactly 1000 bytes. Take a look at the man page of dd:

Thanks for your reply.

There is one thing I don't understand. Every time I execute "tr "\000" "\125" < /dev/zero | dd bs=1M count=1 of=QWER", the size of QWER is different: sometimes it is 7xKB, sometimes 8xxKB or 7xxKB...

colucix 11-15-2011 08:13 AM

Quote:

Originally Posted by albert316 (Post 4524533)
There is one thing I don't understand. Every time I execute "tr "\000" "\125" < /dev/zero | dd bs=1M count=1 of=QWER", the size of QWER is different: sometimes it is 7xKB, sometimes 8xxKB or 7xxKB...

Actually, your command should generate a file of exactly 1048576 bytes. Where do you see 7xKB or 8xxKB? Please show us your command and the long listing of the newly created file, e.g.
Code:

$ tr "\000" "\125" < /dev/zero | dd bs=1M count=1 of=QWER
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0199836 s, 52.5 MB/s

$ ls -l QWER
-rw-r--r--. 1 colucix users 1048576 Nov 15 14:10 QWER
$ stat QWER
  File: `QWER'
  Size: 1048576      Blocks: 2048      IO Block: 4096  regular file
Device: 809h/2057d    Inode: 1049722    Links: 1
Access: (0644/-rw-r--r--)  Uid: (  500/ colucix)  Gid: (  100/  users)
Access: 2011-11-15 14:10:16.747819246 +0100
Modify: 2011-11-15 14:10:16.761414586 +0100
Change: 2011-11-15 14:10:16.761414586 +0100


GazL 11-15-2011 09:31 AM

I'm seeing this too.

Made some headway. I understand what, but not why.....

Code:

gazl@slack:/tmp$ dd if=/dev/zero bs=1M count=1 of=/tmp/out1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.000897899 s, 1.2 GB/s
gazl@slack:/tmp$ cat /dev/zero | dd bs=1M count=1 of=/tmp/out2
0+1 records in
0+1 records out
98304 bytes (98 kB) copied, 0.00011617 s, 846 MB/s
gazl@slack:/tmp$ ls -l out*
-rw-r--r-- 1 gazl users 1048576 Nov 15 14:24 out1
-rw-r--r-- 1 gazl users  98304 Nov 15 14:24 out2

But count=2 gives us a clue:
Code:

gazl@slack:/tmp$ cat /dev/zero | dd bs=1M count=2 of=/tmp/out3
dd: warning: partial read (98304 bytes); suggest iflag=fullblock
0+2 records in
0+2 records out
196608 bytes (197 kB) copied, 0.000294146 s, 668 MB/s

Why we're not seeing that warning when count=1, I don't know, but from experimentation it seems that with piped input dd needs iflag=fullblock for any block size over 98304 bytes.

Code:

gazl@slack:/tmp$ cat /dev/zero | dd iflag=fullblock bs=1M count=1 of=/tmp/out4
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00135746 s, 772 MB/s
gazl@slack:/tmp$ ls -l out*                                                 
-rw-r--r-- 1 gazl users 1048576 Nov 15 14:24 out1
-rw-r--r-- 1 gazl users  98304 Nov 15 14:24 out2
-rw-r--r-- 1 gazl users  196608 Nov 15 14:25 out3
-rw-r--r-- 1 gazl users 1048576 Nov 15 14:27 out4
gazl@slack:/tmp$

Something feels broken here, but I really don't know what is going on (my system is heavily messed around with so it's quite possible I broke something myself)

colucix 11-15-2011 09:45 AM

Thanks GazL. I cannot reproduce this behaviour on my current system (CentOS 6 2.6.32-71.29.1.el6.i686), since it always gives me the same, correct result. Anyway, the fullblock workaround is fine: maybe it prevents some latency in the I/O stream.

GazL 11-15-2011 10:09 AM

I'd certainly consider fullblock nothing more than a workaround as it shouldn't be necessary and it suggests to me that something is broken.


I see this with both coreutils 8.11 and 8.12, with glibc-2.13 and kernel 3.1.1. Slackware64 13.37 (heavily messed with)

Problem is my system is a bit of a plaything: I have updated the kernel, headers, glibc and coreutils locally so the scope for having broken something myself is quite wide and it's going to make identifying the cause somewhat difficult.

rknichols 11-15-2011 10:14 AM

The "count=n" operand doesn't mean "n full blocks". When 'dd' performs a read from a pipe, the amount of data returned by the read cannot exceed the amount of data that was buffered in the pipe at that moment. If you haven't told dd to do something different (e.g., with the "fullblock" iflag), it's going to treat whatever it's got as a block. You'll get similar results from communications lines, tape drives with variable block size, ..., any device providing a data stream where all of the data you requested might not be immediately available.
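
A minimal sketch of that behaviour, assuming GNU dd (iflag=fullblock is a GNU extension; the temporary file names are only for illustration). The first pipeline lets dd issue a single read from the pipe, the second asks it to keep reading until the block is full:

```shell
# Sketch, assuming GNU dd: one read from a pipe versus iflag=fullblock.
tmpdir=$(mktemp -d)

# count=1 means one read of up to 1 MiB; a pipe may hand back less,
# so this file can end up shorter than 1048576 bytes.
tr '\000' '\125' < /dev/zero | dd bs=1M count=1 of="$tmpdir/short" 2>/dev/null

# With iflag=fullblock, dd repeats the read until the 1 MiB block is complete.
tr '\000' '\125' < /dev/zero | dd iflag=fullblock bs=1M count=1 of="$tmpdir/full" 2>/dev/null

short_size=$(wc -c < "$tmpdir/short")
full_size=$(wc -c < "$tmpdir/full")
echo "without fullblock: $short_size bytes"
echo "with fullblock:    $full_size bytes"

rm -r "$tmpdir"
```

The size without fullblock depends on how much data happened to be in the pipe, so it may differ from run to run and from system to system. The fullblock size should be 1048576 bytes.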

GazL 11-15-2011 10:25 AM

That explanation makes sense, but why does colucix's older dd work? Is this a change of behaviour in more recent versions of dd?

Also, if what you say is the case, then the 'partial read' warning ought to be displayed for any value of count=n on receipt of an incomplete block, which wasn't happening with count=1. That has to be considered a bug.

GazL 11-15-2011 12:16 PM

Found this: https://bugzilla.redhat.com/show_bug.cgi?id=668247
... which is talking about ibs= but seems related to this topic and confirms what rknichols was saying above.

So, looks like dd used to cope with this stuff, but they changed it at some point and this new behaviour is now considered "working as designed".

This looks like one to remember. I can see this catching a lot of people out.


On the plus side, it appears I haven't broken my system after all. ;)

rknichols 11-16-2011 11:08 AM

Quote:

Originally Posted by GazL (Post 4524670)
Also, if what you say is the case, then the 'partial read' warning ought to be displayed for any value of count=n on receipt of an incomplete block, which wasn't happening with count=1. That has to be considered a bug.

The problem is that a short read is normal if there is no more data to come, and dd cannot tell whether or not that is the case without going back and trying another read, which "count=1" (in the absence of "iflag=fullblock") has specifically told it not to do.

FWIW, the amount of data returned by that single read from the pipe appears to depend on the glibc version. On a system with glibc-2.11 I consistently see 8192 bytes. On systems with glibc-2.12 (Scientific Linux 6) and glibc-2.14 (Fedora 16) I see a variable amount. What's odd is that SL-6 should be the same as CentOS 6, which colucix reports as consistently yielding the full 1-megabyte block.
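
Since the size of a single read from the pipe varies like this, one way to sidestep the question is to let another tool count the bytes. A minimal sketch, assuming a head that supports -c (as GNU coreutils does); the temporary file is only for illustration:

```shell
# Sketch: build a 1 MiB file of 'U' bytes (octal 125) without depending
# on how much data a single read from the pipe returns.
out=$(mktemp)

# head -c copies exactly 1048576 bytes and then exits, which ends tr too.
tr '\000' '\125' < /dev/zero | head -c 1048576 > "$out"

size=$(wc -c < "$out")
# Delete every 'U' and count what is left; nothing should remain.
other=$(tr -d 'U' < "$out" | wc -c)
echo "size=$size, bytes other than U=$other"

rm -f "$out"
```

This is an alternative to iflag=fullblock, not a replacement for understanding why dd behaves as it does on pipes.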


All times are GMT -5.