[SOLVED] View binary files as black and white movies to assess randomness

H_TeXMeX_H · 03-13-2012, 09:08 AM

It probably sounds crazy, and it probably is, but I want to view binary files as black and white movies, so that they would appear like the static on TV. I want to do this or something similar, because I want a way to better analyze "random" files or data.

I know there exist programs like ENT and some others that use statistics to tell me if a file is likely to be random or not. They work well for distinguishing random and non-random files, but they don't work well for distinguishing random from pseudorandom files. I was thinking, maybe viewing it in the form of a movie or visual display of some kind (other than hexedit), may help. In theory a pattern would show up for pseudorandom data, while there would be no pattern for random data.
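As a small illustration of that limitation (a sketch, not how ENT itself is implemented): a file that just counts 0, 1, ..., 255 over and over has a perfectly flat byte distribution, so a byte-frequency entropy estimate reports the maximum of 8 bits per byte even though the data is completely predictable. The names `byte_entropy` and `counter.bin` are made up for this example.

```shell
#!/bin/sh
# byte_entropy FILE: Shannon entropy of the byte-value distribution, in bits per byte.
byte_entropy() {
	od -An -v -tu1 "$1" | tr -s ' ' '\n' | awk '
		NF { count[$1]++; total++ }
		END {
			for (v in count) { p = count[v] / total; h -= p * log(p) / log(2) }
			printf "%.4f\n", h
		}'
}

# Build a fully predictable file: the byte values 0..255, repeated four times.
for rep in 1 2 3 4; do
	for i in $(seq 0 255); do printf "\\$(printf '%03o' "$i")"; done
done > counter.bin

byte_entropy counter.bin    # prints 8.0000
```

A frequency-based score cannot see the ordering of the bytes, which is exactly what a picture makes visible.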

I will also accept other suggestions / solutions that accomplish the same goal. However, I think the project would still be cool to accomplish.

My current thoughts are to use some graphics engine to put pixels on a display based on the binary 1/0 of a file. Not sure how efficient this will be, so I may need to try something else. I know some C and bash. I've searched for programs like this, but I don't see any. The closest I have found is the histogram program, but it may be in Russian or Ukrainian:
http://sourceforge.net/projects/rand...urce=directory

Guttorm · 03-13-2012, 09:30 AM

Hi

Some info here:

http://www.random.org/analysis/

To generate a picture, a simple solution would be to use the PNM format.

http://en.wikipedia.org/wiki/Netpbm_format

The format is very simple, and you can change the format with for example pnmtopng.
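To show how simple: a raw (binary) PGM is a short text header followed by one byte per pixel. A minimal sketch, assuming a shell with `head -c` available; the function name `raw_to_pgm` is made up for this example and is not part of netpbm.

```shell
#!/bin/sh
# raw_to_pgm WIDTH HEIGHT < data > image.pgm
# Hypothetical helper: wraps the first WIDTH*HEIGHT bytes of stdin in a raw PGM (P5) header.
raw_to_pgm() {
	printf 'P5\n%d %d\n255\n' "$1" "$2"   # magic number, dimensions, maximum gray value
	head -c "$(( $1 * $2 ))"              # one byte per pixel, copied unchanged
}

# Example: a 100x100 grayscale image of noise
raw_to_pgm 100 100 < /dev/urandom > noise.pgm
```

The result can be opened directly by most image viewers or passed on to pnmtopng.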

Dark_Helmet · 03-13-2012, 09:38 AM

It should be fairly simple to write a C-Xlib program that will animate the random data.

Really though, at a conceptual level, you have a difficult question to answer: what width and height for your display? Any width and height you choose could potentially mask/obscure any recurring sequence of data.
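A text-only illustration of the width problem, using nothing beyond `yes`, `tr`, `head` and `fold`: a made-up pattern that repeats every 7 symbols lines up into a clean vertical stripe at width 7, but smears into a diagonal at width 8.

```shell
#!/bin/sh
# A made-up stream that repeats every 7 symbols: '#' followed by six '.'
pattern() { yes '#......' | tr -d '\n' | head -c "$1"; }

echo "width 7 (matches the period):"
pattern 28 | fold -w 7; echo

echo "width 8 (does not match the period):"
pattern 32 | fold -w 8; echo
```

With real data the period is unknown, so it may be worth rendering the same file at several widths.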

A Beginner Xlib Tutorial

You would obviously rely heavily on either XDrawPoint() or XDrawPoints() for your drawing mechanism. Combine that with a sleep call after re-drawing the screen, and you have the basics of animation.

You could also look into double-buffering to make the animation "cleaner." I've got some code in a C++ project that I could try to convert to straight-C if you get stuck somewhere.

GamezR2EZ · 03-13-2012, 09:39 AM

I want to start off by saying I do not think this will work, but it is a cool project.
I did something like this with a friend for prime numbers.

The plan we used for the program is pretty simple. The code is C# as he only uses Windows, though.

We created a bitmap based on a boolean array (prime, not prime).
We kept the dimensions of the bitmap variable.

You could do the same thing, only create a GIF instead, as you would want multiple frames.

Here is what I would suggest if you actually wanted to start this project:
Go find some code online that can create a gif based on input data (fractal programs should have such code).
Modify the code to support the input you are going to give it (you said binary).
Create some static test strings to make sure it generates as you would expect.
THEN start working on file input.

The file input should be fairly simple at that point.
May I also suggest viewing the data as bytes instead of bits? 8-bit color is cool, and the 2D array you read the data into can be used directly as the pixel data.
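For comparison, the one-pixel-per-bit view can be sketched with the raw PBM variant of the Netpbm family, where each byte carries 8 pixels and a 1 bit is drawn black. The helper name `raw_to_pbm` is hypothetical, and the sketch assumes the width is a multiple of 8 so that rows need no padding.

```shell
#!/bin/sh
# raw_to_pbm WIDTH HEIGHT < data > image.pbm
# Hypothetical helper: wraps WIDTH*HEIGHT bits of stdin in a raw PBM (P4) header.
raw_to_pbm() {
	if [ $(( $1 % 8 )) -ne 0 ]; then
		echo "width must be a multiple of 8" >&2
		return 1
	fi
	printf 'P4\n%d %d\n' "$1" "$2"   # magic number and dimensions (no maxval line in PBM)
	head -c "$(( $1 / 8 * $2 ))"     # 8 pixels per byte
}

# Example: 640x480 pixels consume only 38400 bytes of input
raw_to_pbm 640 480 < /dev/urandom > bits.pbm
```

The byte view shows 8 times fewer pixels per byte of data, but each pixel carries a gray level instead of just on/off.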

All in all, you can probably get this whole thing done in about 100 lines of your own code plus the gif code.

Guttorm · 03-13-2012, 09:49 AM

Here's an example using the netpbm tools:

Code:

dd if=/dev/urandom bs=1 count=10000 | rawtopgm 100 100 | pnmtopng > picture.png

H_TeXMeX_H · 03-13-2012, 09:56 AM

Thanks for the ideas, I will consider each one. Currently I am thinking maybe I will generate a bitmap, convert it to jpeg, then to mjpeg.

Many thanks to Guttorm for the working code!

H_TeXMeX_H · 03-13-2012, 10:21 AM

Ok, here's a quick hack of what I wanted:

Code:

# Write 100 JPEG frames of 640x480 noise, then join them into a 20 fps MJPEG stream.
for i in $(seq 1 100); do dd if=/dev/urandom bs=4M | rawtoppm 640 480 | ppmtojpeg > "$(printf '%03d' "$i")".jpeg; done
ffmpeg -r 20 -sameq -i '%03d.jpeg' test.mjpeg

Fix as needed for varied effects.

I may still work on something that will let me play files directly via piping.

TobiSGD · 03-13-2012, 10:52 AM

JPEG is a format with lossy compression. I would think that may be counter-productive to your goal; it would be better to use a format that either does no compression or lossless compression.

GamezR2EZ · 03-13-2012, 10:52 AM

Now correct me if I am wrong, but that is extremely slow.
For 18 MB of files it took me 30 seconds.

Were you planning on using it on large files?

H_TeXMeX_H · 03-13-2012, 01:29 PM

Quote:

Originally Posted by GamezR2EZ

Now correct me if I am wrong, but that is extremely slow.
For 18 MB of files it took me 30 seconds.

Were you planning on using it on large files?

Indeed it does take a while, and I may be using it on large files. I will look into optimizing the chain of commands, although netpbm tools are very fast from what I've seen in other uses.
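One possible direction, as a sketch rather than a measured improvement: work out how many bytes one frame needs and cut the input into pieces of exactly that size, instead of asking dd for 4M blocks per frame. A 640x480 RGB frame needs 640*480*3 bytes. The function names and the `frame_` prefix are made up for illustration, `split -d` assumes GNU coreutils, and the netpbm step is left as a comment so the sketch runs without netpbm installed.

```shell
#!/bin/sh
# frame_bytes WIDTH HEIGHT: bytes needed for one RGB frame (3 bytes per pixel).
frame_bytes() { echo $(( $1 * $2 * 3 )); }

# split_frames FILE WIDTH HEIGHT: cut FILE into frame-sized pieces
# named frame_000, frame_001, ... in the current directory.
# The last piece may be short and would need padding or dropping.
split_frames() {
	split -b "$(frame_bytes "$2" "$3")" -d -a 3 "$1" frame_
}

frame_bytes 640 480    # prints 921600

# Each full piece could then be converted losslessly with, for example:
#   rawtoppm 640 480 frame_000 | pnmtopng > frame_000.png
```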

Quote:

Originally Posted by TobiSGD

JPEG is a format with lossy compression. I would think that may be counter-productive to your goal; it would be better to use a format that either does no compression or lossless compression.

It would be, but I'm not sure how to convert it to a movie. I'll look into it tho.

dugan · 03-13-2012, 11:21 PM

Why not convert the file to audio and then play it with an audio player that has a visualizer?

TobiSGD · 03-14-2012, 09:08 AM

Quote:

Originally Posted by H_TeXMeX_H

It would be, but I'm not sure how to convert it to a movie. I'll look into it tho.

Don't forget that your movie has to be uncompressed, too.

H_TeXMeX_H · 03-14-2012, 03:09 PM

I have found the best solution. Thanks to all that helped.

The solution cannot be a movie, because as TobiSGD says, it must be uncompressed and lossless for an accurate view.

It also takes a long time to make such a movie. So, there's only one solution that works for me.

Code:

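#!/bin/sh
# Render a file as a square grayscale image (raw PGM) on stdout, one pixel per byte.

if test $# != 1
then
	echo "Usage: $(basename "$0") input"
	echo 'input: input file'
	exit 1
fi

# Side length: integer square root of the file size in bytes (stat -c is GNU stat).
side="$(stat -c '%s' "$1" | awk '{ print int(sqrt($1)) }')"

# Bytes beyond side*side are not shown.
rawtopgm "$side" "$side" "$1"

exit 0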

This works for files only and I made it output only to stdout so I can either output to a file or pipe to xv or display. It truncates a bit because it generates a square image and I don't know what to do with the rest of the bytes, but it's good enough.
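One way to keep the leftover bytes, sketched under the assumption that padding the final row with zero bytes (black pixels) is acceptable: keep the width at the integer square root and round the height up. `dims` is a hypothetical helper, not part of netpbm.

```shell
#!/bin/sh
# dims SIZE: print "WIDTH HEIGHT PAD" for a file of SIZE bytes, where
# WIDTH is the integer square root, HEIGHT is rounded up so every byte fits,
# and PAD is the number of filler bytes needed to complete the last row.
dims() {
	awk -v n="$1" 'BEGIN {
		w = int(sqrt(n)); if (w < 1) w = 1
		h = int((n + w - 1) / w)      # ceiling of n / w
		print w, h, w * h - n
	}'
}

dims 10       # prints: 3 4 2
dims 10000    # prints: 100 100 0

# Possible use with the script above (not run here):
#   set -- $(dims "$(stat -c '%s' "$file")")
#   { cat "$file"; head -c "$3" /dev/zero; } | rawtopgm "$1" "$2"
```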

The color representation is harder to interpret, so I chose black and white. The conversion is lossless, fast, and accurate.
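The lossless part can be checked mechanically. A sketch, assuming the image is a raw (P5) PGM with one byte per pixel, so that its last side*side bytes are the pixel data; `check_lossless` is a made-up name, and the sample image is built by hand here as a stand-in for rawtopgm output.

```shell
#!/bin/sh
# check_lossless IMAGE INPUT SIDE
# Succeeds if the pixel data of IMAGE equals the first SIDE*SIDE bytes of INPUT
# (compared by checksum).
check_lossless() {
	n=$(( $3 * $3 ))
	[ "$(tail -c "$n" "$1" | cksum)" = "$(head -c "$n" "$2" | cksum)" ]
}

printf 'abcdefghij' > sample.bin                                   # 10 bytes, so side = 3
{ printf 'P5\n3 3\n255\n'; head -c 9 sample.bin; } > sample.pgm    # stand-in image
check_lossless sample.pgm sample.bin 3 && echo "pixels match input"
```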

Thanks again to Guttorm for pretty much the solution, but the others did help me realize that this was the best solution.

I notice that non-random files have patterns in them, typically horizontal streaks. I've tested some PRNGs as well, and I can say that /dev/urandom, shred, and wipe all work very well. I've also used ENT to verify this. I think this program will come in handy when working with files that are supposed to be random and PRNGs or RNGs.

EDIT:
A histogram can be seen in GIMP when analyzing the image. It also helps.
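A rough text-mode equivalent of that histogram, as a sketch using only `od`, `tr`, `grep`, `sort` and `uniq`: count how often each byte value occurs. The function name `byte_histogram` is made up.

```shell
#!/bin/sh
# byte_histogram FILE: print "COUNT VALUE" for every byte value present,
# most frequent first. VALUE is the byte as a decimal number (0..255).
byte_histogram() {
	od -An -v -tu1 "$1" | tr -s ' ' '\n' | grep -v '^$' | sort -n | uniq -c | sort -rn
}

printf 'aaab' > sample.txt
byte_histogram sample.txt    # 'a' is byte 97 (3 times), 'b' is byte 98 (once)
```

For random data of a decent size the counts should all be close to each other; a few values that dominate point to structure.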