[SOLVED] View binary files as black and white movies to assess randomness
It probably sounds crazy, and it probably is, but I want to view binary files as black and white movies, so that they would appear like the static on TV. I want to do this or something similar, because I want a way to better analyze "random" files or data.
I know there exist programs like ENT and some others that use statistics to tell me if a file is likely to be random or not. They work well for distinguishing random and non-random files, but they don't work well for distinguishing random from pseudorandom files. I was thinking, maybe viewing it in the form of a movie or visual display of some kind (other than hexedit), may help. In theory a pattern would show up for pseudorandom data, while there would be no pattern for random data.
I will also accept other suggestions / solutions that accomplish the same goal. However, I think the project would still be cool to accomplish.
My current thoughts are to use some graphics engine to put pixels on a display based on the binary 1/0 of a file. Not sure how efficient this will be, so I may need to try something else. I know some C and bash. I've searched for programs like this, but I don't see any. The closest I have found is the histogram program, but it may be in Russian or Ukrainian: http://sourceforge.net/projects/rand...urce=directory
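A minimal sketch of the one-pixel-per-bit idea, assuming a first look does not need a graphics engine at all: the raw PBM (P4) image format already stores one bit per pixel, with 1 as black and 0 as white, so prepending a small header to the file is enough. The function name `bits2pbm` and the default width of 512 are made up for illustration, and `head -c` is assumed to be available (it is in GNU and BSD userlands).

```sh
#!/bin/sh
# bits2pbm FILE [WIDTH] (hypothetical helper): write FILE to stdout as a raw
# PBM (P4) image in which every bit is one pixel (1 = black, 0 = white).
bits2pbm() {
    file=$1
    width=${2:-512}                  # pixels per row; assumed default
    if [ $((width % 8)) -ne 0 ]; then
        echo "width must be a multiple of 8" >&2
        return 1
    fi
    size=$(wc -c < "$file")          # file size in bytes
    row_bytes=$((width / 8))         # P4 packs 8 pixels into each byte
    height=$((size / row_bytes))     # whole rows only; leftover bytes are dropped
    if [ "$height" -le 0 ]; then
        echo "file too small for this width" >&2
        return 1
    fi
    printf 'P4\n%d %d\n' "$width" "$height"
    head -c $((height * row_bytes)) "$file"
}

# Possible usage (sample.bin is a placeholder name):
#   bits2pbm sample.bin 512 > sample.pbm
```

The width has to be a multiple of 8 here because P4 pads every row to a whole byte; with any other width the header would not match the data.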
It should be fairly simple to write a C-Xlib program that will animate the random data.
Really though, at a conceptual level, you have a difficult question to answer: what width and height for your display? Any width and height you choose could potentially mask/obscure any recurring sequence of data.
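One way to probe the width question is to render the same data at several widths and compare the results. The sketch below is only an illustration under stated assumptions: it writes a raw PGM (P5) header by hand, one byte per pixel, and `pgm_at_width` is a made-up name. A sequence that repeats every p bytes shows up as vertical stripes when the width is a multiple of p and as slanted lines when the width is close to one, so a pattern that is invisible at one width may be obvious at another.

```sh
#!/bin/sh
# pgm_at_width FILE WIDTH (hypothetical helper): write FILE to stdout as an
# 8-bit grayscale PGM that is WIDTH pixels wide, one byte per pixel.
pgm_at_width() {
    file=$1
    width=$2
    size=$(wc -c < "$file")          # file size in bytes
    height=$((size / width))         # whole rows only
    if [ "$height" -le 0 ]; then
        echo "file too small for this width" >&2
        return 1
    fi
    printf 'P5\n%d %d\n255\n' "$width" "$height"
    head -c $((width * height)) "$file"
}

# Possible usage (data.bin and the list of widths are placeholders):
#   for w in 64 100 127 128 256; do
#       pgm_at_width data.bin "$w" > "view_$w.pgm"
#   done
```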
You would obviously rely heavily on either XDrawPoint() or XDrawPoints() for your drawing mechanism. Combine that with a sleep call after re-drawing the screen, and you have the basics of animation.
You could also look into double-buffering to make the animation "cleaner." I've got some code in a C++ project that I could try to convert to straight-C if you get stuck somewhere.
I want to start off by saying I do not think this will work, but it is a cool project.
I did something like this with a friend for prime numbers.
The plan we used for the program is pretty simple. The code is C# as he only uses Windows, though.
We created a bitmap based on a boolean array (prime, not prime).
We kept the dimensions of the bitmap variable.
You could do the same thing, but create a GIF instead, since you would want multiple frames.
Here is what I would suggest if you actually wanted to start this project:
Go find some code online that can create a gif based on input data (fractal programs should have such code).
Modify the code to support the input you are going to give it (you said binary).
Create some static test strings to make sure it generates as you would expect.
THEN start working on file input.
The file input should be fairly simple at that point.
May I also suggest viewing the data as bytes instead of bits? 8-bit color is cool, and the 2D array you read the data into can be used directly as the pixel data.
All in all, you can probably get the whole thing done in about 100 lines of your own code plus the GIF code.
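The multiple-frames idea can be tried before writing any GIF code by cutting the file into fixed-size chunks and saving each chunk as its own image. This is a rough sketch rather than the C# approach described above: it assumes one byte per pixel in raw PGM (P5) format, and `frame_count` and `frame_pgm` are made-up names.

```sh
#!/bin/sh
# frame_count FILE WIDTH HEIGHT (hypothetical helper): print how many complete
# WIDTH x HEIGHT frames fit in FILE at one byte per pixel.
frame_count() {
    size=$(wc -c < "$1")
    echo $((size / ($2 * $3)))
}

# frame_pgm FILE WIDTH HEIGHT N (hypothetical helper): write frame number N
# (counting from 0) to stdout as an 8-bit grayscale PGM.
frame_pgm() {
    file=$1
    width=$2
    height=$3
    n=$4
    frame_bytes=$((width * height))
    printf 'P5\n%d %d\n255\n' "$width" "$height"
    # Skip N whole frames, then copy exactly one frame.
    dd if="$file" bs="$frame_bytes" skip="$n" count=1 2>/dev/null
}

# Possible usage (sample.bin and the 320x240 frame size are placeholders):
#   total=$(frame_count sample.bin 320 240)
#   i=0
#   while [ "$i" -lt "$total" ]; do
#       frame_pgm sample.bin 320 240 "$i" > "$(printf 'frame_%04d.pgm' "$i")"
#       i=$((i + 1))
#   done
```

Only complete frames are counted, so any bytes after the last full frame are ignored. The numbered frames can then be paged through in an image viewer or handed to whatever animation tool is at hand.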
JPEG is a format with lossy compression. I would think that may be counter-productive to your goal; it would be better to use a format that uses either no compression or lossless compression.
Now correct me if I am wrong, but that is extremely slow.
For 18 MB of files it took me 30 seconds.
Were you planning on using it on large files?
Indeed it does take a while, and I may be using it on large files. I will look into optimizing the chain of commands, although netpbm tools are very fast from what I've seen in other uses.
Quote:
Originally Posted by TobiSGD
JPEG is a format with lossy compression. I would think that may be counter-productive to your goal; it would be better to use a format that uses either no compression or lossless compression.
It would be, but I'm not sure how to convert it to a movie. I'll look into it tho.
I have found the best solution. Thanks to all that helped.
The solution cannot be a movie, because as TobiSGD says, it must be uncompressed and lossless for an accurate view.
It also takes a long time to make such a movie. So, there's only one solution that works for me.
Code:
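#!/bin/sh
# Render a file as a square grayscale PGM on stdout, one byte per pixel.
if [ $# -ne 1 ]
then
    echo "Usage: $(basename "$0") input" >&2
    echo 'input: input file' >&2
    exit 1
fi
# Side of the largest square that fits in the file (GNU stat prints the size in bytes).
side="$(stat -c '%s' "$1" | awk '{ print int(sqrt($1)) }')"
# The script's exit status is rawtopgm's, so failures are visible to the caller.
rawtopgm "$side" "$side" "$1"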
This works for regular files only, and I made it write only to stdout so I can either redirect it to a file or pipe it to xv or display. It generates a square image, so the bytes left over after the largest square that fits are dropped. I don't know what else to do with them, but it's good enough.
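For a sense of how much a square image leaves out, the arithmetic can be checked directly. The helpers below are an illustrative sketch with made-up names: `square_loss` follows the same int(sqrt(size)) rule as the script, and `fixed_width_loss` shows one possible alternative, a fixed width with as many whole rows as fit. A square drops at most twice its side length in bytes, while a fixed width drops fewer bytes than one row.

```sh
#!/bin/sh
# square_loss SIZE (hypothetical helper): print "side dropped" for the largest
# square image that fits in SIZE bytes at one byte per pixel.
square_loss() {
    awk -v n="$1" 'BEGIN { s = int(sqrt(n)); print s, n - s * s }'
}

# fixed_width_loss SIZE WIDTH (hypothetical helper): print "height dropped"
# when the image is WIDTH pixels wide and uses only whole rows.
fixed_width_loss() {
    echo "$(( $1 / $2 )) $(( $1 % $2 ))"
}

# Worked example for a 1000-byte file:
#   square_loss 1000          prints "31 39"  (31 x 31 = 961 bytes shown)
#   fixed_width_loss 1000 32  prints "31 8"   (32 x 31 = 992 bytes shown)
```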
The color representation is harder to interpret, so I chose black and white. The conversion is lossless, fast, and accurate.
Thanks again to Guttorm for pretty much providing the solution, but the others did help me realize that this was the best solution.
I notice that non-random files have patterns in them, typically horizontal streaks. I've tested some PRNGs as well, and I can say that /dev/urandom, shred, and wipe all work very well. I've also used ENT to verify this. I think this program will come in handy when working with files that are supposed to be random and PRNGs or RNGs.
EDIT:
A histogram can be seen in GIMP when analyzing the image. It also helps.
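A rough text-mode counterpart to the GIMP histogram is to count how often each byte value occurs. This is only a sketch, and `byte_hist` is a made-up name; it relies on `od -An -v -tu1`, which prints every byte as an unsigned decimal number. For a reasonably large random file the 256 counts should come out roughly equal. A flat histogram is necessary but not sufficient, though: a file that just repeats the bytes 0 to 255 in order has a perfectly flat histogram and is not random at all, which is where the image view helps.

```sh
#!/bin/sh
# byte_hist FILE (hypothetical helper): print "value count" for every byte
# value that occurs in FILE, in ascending order of value.
byte_hist() {
    od -An -v -tu1 "$1" | awk '
        { for (i = 1; i <= NF; i++) count[$i]++ }
        END {
            for (b = 0; b < 256; b++)
                if (b in count) print b, count[b]
        }'
}

# Possible usage (sample.bin is a placeholder name):
#   byte_hist sample.bin | sort -k2,2n | head    # least frequent byte values
#   byte_hist sample.bin | wc -l                 # how many distinct values occur
```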
Last edited by H_TeXMeX_H; 03-14-2012 at 03:11 PM.