LinuxQuestions.org

idaham 09-16-2009 02:12 AM

How to convert binary file to ascii file?
 
Hello Linux experts!
I have a question about binary files. I've googled the problem, but didn't find anything useful so I figured I'd ask here...

I have a binary file (extension .bin) that I would like to convert to ASCII somehow. The data in the binary file are 32-bit floating-point numbers. So far I've done the conversion by reading the file into MATLAB as float32 and then saving it as ASCII, which gives a file with a nice column of numbers. I have a lot of these binary files, so I figured there must be a better way to do this directly in Linux! I run CentOS 5.3.

Any ideas? Many thanks!
Ida

antegallya 09-16-2009 04:18 AM

Hello,
have a look at hexdump (man 1 hexdump)
Code:

hexdump -v -e '7/4 "%10d "' -e '"\n"' yourfile
will do what you seem to want: it reads 4 bytes at a time and prints each group as one 32-bit integer, 10 characters wide and padded with spaces, 7 per line of output and separated by a space, so that each line fits in 80 columns.
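
For readers who want to see what that format string does, here is a rough C++ sketch of the same idea. It is not hexdump's actual implementation: the function name format_int32_rows is made up for illustration, it assumes the machine's native byte order, and it simply ignores trailing bytes that do not fill a whole 4-byte group.

```cpp
#include <cstdio>
#include <cstring>
#include <stdint.h>
#include <string>
#include <vector>

// Hypothetical helper: format raw bytes roughly the way '7/4 "%10d "' plus "\n" does,
// i.e. every 4 bytes become one signed 32-bit integer, 7 integers per output line.
// Assumes native byte order; leftover bytes (fewer than 4) are ignored in this sketch.
std::string format_int32_rows(const std::vector<unsigned char>& bytes) {
    std::string out;
    char field[16];
    size_t count = bytes.size() / 4;
    for (size_t i = 0; i < count; ++i) {
        int32_t value;
        std::memcpy(&value, &bytes[i * 4], sizeof(value));  // copy 4 bytes into an int
        std::snprintf(field, sizeof(field), "%10d ", static_cast<int>(value));
        out += field;
        // Start a new line after every 7th value and after the last value.
        if ((i + 1) % 7 == 0 || i + 1 == count) out += "\n";
    }
    return out;
}
```

Each field is 10 characters plus one space, so a full row of 7 values takes 77 columns.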

lutusp 09-16-2009 04:55 AM

You haven't said what problem this solves. Do you need the data to be in ASCII form so it can be put in an e-mail? That's one of the few remaining reasons to make this conversion. Or do you want it in the form of 32-bit integers instead of raw binary? That's a pretty good reason.

But the solution depends on what the data is. There are plenty of ways to efficiently pack binary data into something that passes muster as ASCII text, but it should answer a real need.

If you want a reasonably efficient solution that creates plain ASCII out of arbitrary binary data and the reverse, do this:

Code:

$ base64 < binary.file > text.file
-- and --

Code:

$ base64 -d < text.file > binary.file

idaham 09-16-2009 05:42 AM

Hello and thanks for your help!
The binary data I'm working with is a 3D image, with different voxel values of course. These values are 32-bit floats, and I want to be able to extract these voxel values from the binary file. I want to save the voxel values in a .dat file with the first line describing the number of voxels in each direction (x,y,z), followed by a long sequence of voxel values from the top left corner of the image to the bottom right, e.g.

128, 128, 50
10
0
103
57
.
.

Cheers,
Ida

lutusp 09-16-2009 12:04 PM

Well, IMHO this should have been your first post, because extracting textual representations of binary data is in my view an admirable goal (more portable over time, easier for a human to read, more easily archivable, etc.).

But ... we would need to know in detail how the data are represented in the binary file. Is it just a long, uninterrupted stream of 32-bit integers with no other structure? Are the integers signed or unsigned? What program created the data and does that program have the ability to read what it writes?

You've given some hints about the nature of the data, but to get the data out of the file I think we would need to know more about it than we do at the moment. Are the integers arranged as vectors (x,y,z), or is the position of the values part of a fixed-dimension spatial cube with no gaps (if the latter, the file will be huge)?

It's important to understand that files like this don't generally contain what is in essence a representation of 3D space with no gaps in coverage (because it's so inefficient), in the form of a rectilinear data stream running from one corner to the diagonally opposite corner (arranged as lines). In any case, we would need to know this.

We would also need to know how the color data are coded in the 32-bit integers. And at this point I am tempted to say this might be more efficiently coded as a C or C++ program, just because we would have more control over how the data bytes are turned into 32-bit integers.

Finally, chances are the file has a header, telling us what's in the file. We would need to decode the header as a first step, if only to prevent it getting in the way of an accurate scan of the data.

Conclusion? This is a bit more complex than simply streaming binary into ASCII data.

idaham 09-17-2009 01:53 AM

Hello again Paul!
Geez, I thought this was probably an easy thing, just a simple command in Linux that would do the trick... This goes to show you how little I know about Linux! =)
Well, I get the binary files from a simulation program called XCAT that simulates a human body. The program is precompiled, so I'm not sure how it works. The manual says, however, that the voxelized phantoms are saved as raw binary files with no header. There's no graphics package in that program, but I've opened the file in another medical imaging program (called amide) as a raw binary file, where I had to specify the x,y,z dimensions and the data type first. It looked really bad for every setting except 32-bit float, little endian, so... I'm sorry, I forgot to tell you that I know the dimensions of the 3D image beforehand! In this case they are 128,128,600. I'm not sure I can attach a file, because it's 37 MB. As I mentioned before, I get the correct values when I read it into MATLAB (fopen, then fread) and specify float32 for fread. I dunno if this helps at all...
I'm sorry if this thread is getting longer and longer! Thanks for taking so much time to help me,
/Ida
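
Given those details (raw file, no header, 32-bit little-endian floats, dimensions known in advance), a quick sanity check is possible before converting anything: the file should be exactly nx*ny*nz*4 bytes long. The sketch below shows that check plus a way to decode one value that does not depend on the byte order of the machine running it. The names expected_bytes and decode_le_float are made up for illustration, and the code assumes float is an IEEE 754 single-precision value, as it is on common Linux platforms.

```cpp
#include <cstring>
#include <stdint.h>

// Hypothetical helper: size in bytes of a headerless nx*ny*nz volume of 32-bit floats.
uint64_t expected_bytes(uint64_t nx, uint64_t ny, uint64_t nz) {
    return nx * ny * nz * 4;
}

// Hypothetical helper: decode one little-endian IEEE 754 float from 4 raw bytes.
// The bit pattern is assembled numerically, so the result does not depend on
// whether the host is little- or big-endian.
float decode_le_float(const unsigned char* p) {
    uint32_t bits = static_cast<uint32_t>(p[0])
                  | (static_cast<uint32_t>(p[1]) << 8)
                  | (static_cast<uint32_t>(p[2]) << 16)
                  | (static_cast<uint32_t>(p[3]) << 24);
    float value;
    std::memcpy(&value, &bits, sizeof(value));  // reinterpret the bit pattern as a float
    return value;
}
```

For the 128 x 128 x 600 case this gives 39,321,600 bytes, which is about 37.5 MiB and fits the 37 MB mentioned above.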


lutusp 09-17-2009 03:25 AM

A poster named 'Antegallya' posted a good solution yesterday. Today I created one that wasn't nearly as good, so here's a link to the best solution:

This thread, post #2

I have made a small change to the original, just for aesthetics (it assumes unsigned values and produces consistent row lengths):

Code:

$ hexdump -v -e '7/4 "%010u "' -e '"\n"' binary-file-name
But Antegallya's solution is perfectly satisfactory.

idaham 09-17-2009 09:52 AM

Hi again!
I was able to (with some help!) write a C++ script to perform the conversion for me, and write it into a file. Here's the code if anyone is interested or has the same problem!

[CODE]
#include <fstream>
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
using namespace std;

int main(int argc, char *argv[]) {

if (argc!=2) { printf("Wrong input!\n"); return EXIT_FAILURE; }

ofstream myfile;
char fileNameOut[200];
ifstream f;
float floatName;
int i;
const char * extension = ".dat";
char * str = new char[30];

strncpy(fileNameOut, argv[1], strlen(argv[1])-4);
strcpy (fileNameOut+strlen(argv[1])-4, extension);
myfile.open (fileNameOut);

printf ("\nOpening file \"%s\"...\n", argv[1]);
f.open (argv[1], ios::binary);
if (!f){
printf("Cannot open file!\n\n");
return 1; }
else{
f.seekg (0, ios::beg);
while (true){
f.read ( (char*)(&floatName), sizeof(floatName));
if (!f) break; // EOF or short read: stop before writing a stale value

sprintf (str, "%f", floatName);

myfile << str << "\n";


}
}
delete [] str;
myfile.close(); f.close();
return 0;
}
[\CODE]
Input to the program is the string with your path to the binary file.
Thanks for all help!
/Ida

lutusp 09-17-2009 10:11 AM

Quote:

Originally Posted by idaham (Post 3687059)
Hi again!
I was able to (with some help!) write a C++ script to perform the conversion for me, and write it into a file. Here's the code if anyone is interested or has the same problem!

[ snip code listing ]

Input to the program is the string with your path to the binary file.
Thanks for all help!
/Ida

You might want to post your program enclosed in code tags so it gets proper indentation -- this makes it readable in the discussion group format.

I have some comments -- one is about this:

Code:

f.read ( (char*)(&floatName), sizeof(floatName));
This implies that the data file contains 4-byte floats, not integers, so the method proposed by 'antegallya' and promoted by me wouldn't have worked (but see below). And I now realize you said very plainly that the file consisted of 4-byte floats in your very first post. :(
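
To make the difference concrete: the same four bytes give completely different numbers depending on whether they are read as an integer or as a float. A small sketch (the function name bits_of is made up for illustration, and it assumes IEEE 754 single-precision floats):

```cpp
#include <cstring>
#include <stdint.h>

// Hypothetical helper: return the raw 32-bit pattern of a float as an unsigned integer.
// This is the number an integer format such as "%u" would print for those 4 bytes.
uint32_t bits_of(float value) {
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}
```

For example, bits_of(10.0f) is 1092616192 (0x41200000), so a voxel value of 10 would show up as 1092616192 in an integer dump.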

This line: --

Code:

f.seekg (0, ios::beg);
-- isn't needed, because the file has just been opened.

Now that I know the file contains 4-byte floats, this method will work also:

Code:

hexdump -v -e '7/4 "%f "' -e '"\n"'
Here is an example using it:

Code:

$ hexdump -v -e '7/4 "%f "' -e '"\n"' < binary.file > text.file
Again, this idea belongs to Antegallya.

This is not to disparage your programming efforts, but it seems there is a simple way to accomplish the same thing.

idaham 09-18-2009 03:19 AM

Hello again!
I've posted the code again in a more readable manner as you suggested! =) And removed the unnecessary line too... Anyway, your solution (and antegallya's) is very good too, and a lot simpler! Thanks again guys!
Ida

Code:

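#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <iostream>
using namespace std;

int main(int argc, char *argv[]) {

    if (argc != 2) { printf("Wrong input!\n"); return EXIT_FAILURE; }

    ofstream myfile;
    char fileNameOut[200];
    ifstream f;
    float floatName;
    const char * extension = ".dat";
    char str[64];  // large enough for any float printed with "%f"

    // Build the output name: replace the last 4 characters (".bin") with ".dat".
    size_t len = strlen(argv[1]);
    if (len < 4 || len >= sizeof(fileNameOut)) { printf("Wrong input!\n"); return EXIT_FAILURE; }
    strncpy(fileNameOut, argv[1], len - 4);
    strcpy (fileNameOut + len - 4, extension);

    printf ("\nOpening file \"%s\"...\n", argv[1]);
    f.open (argv[1], ios::binary);
    if (!f) {
        printf("Cannot open file!\n\n");
        return EXIT_FAILURE;
    }
    // Create the output file only once the input is known to be readable.
    myfile.open (fileNameOut);

    // Read one 4-byte float at a time (native byte order). The loop ends at EOF
    // or on a short read, so the last value is not written twice.
    while (f.read ( (char*)(&floatName), sizeof(floatName))) {
        snprintf (str, sizeof(str), "%f", floatName);
        myfile << str << "\n";
    }

    myfile.close(); f.close();
    return 0;
}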


All times are GMT -5.