LinuxQuestions.org
Old 09-16-2009, 03:12 AM   #1
idaham
LQ Newbie
 
Registered: Aug 2009
Posts: 26

Rep: Reputation: 0
How to convert binary file to ascii file?


Hello Linux experts!
I have a question about binary files. I've googled the problem but didn't find anything useful, so I figured I'd ask here...

I have a binary file (extension .bin) that I would like to convert to ASCII somehow. The data in the binary file are 32-bit floating-point numbers. So far I've done the conversion by reading the file into MATLAB as float32 and then saving it as ASCII, which gives a file with a nice column of numbers. I have a lot of these binary files, so I figured there must be a better way to do this directly in Linux! I run CentOS 5.3.

Any ideas? Many thanks!
Ida
 
Old 09-16-2009, 05:18 AM   #2
antegallya
Member
 
Registered: Jun 2008
Location: Belgium
Distribution: Debian
Posts: 109

Rep: Reputation: 42
Hello,
have a look at hexdump (man 1 hexdump)
Code:
hexdump -v -e '7/4 "%10d "' -e '"\n"' yourfile
will do what you seem to want: it reads 4 bytes at a time and prints them as one 32-bit integer, 10 characters wide and padded with spaces, 7 values per line separated by a space, so that the output fits in 80 columns.
 
Old 09-16-2009, 05:55 AM   #3
lutusp
Member
 
Registered: Sep 2009
Distribution: Fedora
Posts: 835

Rep: Reputation: 102Reputation: 102
You haven't said what problem this solves. Do you need the data to be in ASCII form so it can be put in an e-mail? That's one of the few remaining reasons to make this conversion. Or do you want it in the form of 32-bit integers instead of raw binary? That's a pretty good reason.

But the solution depends on what the data is. There are plenty of ways to efficiently pack binary data into something that passes muster as ASCII text, but it should answer a real need.

If you want a reasonably efficient solution that creates plain ASCII out of arbitrary binary data and the reverse, do this:

Code:
$ base64 < binary.file > text.file
-- and --

Code:
$ base64 -d < text.file > binary.file
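A minimal sketch of that round trip on a throwaway file, assuming GNU coreutils `base64` and `cmp` are installed. The file names and the scratch directory are placeholders, not anything from the thread. Note that this only makes the bytes safe to send as text; it does not turn floats into readable numbers.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
cd "$(mktemp -d)" || exit 1

# Five arbitrary bytes (written with octal escapes) stand in for a real binary file.
printf '\000\001\002\377\376' > sample.bin

base64 < sample.bin > sample.txt          # binary -> plain ASCII
base64 -d < sample.txt > roundtrip.bin    # ASCII -> binary again

cat sample.txt                            # prints AAEC//4=
cmp sample.bin roundtrip.bin && echo "round trip OK"
```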
 
Old 09-16-2009, 06:42 AM   #4
idaham
LQ Newbie
 
Registered: Aug 2009
Posts: 26

Original Poster
Rep: Reputation: 0
Hello and thanks for your help!
The binary data I'm working with is a 3D image, with different voxel values of course. These values are 32-bit floats, and I want to be able to extract them from the binary file. I want to save the voxel values in a .dat file whose first line gives the number of voxels in each direction (x,y,z), followed by a long sequence of voxel values from the top left corner of the image to the bottom right, e.g.

128, 128, 50
10
0
103
57
.
.

Cheers,
Ida

 
Old 09-16-2009, 01:04 PM   #5
lutusp
Member
 
Registered: Sep 2009
Distribution: Fedora
Posts: 835

Rep: Reputation: 102Reputation: 102
Well, IMHO this should have been your first post, because extracting textual representations of binary data is in my view an admirable goal (more portable over time, easier for a human to read, more easily archivable, etc.).

But ... we would need to know in detail how the data are represented in the binary file. Is it just a long, uninterrupted stream of 32-bit integers with no other structure? Are the integers signed or unsigned? What program created the data and does that program have the ability to read what it writes?

You've given some hints about the nature of the data, but to get the data out of the file I think we would need to know more about it than we do at the moment. Are the integers arranged as vectors (x,y,z), or is the position of the values part of a fixed-dimension spatial cube with no gaps? (If the latter, the file will be huge.)

It's important to understand that files like this don't generally contain what is in essence a representation of 3D space with no gaps in coverage (because it's so inefficient), in the form of a rectilinear data stream running from one corner to the diagonally opposite corner (arranged as lines). In any case, we would need to know this.

We would also need to know how the color data are coded in the 32-bit integers. And at this point I am tempted to say this might be more efficiently coded as a C or C++ program, just because we would have more control over how the data bytes are turned into 32-bit integers.

Finally, chances are the file has a header, telling us what's in the file. We would need to decode the header as a first step, if only to prevent it getting in the way of an accurate scan of the data.

Conclusion? This is a bit more complex than simply streaming binary into ASCII data.
 
Old 09-17-2009, 02:53 AM   #6
idaham
LQ Newbie
 
Registered: Aug 2009
Posts: 26

Original Poster
Rep: Reputation: 0
Hello again Paul!
Geez, I thought this was probably an easy thing, just a simple command in Linux that would do the trick... This goes to show you how little I know about Linux! =)
Well, I get the binary files from a kind of simulation program that simulates a human body, called XCAT. The program is precompiled, so I'm not sure how it works. The manual says, however, that the voxelized phantoms are saved as raw binary files with no header. There's no graphics package in that program, but I've opened the file in another medical imaging program (called amide) as a raw binary file, where I had to specify the x,y,z dimensions and the data type first. It looked really bad for any setting except 32-bit float, little endian, so... I'm sorry, I forgot to tell you that I know the dimensions of the 3D image beforehand! For this case it's 128,128,600. I'm not sure I can attach a file, because it's 37 MB. As I mentioned before, I get the correct values when I read it into MATLAB (fopen, then fread) and specify float32 for fread. I dunno if this helps at all...
I'm sorry if this thread is getting longer and longer! Thanks for taking so much time to help me,
/Ida
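
A quick sanity check on those numbers, sketched in shell: a headerless file of 32-bit floats with these dimensions should be exactly 128 x 128 x 600 x 4 bytes long, which is about 37.5 MiB and so consistent with the size quoted above. The file name `phantom.bin` is hypothetical; the comparison only runs if such a file exists.

```shell
nx=128; ny=128; nz=600                  # dimensions given in the post
expected=$((nx * ny * nz * 4))          # 4 bytes per 32-bit float
echo "expected size: $expected bytes"   # prints: expected size: 39321600 bytes

file=phantom.bin                        # hypothetical file name
if [ -f "$file" ]; then
    actual=$(wc -c < "$file")
    if [ "$actual" -eq "$expected" ]; then
        echo "size matches a headerless float32 volume"
    else
        echo "size differs by $((actual - expected)) bytes: check for a header or another data type"
    fi
fi
```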


 
Old 09-17-2009, 04:25 AM   #7
lutusp
Member
 
Registered: Sep 2009
Distribution: Fedora
Posts: 835

Rep: Reputation: 102Reputation: 102
A poster named 'Antegallya' posted a good solution yesterday. Today I created one that wasn't nearly as good, so here's a link to the best solution:

This thread, post #2

I have made a small change to the original, just for aesthetics (it assumes unsigned values and produces consistent row lengths):

Code:
$ hexdump -v -e '7/4 "%010u "' -e '"\n"' binary-file-name
But Antegallya's solution is perfectly satisfactory.

Last edited by lutusp; 09-17-2009 at 04:48 AM. Reason: deleted my own solution in favor of another
 
Old 09-17-2009, 10:52 AM   #8
idaham
LQ Newbie
 
Registered: Aug 2009
Posts: 26

Original Poster
Rep: Reputation: 0
Hi again!
I was able to (with some help!) write a C++ script to perform the conversion for me, and write it into a file. Here's the code if anyone is interested or has the same problem!

[CODE]
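#include <fstream>
#include <iostream>
#include <stdio.h>
#include <stdlib.h> // EXIT_FAILURE
#include <string.h> // strlen, strncpy, strcpy
using namespace std;

int main(int argc, char *argv[]) {

if (argc!=2) { printf("Wrong input!\n"); return EXIT_FAILURE; }

ofstream myfile;
char fileNameOut[200];
ifstream f;
float floatName;
const char * extension = ".dat";
char str[64]; // "%f" needs at most 47 characters plus the terminator for a float

// The output name replaces the 4-character extension, so the input name
// must have at least 4 characters and must fit into fileNameOut.
if (strlen(argv[1])<4 || strlen(argv[1])>=sizeof(fileNameOut)) {
printf("Wrong input!\n"); return EXIT_FAILURE; }
strncpy(fileNameOut, argv[1], strlen(argv[1])-4);
strcpy (fileNameOut+strlen(argv[1])-4, extension);
myfile.open (fileNameOut);

printf ("\nOpening file \"%s\"...\n", argv[1]);
f.open (argv[1], ios::binary);
if (!f.is_open()){
printf("Cannot open file!\n\n");
return 1; }
else{
f.seekg (0, ios::beg);
while (true){
f.read ( (char*)(&floatName), sizeof(floatName));
if (!f) break; // end of file: stop before writing a stale value

snprintf (str, sizeof(str), "%f", floatName);

myfile << str << "\n";

}
}
myfile.close(); f.close();
return 0;
}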
[\CODE]
Input to the program is the string with your path to the binary file.
Thanks for all help!
/Ida
 
Old 09-17-2009, 11:11 AM   #9
lutusp
Member
 
Registered: Sep 2009
Distribution: Fedora
Posts: 835

Rep: Reputation: 102Reputation: 102
You might want to post your program enclosed in code tags so it gets proper indentation -- this makes it readable in the discussion group format.

I have some comments -- one is about this:

Code:
f.read ( (char*)(&floatName), sizeof(floatName));
This implies that the data file contains 4-byte floats, not integers, so the method proposed by 'antegallya' and promoted by me wouldn't have worked (but see below). And I now realize you said very plainly that the file consisted of 4-byte floats in your very first post.
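
To see why the integer format gives meaningless numbers for float data, here is a small sketch using `od` from GNU coreutils (swapped in for hexdump only because it shows both interpretations with a one-letter change). It assumes a little-endian machine, as in the thread; `one.bin` is a made-up file holding the four bytes of the float 1.0.

```shell
cd "$(mktemp -d)" || exit 1

# 1.0 as an IEEE 754 single is 0x3F800000; in little-endian order the bytes are 00 00 80 3F.
printf '\000\000\200\077' > one.bin

od -An -td4 one.bin   # the bytes read as a 32-bit signed integer: 1065353216
od -An -tf4 one.bin   # the same bytes read as a 32-bit float: the value 1.0
```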

This line: --

Code:
f.seekg (0, ios::beg);
-- isn't needed, because the file has just been opened.

Now that I know the file contains 4-byte floats, this method will work also:

Code:
hexdump -v -e '7/4 "%f "' -e '"\n"'
Here is an example using it:

Code:
$ hexdump -v -e '7/4 "%f "' -e '"\n"' < binary.file > text.file
Again, this idea belongs to Antegallya.

This is not to disparage your programming efforts, but it seems there is a simple way to accomplish the same thing.
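
For the layout asked for earlier in the thread (a dimensions line followed by one value per line), the same hexdump idea can be wrapped in a small shell function. This is only a sketch: `bin2dat` is a made-up name, the dimensions string has to be supplied by hand because the raw file does not store it, and the `od` fallback assumes GNU coreutils. Values are read in the machine's native byte order.

```shell
# bin2dat FILE DIMS: write the 32-bit floats in FILE to a .dat file next to it,
# one value per line, preceded by the DIMS line.
bin2dat() {
    in=$1
    dims=$2
    out="${in%.bin}.dat"
    {
        echo "$dims"
        if command -v hexdump >/dev/null 2>&1; then
            hexdump -v -e '1/4 "%f\n"' "$in"
        else
            # Fallback: GNU od, 4 bytes per output line, padding stripped.
            od -An -v -tf4 -w4 "$in" | tr -d ' '
        fi
    } > "$out"
}

# Demo on a two-value sample: 1.0 and -2.5 as little-endian floats.
cd "$(mktemp -d)" || exit 1
printf '\000\000\200\077\000\000\040\300' > sample.bin
bin2dat sample.bin "2, 1, 1"
cat sample.dat
```

For a directory of files that all share the same dimensions, something like `for f in *.bin; do bin2dat "$f" "128, 128, 600"; done` would convert them in one go.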
 
Old 09-18-2009, 04:19 AM   #10
idaham
LQ Newbie
 
Registered: Aug 2009
Posts: 26

Original Poster
Rep: Reputation: 0
Hello again!
I've posted the code again in a more readable form, as you suggested! =) And removed the unnecessary line too... Anyway, your solution (and antegallya's) is very good too, and a lot simpler! Thanks again, guys!
Ida

Code:
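#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <string>
using namespace std;

int main(int argc, char *argv[]) {

    if (argc != 2) { printf("Wrong input!\n"); return EXIT_FAILURE; }

    ofstream myfile;
    ifstream f;
    float floatName;
    char str[64];  // "%f" needs at most 47 characters plus the terminator for a float

    // Build the output name by replacing the 4-character extension
    // (e.g. ".bin") with ".dat".
    string fileNameIn = argv[1];
    if (fileNameIn.size() < 4) { printf("Wrong input!\n"); return EXIT_FAILURE; }
    string fileNameOut = fileNameIn.substr(0, fileNameIn.size() - 4) + ".dat";

    printf("\nOpening file \"%s\"...\n", argv[1]);
    f.open(argv[1], ios::binary);
    if (!f.is_open()) {
        printf("Cannot open file!\n\n");
        return EXIT_FAILURE;
    }
    myfile.open(fileNameOut.c_str());
    if (!myfile.is_open()) {
        printf("Cannot create file \"%s\"!\n\n", fileNameOut.c_str());
        return EXIT_FAILURE;
    }

    // Loop on the result of read() rather than eof(): eof() only becomes
    // true after a read has failed, which would write the last value twice.
    while (f.read((char*)(&floatName), sizeof(floatName))) {
        snprintf(str, sizeof(str), "%f", floatName);
        myfile << str << "\n";
    }

    myfile.close(); f.close();
    return 0;
}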
 
  

