LinuxQuestions.org
Old 05-03-2006, 07:59 AM   #1
sibtay
Member
 
Registered: Aug 2004
Location: U.S
Distribution: Ubuntu
Posts: 145

Rep: Reputation: 15
Serializing C structures


Hi

I am starting to work on serializing structures in C. Does anybody know of any related resources on the net?

I know for a fact that a C structure can be written to a file and then re-read successfully at some later stage (excluding the pointer issue for now).

However, I am thinking about developing a generic framework for this purpose, which requires consideration of a number of things, for example the varied behaviour of writing structures to disk on different platforms.

Who knows, this thing may go on to become a small open source project.

Anybody interested in or capable of helping, please comment/participate.

Thanks
 
Old 05-04-2006, 03:59 AM   #2
Hko
Senior Member
 
Registered: Aug 2002
Location: Groningen, The Netherlands
Distribution: Debian
Posts: 2,536

Rep: Reputation: 111
I would not be surprised if such a library already exists somewhere. But I do like the idea.

How about storing it architecture-independent in XML?
 
Old 05-04-2006, 06:38 AM   #3
ppanyam
Member
 
Registered: Oct 2004
Location: India
Distribution: Redhat
Posts: 88

Rep: Reputation: 15
When you write a structure in binary to disk, you are 'serializing' the whole structure to disk. You can read the structure back from the disk at any time. This is fairly basic. Or am I missing something in what you are saying?
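As a minimal sketch of the basic case described here, the following C code writes one structure's raw bytes to a stream and reads them back. The `struct record` type and the `record_save`/`record_load` names are invented for illustration; the approach assumes the same compiler, compiler options and machine on both the writing and the reading side, and a structure without pointers.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record type, invented for this example. It holds no pointers. */
struct record {
    int    id;
    double value;
    char   name[16];
};

/* Write the in-memory bytes of *r to f. Returns 0 on success, -1 on error. */
int record_save(FILE *f, const struct record *r)
{
    assert(f != NULL && r != NULL);
    return fwrite(r, sizeof *r, 1, f) == 1 ? 0 : -1;
}

/* Read one record's worth of bytes from f into *r. Returns 0 on success, -1 on error. */
int record_load(FILE *f, struct record *r)
{
    assert(f != NULL && r != NULL);
    return fread(r, sizeof *r, 1, f) == 1 ? 0 : -1;
}
```

The stream would normally be opened with `fopen(path, "wb")` for saving and `fopen(path, "rb")` for loading, so that no newline translation is applied to the bytes.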
 
Old 05-04-2006, 07:23 AM   #4
graemef
Senior Member
 
Registered: Nov 2005
Location: Hanoi
Distribution: Fedora 13, Ubuntu 10.04
Posts: 2,379

Rep: Reputation: 148
Quote:
Originally Posted by ppanyam
When you write a structure in binary to disk, you are 'serializing' the whole structure to disk. You can read the structure back from the disk at any time. This is fairly basic. Or am I missing something in what you are saying?
A generic approach is slightly more involved than just writing the structure to disk. Problems that will need to be overcome include:
  • The Endianness of the machine
  • Expanding pointers
  • Avoiding cyclic loops whilst following pointers
This means that knowledge of the structure is important, which is where XML could step in to provide that in a portable format.
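To illustrate the endianness point in a binary format, here is a rough sketch that fixes the byte order in the file instead of relying on the machine's own order. The function names `put_u32_be` and `get_u32_be` are made up for this example, and 8-bit bytes are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v in buf as 4 bytes, most significant byte first (big-endian),
 * regardless of the byte order of the machine running the code. */
void put_u32_be(unsigned char buf[4], uint32_t v)
{
    assert(buf != NULL);
    buf[0] = (unsigned char)((v >> 24) & 0xFFu);
    buf[1] = (unsigned char)((v >> 16) & 0xFFu);
    buf[2] = (unsigned char)((v >> 8) & 0xFFu);
    buf[3] = (unsigned char)(v & 0xFFu);
}

/* Rebuild the value from 4 big-endian bytes. */
uint32_t get_u32_be(const unsigned char buf[4])
{
    assert(buf != NULL);
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

Because the code only uses shifts and masks on the value, it never inspects how the machine lays out the integer in memory, so the same bytes come out on big-endian and little-endian hosts.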
 
Old 05-04-2006, 07:24 AM   #5
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682
I would suggest you read Chapter 5 of Eric S Raymond's "The Art of Unix Programming".

If possible, the output should be textual. Remember that different processors will save and read binary data differently. Some are big-endian, others are little-endian. Some use 32-bit integers and others use 64-bit. The same program compiled on different machines might produce non-portable files.

The format that works best depends on the type of information being saved. If you can use a simple Windows 3.1 INI format, that may be fine for your application. For structured records, consider how the /etc/passwd file is structured. XML may be the choice for complicated structures, but it is hard to work with using the standard textual tools such as grep, sed and awk. Also, using it could make a program unnecessarily bulky. Open Office uses it, but it is already bulky.
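For the structured-record case, a colon-delimited line in the spirit of /etc/passwd could look roughly like the sketch below. The `struct account` type, its fields and the function names are hypothetical, and the sketch assumes the name itself never contains a colon or a newline.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record, invented for this example. */
struct account {
    char name[32];
    int  uid;
    int  gid;
};

/* Format *a as "name:uid:gid\n". Returns 0 on success, -1 if buf is too small. */
int account_to_line(const struct account *a, char *buf, size_t size)
{
    assert(a != NULL && buf != NULL);
    int n = snprintf(buf, size, "%s:%d:%d\n", a->name, a->uid, a->gid);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Parse a line produced by account_to_line. Returns 0 on success, -1 on error. */
int account_from_line(const char *line, struct account *a)
{
    assert(line != NULL && a != NULL);
    /* %31[^:] reads at most 31 non-colon characters, leaving room for '\0'. */
    return sscanf(line, "%31[^:]:%d:%d", a->name, &a->uid, &a->gid) == 3 ? 0 : -1;
}
```

A file written this way stays usable with grep, sed and awk, for example by splitting fields on the colon.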

Last edited by jschiwal; 05-04-2006 at 07:38 AM.
 
Old 05-04-2006, 08:55 AM   #6
sibtay
Member
 
Registered: Aug 2004
Location: U.S
Distribution: Ubuntu
Posts: 145

Original Poster
Rep: Reputation: 15
Great input, guys, thanks.

I like Hko's idea of making the serialization architecture-independent. This could be a very cool medium- or long-term target. Initially, however, we have to deal with the problems highlighted by graemef.

Personally, I would like to deal with the problem of "expanding pointers". This also includes multiple sub-problems, such as handling multiple levels of indirection (e.g. what to do with an int*********?).

Serialization of PODs is not an issue at all. As we all know, a structure containing POD data types can simply be written to a file and then read back at some later stage. But when a structure contains a pointer, the data written to the file would be the address that the pointer contains, which is of no use to us.
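One possible way to "expand" a pointer, sketched here for the simple case of a string member, is to write the length followed by the pointed-to bytes instead of the address. The `struct person` type and the function names are invented for this example, and the raw `int` and `size_t` fields keep it single-platform only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure with one pointer member. */
struct person {
    int   age;
    char *name;   /* NUL-terminated string owned by the structure */
};

/* Write age, then the string length, then the string bytes (not the address). */
int person_write(FILE *f, const struct person *p)
{
    assert(f != NULL && p != NULL && p->name != NULL);
    size_t len = strlen(p->name);
    if (fwrite(&p->age, sizeof p->age, 1, f) != 1) return -1;
    if (fwrite(&len, sizeof len, 1, f) != 1) return -1;
    if (len > 0 && fwrite(p->name, 1, len, f) != len) return -1;
    return 0;
}

/* Read the fields back and allocate fresh memory for the string.
 * On success the caller must free(p->name). */
int person_read(FILE *f, struct person *p)
{
    assert(f != NULL && p != NULL);
    size_t len;
    if (fread(&p->age, sizeof p->age, 1, f) != 1) return -1;
    if (fread(&len, sizeof len, 1, f) != 1) return -1;
    if (len == SIZE_MAX) return -1;            /* len + 1 would overflow */
    p->name = malloc(len + 1);
    if (p->name == NULL) return -1;
    if (len > 0 && fread(p->name, 1, len, f) != len) {
        free(p->name);
        p->name = NULL;
        return -1;
    }
    p->name[len] = '\0';
    return 0;
}
```

Deeper indirection and linked structures would need the same idea applied recursively, plus some bookkeeping to avoid writing the same object twice; that part is not covered by this sketch.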

I am working on this alongside my busy schedule at work, so progress will be a little slow. However, I'll post on this forum as soon as I find something new, and I request the same from you guys.

If someone can find relevant reading material on the net, please post the link here.

Thanks,
Sibtay
 
Old 05-04-2006, 03:25 PM   #7
aluser
Member
 
Registered: Mar 2004
Location: Massachusetts
Distribution: Debian
Posts: 557

Rep: Reputation: 43
Quote:
Originally Posted by sibtay
Serialization of PODs is not an issue at all. As we all know, a structure containing POD data types can simply be written to a file and then read back at some later stage.
I'm not sure we all know this.
  • endianness
  • length
  • newlines
  • signedness
  • alignment
  • padding

char is the only type I can think of that doesn't have cross-platform problems when writing a structure to disk, and actually it kind of has problems too.

Or are ints and chars not what you meant by "POD"? I don't see the acronym a lot, but I guess it means "plain old datatype".
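The alignment and padding items in the list can be made visible with `sizeof` and `offsetof`. The following is only a sketch; `struct sample` and the two functions are invented for illustration, and the numbers reported depend on the compiler, its options and the target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical structure: a char, an int, then another char. */
struct sample {
    char tag;
    int  count;
    char flag;
};

/* Bytes in the structure that belong to no member (padding). */
size_t sample_padding(void)
{
    return sizeof(struct sample) - (sizeof(char) + sizeof(int) + sizeof(char));
}

/* Print the layout as seen by the current compiler and options. */
void sample_report(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "sizeof(struct sample) = %zu\n", sizeof(struct sample));
    fprintf(out, "offsetof(count)       = %zu\n", offsetof(struct sample, count));
    fprintf(out, "offsetof(flag)        = %zu\n", offsetof(struct sample, flag));
    fprintf(out, "padding bytes         = %zu\n", sample_padding());
}
```

On a typical platform with a 4-byte int the structure occupies more than the 6 bytes its members add up to, and those extra bytes end up in the file if the structure is written out raw.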
 
Old 05-04-2006, 04:31 PM   #8
graemef
Senior Member
 
Registered: Nov 2005
Location: Hanoi
Distribution: Fedora 13, Ubuntu 10.04
Posts: 2,379

Rep: Reputation: 148
I think that the idea is that an int would be saved in an XML-style format (hence as a character string) as follows:
<int>46</int>
whilst a double might be saved as:
<double>3.1415</double>

This addresses many of the internal problems of how data is stored. But it does mean that it is important to be able to find out what the datatype is, and that is not trivial: just consider the difference between char *, char [] and char[23]. How do you tell the three apart?
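A rough sketch of writing and reading such XML-style elements in C follows. The function names are invented for this example, the parsing is a plain `sscanf` rather than a real XML parser, and `%.17g` is used for doubles so the value survives the round trip, which makes the text longer than the 3.1415 shown above.

```c
#include <assert.h>
#include <stdio.h>

/* Write "<int>v</int>" into buf. Returns 0 on success, -1 if buf is too small. */
int xml_put_int(char *buf, size_t size, int v)
{
    assert(buf != NULL);
    int n = snprintf(buf, size, "<int>%d</int>", v);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Write "<double>v</double>" with enough digits to restore the same double. */
int xml_put_double(char *buf, size_t size, double v)
{
    assert(buf != NULL);
    int n = snprintf(buf, size, "<double>%.17g</double>", v);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Parse "<int>...</int>". %n is only stored if the closing tag matched too. */
int xml_get_int(const char *s, int *v)
{
    int end = 0;
    assert(s != NULL && v != NULL);
    if (sscanf(s, "<int>%d</int>%n", v, &end) != 1 || end == 0)
        return -1;
    return 0;
}

/* Parse "<double>...</double>" in the same way. */
int xml_get_double(const char *s, double *v)
{
    int end = 0;
    assert(s != NULL && v != NULL);
    if (sscanf(s, "<double>%lf</double>%n", v, &end) != 1 || end == 0)
        return -1;
    return 0;
}
```

Note that each function is tied to one type at compile time, which is exactly the difficulty raised above: the code has to be told what the datatype is, it cannot discover it from the data.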
 
Old 05-05-2006, 03:40 AM   #9
sibtay
Member
 
Registered: Aug 2004
Location: U.S
Distribution: Ubuntu
Posts: 145

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by aluser
I'm not sure we all know this.
  • endianness
  • length
  • newlines
  • signedness
  • alignment
  • padding

char is the only type I can think of that doesn't have cross-platform problems when writing a structure to disk, and actually it kind of has problems too

Or are ints and chars not what you meant by "POD"? I don't see the acronym a lot but guess it means "plain old datatype"
Rephrasing my earlier statement:

Serialization of PODs is not an issue at all. Limiting the implementation to a single platform for now, we all know that a structure containing POD data types can simply be written to a file and then read back at some later stage.
 
Old 05-05-2006, 07:32 AM   #10
ioerror
Member
 
Registered: Sep 2005
Location: Old Blighty
Distribution: Slackware, NetBSD
Posts: 536

Rep: Reputation: 34
Quote:
Serialization of PODs is not an issue at all. Limiting the implementation to a single platform for now, we all know that a structure containing POD data types can simply be written to a file and then read back at some later stage.
Of course it's an issue, even on a single system. Programs compiled with different compiler options/optimizations might use different alignment etc.

I suggest re-reading jschiwal's post several times. Output should be textual unless there is a very good reason to do otherwise.

A generic serialization library sounds pretty inefficient to me. An alternative would be to have some sort of program or script that generates the I/O functions for each structure at compile time. Have a look at the source for Freeciv; they have a Python script which does exactly this.
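This is not the Freeciv script, but a similar effect (per-structure I/O functions produced at compile time from one field list) can be sketched inside C itself with the preprocessor "X-macro" technique. Everything here, including `POINT_FIELDS`, `struct point` and the "name=value" text format, is invented for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Single field list for a hypothetical struct point:
 * X(type, name, printf format, scanf format) */
#define POINT_FIELDS(X)                 \
    X(int,    x,      "%d",    "%d")    \
    X(int,    y,      "%d",    "%d")    \
    X(double, weight, "%.17g", "%lf")

/* The structure definition is generated from the list. */
struct point {
#define X(type, name, out_fmt, in_fmt) type name;
    POINT_FIELDS(X)
#undef X
};

/* Generated writer: one "name=value" line per field. Returns 0 or -1. */
int point_write(FILE *f, const struct point *p)
{
    assert(f != NULL && p != NULL);
#define X(type, name, out_fmt, in_fmt) \
    if (fprintf(f, #name "=" out_fmt "\n", p->name) < 0) return -1;
    POINT_FIELDS(X)
#undef X
    return 0;
}

/* Generated reader: expects the fields in the same order. Returns 0 or -1. */
int point_read(FILE *f, struct point *p)
{
    assert(f != NULL && p != NULL);
#define X(type, name, out_fmt, in_fmt) \
    if (fscanf(f, " " #name "=" in_fmt, &p->name) != 1) return -1;
    POINT_FIELDS(X)
#undef X
    return 0;
}
```

Adding a field means adding one line to `POINT_FIELDS`; the structure, the writer and the reader all pick it up on the next compile. An external generator script is more flexible, for instance for nested structures, at the cost of an extra build step.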
 
Old 05-05-2006, 07:51 AM   #11
sibtay
Member
 
Registered: Aug 2004
Location: U.S
Distribution: Ubuntu
Posts: 145

Original Poster
Rep: Reputation: 15
By single platform I meant a single compiler, OS, hardware, etc.

Textual output is one good alternative, but it may prove expensive compared to binary output.

Data written as text *has* to be converted back to its binary form during reading, whereas data written as binary does not incur this overhead.
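To make that conversion step concrete, here is a small sketch of turning decimal text back into an int with `strtol`, including the error checks a reader of text data would need. The function name `parse_int` is made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Convert decimal text to an int.
 * Returns 0 on success, -1 if the text is not a number or does not fit. */
int parse_int(const char *text, int *out)
{
    char *end;
    long v;

    assert(text != NULL && out != NULL);
    errno = 0;
    v = strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return -1;                      /* no digits, or trailing characters */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;                      /* value out of range for int */
    *out = (int)v;
    return 0;
}
```

This parsing and checking is the per-field overhead of a text format; a binary format skips it but takes on the layout problems discussed earlier in the thread.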

Quote:
A generic serialization library sounds pretty inefficient to me
Again, with text vs. binary... the text version would be inefficient.

Quote:
An alternative would be to have some sort of program/script which would create the i/o functions for each structure at compile time. Have a look at the src for Freeciv, they have a python script which does exactly this.
Thanks, I'll check it out.

Currently I am not for or against any approach. I am just considering the merits and demerits of all possible approaches.

regards,
Sibtay
 
Old 05-05-2006, 08:46 AM   #12
ioerror
Member
 
Registered: Sep 2005
Location: Old Blighty
Distribution: Slackware, NetBSD
Posts: 536

Rep: Reputation: 34
Quote:
Textual output is one good alternative but it may prove expensive as compared to binary output.

Data written in the form of text *has* to be converted back to its binary form during reading. Whereas if data is written as binary you don't have to face this overhead.
Efficiency isn't the point with text files. Data files should preferably be human readable whenever possible (so that a person doesn't need special tools to read them), though of course, text files are not appropriate for everything.

What I meant was, it would be inefficient compared to a custom coded function that writes binary output (I should have been more specific).

Quote:
Currently i am not for/against any approach. Just considering the merits/demerits of all possible approaches.
Indeed. My objections were just off the top of my head, I haven't investigated the idea in any great depth.
 
  


All times are GMT -5. The time now is 12:58 PM.
