Best Data Structure to store data in C
I need to load thousands of records in a text file into memory. The record comprises of two fields: one is number and other is string
Eg: 8914,BHR 9918,ORS Which is the best data structure to hold this data? The operations performed on this data would be: 1) Exhaustive Searching 2) Limited addition, addition of 5-10 records a day. Also, what will be the best technique to insert new record dynamically in this scenario without being restarting the program? Currently i am using IPC (Linux Message Queues) and Signals to indicate the C binary for availability of new records. |
A dynamic array sounds about right:
http://developer.gnome.org/glib/stable/glib-Arrays.html You probably already have GLib. Lyle. |
What type of searching are you doing? I suggest a balanced binary tree sorted based on either of the columns, depending on which is more important. Alternatively, you could represent each as a pair (e.g. struct pair { int val; char name[32]; };) and have two balanced binary trees, each sorted based on one of the columns, but holding a pointer to a struct pair (both trees representing a different ordering of the same data.) This is really only helpful if sorting order will help with your search, though.
Kevin Barry |
Quote:
Quote:
For exact match searching, a hash is best. If ordering matters or for partial matches on the beginning of a field, a tree structure such as red black or AVL is likely best. To search on either key, you might want a hash or tree for each. For more complicated fast partial match searching you might want a more complicated index. Quote:
So if "Exhaustive Searching" was really true, that could justify some serious performance consideration. But I think you might be talking about a problem so small that performance doesn't matter. |
Maybe you should consider the use of a RDB such as sqlite, which will do efficient searching for you, and allow you to focus on the business logic of your application, rather than the minutia of small data structures. This can potentially add value to your data by making it accessible through other applications, and provide relatively easy methods for complex search criteria. Creation, retrieval, update and deletion (crud) of data comes pretty much for free with an SQL database tool.
--- rod. |
What you want to minimize, in this "in memory" (sic...) data structure, is exactly one thing: virtual memory page-faults. Don't try to "save memory." Minimize the number of storage accesses required to locate a particular value.
But the previous suggestion, to throw the whole thing into an SQLite database file, is an excellent and superior suggestion. |
Quote:
Quote:
Quote:
I asked for good technique to inform a running C binary about availability of new records not for inserting new records. As i told earlier even, i am using signals and Message queues for same currently. So, i just want to confirm if this method is ok or something better can be use. And yes you are right, this problem is not that big. Currently i have thousands of data and this may extend to lakhs in coming time. So, should i continue using arrays or should i use dynamic data structure like hash. Please suggest. Thanks in advance. |
Hmmm, well according to https://secure.wikimedia.org/wikipedia/en/wiki/Lakh = 100,000 in Western notation, so for data as per OP, that should all fit in mem; they're not very big fields.
Assuming that the 1st field is ALWAYS unique, then a hash would be good for fast searching as described, OTOH SQLLite would be easier to interface with generally. If its always just a few new recs per day, then your current method of notification would work, but again using SQL would enable you to use other tools & be able to easily export as well. |
All times are GMT -5. The time now is 01:04 AM. |