Array size in C++
I'm writing an algorithm to do a bit of signal processing. Nothing particularly complicated, just need to be able to handle a large volume of data and a funny input file format.
Gist of it is, the huge amount of input is systematically read a bit at a time and remapped into an array in memory, defined as: Code:
float modl[nz][nx1][nx2];
1. If it was a memory allocation error I'd expect to receive a malloc error rather than a Segmentation Fault.
2. Estimating the amount of memory the array should require as sizeof(float)*[3001]*[1047]*[1], I find it to be in the 3 MB ballpark. Since I'm running on a machine with 4 GB of memory this really shouldn't be a problem!
Yet, if I decrease the size of the array being defined, it works fine! :( Anyone any ideas? All help appreciated greatly! |
Quote:
4 x 3001 x 1047 x 1 = 12,568,188 bytes = 12M (in Microsoft math). As you will see from the post mentioned above, the default stack limit is 8M. |
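A minimal sketch of the difference, assuming the dimensions quoted above (an array declared inside a function lives on the stack, while calloc'd memory comes from the heap and is only limited by available memory):

Code:
#include <cstdlib>

int main()
{
    // float modl[3001][1047][1];   // ~12 MB on the stack -> likely to blow
                                    // the default 8 MB stack limit (segfault)

    // Heap allocation instead; calloc also zero-fills the block.
    float *modl = (float *) calloc (3001 * 1047 * 1, sizeof(float));
    if (modl == NULL)
        return 1;                   // calloc signals failure by returning NULL

    /* ... use modl ... */

    free (modl);
    return 0;
}

You can check the current stack limit with ulimit -s in the shell (reported in kilobytes) and raise it, but moving a buffer this size off the stack is the usual fix.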
Thanks David, you're a star... but can I beg one more question!
I hadn't realised there was a memory limit on how big an array you can define on the stack, so I'm now using calloc() to assign the memory and fill it with zeroes. When I want to write to that bit of memory, shouldn't I just do as below? Code:
float *modl = (float *) calloc (nz*nx1*nx2, sizeof(float));
Mark
PS. Yeah, maybe I forgot to multiply by 4 for sizeof(float). Doh! |
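calloc hands back one flat block, so the three-dimensional index has to be computed by hand. A minimal sketch, assuming row-major ordering with nz as the slowest-varying dimension (the names mirror the original declaration; the sample indices are just illustrative):

Code:
#include <cstdlib>

int main()
{
    const size_t nz = 3001, nx1 = 1047, nx2 = 1;

    /* In C++ the void* returned by calloc must be cast explicitly. */
    float *modl = (float *) calloc (nz * nx1 * nx2, sizeof(float));
    if (modl == NULL)
        return 1;

    /* modl[iz][ix1][ix2] from the original declaration becomes: */
    size_t iz = 10, ix1 = 20, ix2 = 0;
    modl[(iz * nx1 + ix1) * nx2 + ix2] = 1.0f;

    free (modl);
    return 0;
}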
Quote:
I stuck in some suggestions as comments in your code below: Quote:
|
Quote:
ta0kira |
std::vector anyone?
|
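For completeness, a sketch of the same buffer as a std::vector, which allocates on the heap, zero-initialises, and frees itself when it goes out of scope (same flat-index assumption as above):

Code:
#include <vector>

int main()
{
    const size_t nz = 3001, nx1 = 1047, nx2 = 1;

    // One flat, zero-initialised heap buffer; no free() needed.
    std::vector<float> modl(nz * nx1 * nx2, 0.0f);

    size_t iz = 10, ix1 = 20, ix2 = 0;
    modl[(iz * nx1 + ix1) * nx2 + ix2] = 1.0f;

    return 0;
}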
What I'm wondering about is this... do you really have some three million equal possibilities? Whenever this program runs for a while, is it really likely that every single one of those "[3001]*[1047]*[1]" buckets will be full?
Or could the actual data distribution be sparse? In the latter case, what if you consider the "address" of the input to be a 3-tuple, where the range of values in this particular case is ([0..3001], [0..1047], [0..1]) but might be somewhat smaller or larger the next time the program is run. So, what you use for the storage in this case is a hash table. The values are stored in the hash table using the 3-tuple as the hash key.

The advantage of this approach is that, as memory is allocated for the hash table as it grows, that memory will tend to be fairly contiguous even though the distribution of key values might not be. Let's say for instance that you take in about a hundred thousand unique numbers in a particular run. The hash table will have grown to about a hundred thousand entries, stored in a few more-or-less contiguous megabytes of virtual storage, and it will have done this no matter how widely scattered the key values might have been.

This will prevent the problem that would otherwise bring your program to its knees: thrashing. By pre-determining the location in virtual storage where each value is to be placed, you potentially force the application to suffer a page fault for every hit, if the key-value distribution is widely scattered. Thrashing can easily destroy your program, causing runs to take many hours or days... that ought to take seconds.

Bright(!) Idea... If you have any opportunity to exercise any sort of control over how the data is presented to you (i.e. if it's in a static file rather than coming in real-time), you can radically improve the situation by sorting the values first. Extract them from the funny input file, dump them into a temporary file in an easy-to-use format, and disk-sort that file. In doing so, you just might eliminate the need for "random access" altogether: identical data points are now adjacent in the file. Gaps can be found by comparing "this record" to "the previous one." And sorting is an unexpectedly fast algorithm. A 12- or even 120-megabyte file is "no big deal." "That's how they did it with punched cards," and the technique still works today! |
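A minimal sketch of the sparse idea, assuming std::unordered_map keyed by the (iz, ix1, ix2) tuple; the KeyHash mixing function below is just an illustration, not anything prescribed by the thread:

Code:
#include <cstdint>
#include <cstdio>
#include <tuple>
#include <unordered_map>

// The 3-tuple "address" described above.
using Key = std::tuple<int, int, int>;

// A simple hash that folds the three indices into one 64-bit value.
struct KeyHash {
    size_t operator()(const Key &k) const {
        uint64_t h = (uint64_t) std::get<0>(k);
        h = h * 1000003ULL + (uint64_t) std::get<1>(k);
        h = h * 1000003ULL + (uint64_t) std::get<2>(k);
        return std::hash<uint64_t>()(h);
    }
};

int main()
{
    // Memory grows only with the number of points actually present.
    std::unordered_map<Key, float, KeyHash> modl;

    modl[std::make_tuple(10, 20, 0)] += 1.0f;   // insert-or-accumulate a sample

    for (const auto &entry : modl)
        std::printf("(%d,%d,%d) = %g\n",
                    std::get<0>(entry.first), std::get<1>(entry.first),
                    std::get<2>(entry.first), entry.second);
    return 0;
}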