LinuxQuestions.org - c++ containers

- Programming (https://www.linuxquestions.org/questions/programming-9/)

- - c++ containers (https://www.linuxquestions.org/questions/programming-9/c-containers-777370/)

currently I have

Code:

char charr[40][80];

the data will be constantly changing.

I am wondering if a container would better suit this data.

From what I gather, a map container looks like it may work better than a vector, although a have not used maps. How would I make the key based on 2 values (X and Y). Would this be done with a 2D map?

...and furthermore I know vectors can be made 2D, but is this good prog. practice?, or is there a better way to represent multi-dimensional data?

Well I don't know much about practices, but I think if your application is such that you can easily supply indices to access data then array should be faster than map. You won't need a vector until you want to change size of vector on runtime.

Quote:

I know vectors can be made 2D, but is this good prog. practice?, or is there a better way to represent multi-dimensional data?

It is quite reasonable to have a vector of vectors, ie:

Code:

std::vector<std::vector<char> > charr;

Quote:

I am wondering if a container would better suit this data.

In C++, it is better practice to use an STL container. They have the advantage that they can be dynamically allocated, and vectors are only slightly slower than using the traditional array type. They are also more suitable where the array is a class member.

Quote:

From what I gather, a map container looks like it may work better than a vector, although I have not used maps. How would I make the key based on 2 values (X and Y). Would this be done with a 2D map?

You could consider a map if your array is sparse (ie, most of the values are not set). A map is significantly slower than an array, and also has memory overhead (if every element is set). Maps are normally only used if the keys/indices are unevenly distributed.

If you were to implement it as a map, it could be done in two ways:

Implement it as a one dimensional map, and combine the x and y coordinates as the key, for example, (x<<16)|y
Implement it as a map of maps (or vector of maps), eg

Code:

std::map<size_t, std::map<size_t, char> > charr;

There are other container types too; deciding which to use depends a lot on how the data is to be used. One of the nice features of the STL containers is that they have similar interfaces, so changing from one to another is often simple.

Quote:

Originally Posted by vendtagain (Post 3800813)

Code:

char charr[40][80];

the data will be constantly changing.

What does "data" mean in that phrase?

If it is just the contents that will be changing, the C Array you already have is the best structure for storing that data.

If the 40 and/or the 80 are subject to change, then you may be better off with something based on std::vector (such as the obvious solution, a vector of vectors).

Quote:

I am wondering if a container would better suit this data.

Only if there are requirements you didn't explain that make a container better suited.

Quote:

From what I gather, a map container looks like it may work better than a vector

What makes you think so?

If the 40 by 80 were much bigger and the use were sparse, then a map might be better. Otherwise, a map adds needless complexity and overhead.

Quote:

How would I make the key based on 2 values (X and Y).

You could have a map of maps. Usually that isn't a good solution, but it is simple to code and the excess overhead only matters if you care about performance (in most coding, you don't need to care much about performance).

The more general/professional approach (assuming a map is a good idea at all) is to define an object to contain the two parts of the key and define a comparison operator for that object and use that as the key to the map.

Quote:

...and furthermore I know vectors can be made 2D, but is this good prog. practice?, or is there a better way to represent multi-dimensional data?

It all depends on the nature of the multi-dimensional data.

A vector of vectors makes most sense when the data is fairly dense, both dimensions vary (can't be predicted at initial allocation time) and the dimensions are "ragged" meaning the second dimension is not consistent across the first dimension.

As johnsfine said, a vector of vectors is good for dense data and ragged dimensions. For strictly rectangular dimensions, there's Boost MultiArray.

Quote:

In C++, it is better practice to use an STL container.

This is why I have been looking at containers as opposed to an array.

Quote:

...and furthermore I know vectors can be made 2D, but is this good prog. practice?, or is there a better way to represent multi-dimensional data?

It all depends on the nature of the multi-dimensional data.

1.The maximum indexes will not be changing (x and y will always between 0 and n), and will not be "ragged"(it will be rectangular)
2.Many of the values in each specific index will be uninitialized or unimportant.
3.Values will be quickly changing(initialized, uninitialized, changing values), and will be scattered throughout.
4.The "important" data would start out small, and gradually increase over time.

Quote:

From what I gather, a map container looks like it may work better than a vector

What makes you think so?

#2 is the reason I though a map might work well

Quote:

If the 40 by 80 were much bigger and the use were sparse, then a map might be better. Otherwise, a map adds needless complexity and overhead.

How big would a database be for a map to come into play?

If this databank was very large, seems there would be a point where the "important" data would be better represented be a different container. A memory vs. speed.

for example: data starts out small and sparse across the array, but as time goes by, the data size increases (but is still sparse)...while it is small, a map would work well enough, but as it gets larger a vector would be quicker) Is this correct? and how would I know where this line would be drawn?

Quote:

Originally Posted by vendtagain (Post 3801626)

2.Many of the values in each specific index will be uninitialized or unimportant.

So it is sparse, but how sparse? Your data type is char, so it needs to be very sparse for a sparse storage method to be useful. If 5% were used and 95% were uninitialized, that still wouldn't be spares enough for std::map to be effective.

If it is much sparser than that, the choice of map vs. dense storage might be an interesting time/space tradeoff. But at 5% filled or more there isn't even a time/space tradeoff; a map is just worse.

Quote:

3.Values will be quickly changing(initialized, uninitialized, changing values), and will be scattered throughout.

If they change from used to unused and if that change must be reflected in the storage to keep it sparse, then that fact adds to the time overhead of a map.

Quote:

4.The "important" data would start out small, and gradually increase over time.

Usually it is best to design the storage based on the heavier load, so answer the "how sparse it is" question at near max usage, not at the beginning.

Quote:

How big would a database be for a map to come into play?

For char's I wouldn't even think about a map unless there were tens of millions (or more) positions and less than 2% were filled.

If you know the size at compile time, a C array is simplest and best.

If you know the size at allocation time, a C array has some performance advantage over a vector, but maybe not enough to care about.

If you find out the size later, a vector is a lot simpler than a C array.

Quote:

Originally Posted by neonsignal (Post 3801308)

In C++, it is better practice to use an STL container.

It is better practice to use the right structure for each problem.

Many people I work with routinely use STL containers whenever they work, without thought to whether the use is appropriate. That often causes performance problems and sometimes even makes the code less readable.

One co worker asked me for performance tuning help on some problem code. I immediately noticed a vector of vectors of vectors, but only one of the three dimensions was unknown at compile time. Fixing just that detail fixed the performance problem, but it also reduced the number of lines of code and the length of many lines, making it possible to really see what the code did.