I have a nice algorithm in Perl that uses hashes and works fine, but it uses far too much memory. That's why I want to do the same in C. Searching for an implementation of Perl hashes in C, the best I found was the C++ STL map container. I would have preferred pure C, but maybe hash problems are easier to solve in C++.
I want to base my program on the following example code from the web. This example just creates, fills and then prints a hash.
But my problem now is that I have to sort the hash by value (not by key!). I tried this with the sort command, but it doesn't work and just produces a mass of errors when compiling. Does anybody know what's wrong with the sort command I inserted into the otherwise correctly functioning example?
#include <iostream>
#include <map>
#include <string>
using namespace std;
int main()
{
typedef map<string,float> StringFloatMap;
StringFloatMap coll;
// insert some elements into the collection
coll["VAT"] = 0.15;
coll["Pi"] = 3.1415;
coll["an arbitrary number"] = 4983.223;
coll["Null"] = 0;
StringFloatMap::iterator pos;
/* THIS IS THE COMMAND THAT DOES NOT WORK !!!!! */
sort(coll.begin(),coll.end());
/* ------*/
You can't std::sort a map. It is by definition sorted by key.
In addition, if, for whatever reason, you want a hashed map, use hash_map (a non-standard library extension; C++11 later standardized the equivalent as std::unordered_map).
I am not entirely sure what to suggest for your purposes -
maybe create another map, reversing the key and value,
or maybe just a std::set using the values,
or maybe create a class containing both your key and value and then use a std::vector and define two comparison functors to use with std::sort...
I am a little rusty on this stuff, but if you describe your requirements a bit more, someone else might come up with some suitable ideas.
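To make the vector suggestion concrete, here is a minimal sketch of one way it could look. It assumes the `map<string,float>` from the question; the names `Entry`, `by_value_desc` and `sorted_by_value` are made up for illustration. The idea is that std::sort needs random-access iterators and elements it can move around, which a map does not offer, so the entries are copied into a vector and that copy is sorted instead.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper type: one (key, value) entry copied out of the map.
typedef std::pair<std::string, float> Entry;

// Orders entries by value, largest first; equal values fall back to key order.
bool by_value_desc(const Entry& a, const Entry& b)
{
    if (a.second != b.second)
        return a.second > b.second;
    return a.first < b.first;
}

// Copies the map's entries into a vector and sorts that copy by value.
// The map itself is untouched and stays ordered by key.
std::vector<Entry> sorted_by_value(const std::map<std::string, float>& m)
{
    std::vector<Entry> entries(m.begin(), m.end());
    std::sort(entries.begin(), entries.end(), by_value_desc);
    return entries;
}
```

Calling `sorted_by_value(coll)` on the map from the question should give a vector that can be walked from front to back with the largest value first. The cost is one extra copy of every entry.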
Thanks for your reply! OK, I could use hash_map, which also looks good, but I don't know how it differs from the other map...
What I want to do is the following:
I have a protein sequence, that is, a string of characters, e.g. MAGRTGTGTGTGAAA
Now I want to find repeats in this string. To do this, I create hash entries for all possible and reasonable substrings, e.g. "MA", "AG", "GR", "TG", "GRTG", "TGAAA". For each occurrence of the same substring I increment the corresponding hash entry, so the "GT" entry would have the value 3 and so on. Because I'm mostly interested in the most frequent repeats, it would be most efficient to sort the hash by value now, so that I can iterate and get "GT"=3 for this string as the first element. I did this in Perl and it works fine, but because it is not memory efficient, I want to do this in C. But how can I sort the hash_map by its value?
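The counting step described above could be sketched roughly as follows. This is only an illustration of the idea, not the original Perl algorithm: the function name `count_substrings` and the length limits `min_len` and `max_len` are assumptions standing in for whatever "possible and reasonable" means for the real data, and `min_len` is assumed to be at least 1.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Counts every substring of seq whose length lies in [min_len, max_len].
// Overlapping occurrences are counted separately.
// Hypothetical helper; assumes min_len >= 1.
std::map<std::string, int> count_substrings(const std::string& seq,
                                            std::size_t min_len,
                                            std::size_t max_len)
{
    std::map<std::string, int> counts;
    for (std::size_t len = min_len; len <= max_len && len <= seq.size(); ++len)
        for (std::size_t start = 0; start + len <= seq.size(); ++start)
            ++counts[seq.substr(start, len)];   // operator[] starts new keys at 0
    return counts;
}
```

For the example string, `count_substrings("MAGRTGTGTGTGAAA", 2, 5)` should give 3 for "GT", as described. Note that every distinct substring becomes its own key, so memory use grows quickly as `max_len` increases.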
You just have a trade off here between various features.
My first instinct is to say that, in comparison to searching a DNA sequence, ordering the results is going to have a fairly small overhead, but that depends on the number of sequences you are looking for - it may be 10, but it may be 10,000. I know of one friend who was doing something similar, and they were only looking for a handful of sequences in sequences of hundreds of millions of codons, so creating another map would carry almost no overhead in comparison. (Bear in mind that it might, although it is unlikely, need to be a multimap, as two sequences could have the same count.)
Code:
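map<string, int> found;
// search and increment found[sequence] here
multimap<int, string> freq;
// multimap has no operator[], so insert (count, sequence) pairs instead
for (map<string, int>::const_iterator iter = found.begin(); iter != found.end(); ++iter)
    freq.insert(make_pair(iter->second, iter->first));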
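As a variation on the same idea, the sketch below orders the second container with std::greater so that plain forward iteration starts at the most frequent sequence. The names `FreqMap` and `invert_counts` are invented for this example.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical alias: counts as keys, ordered from largest to smallest.
typedef std::multimap<int, std::string, std::greater<int> > FreqMap;

// Builds a count -> sequence index from a sequence -> count map.
// A multimap keeps every sequence even when several share the same count.
FreqMap invert_counts(const std::map<std::string, int>& found)
{
    FreqMap freq;
    for (std::map<std::string, int>::const_iterator it = found.begin();
         it != found.end(); ++it)
        freq.insert(std::make_pair(it->second, it->first));
    return freq;
}
```

After `FreqMap freq = invert_counts(found);`, `freq.begin()` refers to an entry with the highest count.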
On the other hand, with large numbers of sequences you might just want to index each with a number in a std::vector (much as you are doing now but in a map) and create a vector of "sequence_counts" and a map for quick key lookup.
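e.g. a rough sketch
Code:
#include <algorithm>
#include <map>
#include <string>
#include <vector>
using namespace std;

struct seq_count
{
    string name;
    int count;
    seq_count(const string& s) : name(s), count(0) {}
    // overload operator< so that sort can order seq_counts by count
    bool operator< (const seq_count& rhs) const
    { return count < rhs.count; }
};

int main()
{
    map<string,int> seq_table;      // sequence -> index into histogram
    vector<seq_count> histogram;
    // you might want to call reserve with the number of sequences
    const char* seqs[] = { "GT", "TG", "AA" };   // placeholder sequences
    for (int i = 0; i < 3; ++i)
    {
        seq_table[seqs[i]] = i;
        histogram.push_back(seq_count(seqs[i]));
    }
    // search gene; for each sequence found, e.g. "GT":
    histogram[seq_table["GT"]].count++;
    // sort the sequences by value (ascending, so the most frequent is last;
    // the indices stored in seq_table no longer match after sorting)
    sort(histogram.begin(), histogram.end());
}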