- #1
lpetrich
- 988
- 178
Associative Arrays in C/C++ -- Favorite Strategies?
I mean by that something like what Perl or Python or Ruby have, a.k.a. dictionary types. I can do it in Mathematica with the help of label[key] = value and Head and DownValues, but it's C/C++ where it is the most difficult of all.
I once had the task of searching an array with elements {params, data} -- find data for some params value. I used a speedup that consisted of making a hash table of
(hash value for params) -> (index in data array)
I got around the hash-collision problem by checking the element at the data-array index. If it has the right params value, then I've succeeded, but if not, then I'd search for params in the data array and update the hash table with the index that I found.
Most recently, in my Lie-algebra code, I've used the C++ Standard Template Library container "map". I wanted to see if some calculated root vector was already in some list of them, so I created a map object that contains the listed root vectors. Searching in the map object requires doing a compare operation, but I found a way to speed it up relative to an element-by-element compare.
When I add a root vector to the map object, I calculate and include a hash code for it. When it comes time to compare, I first compare the hash codes, then if they are equal, I do an element-by-element comparison. This strategy produces a rather noticeable speedup.
In summary, do a fast lookup or compare, and if that fails or returns equal, then do a slow but accurate one.
Are these reasonable strategies?
I mean by that something like what Perl or Python or Ruby have, a.k.a. dictionary types. I can do it in Mathematica with the help of label[key] = value and Head and DownValues, but it's C/C++ where it is the most difficult of all.
I once had the task of searching an array with elements {params, data} -- find data for some params value. I used a speedup that consisted of making a hash table of
(hash value for params) -> (index in data array)
I got around the hash-collision problem by checking the element at the data-array index. If it has the right params value, then I've succeeded, but if not, then I'd search for params in the data array and update the hash table with the index that I found.
Most recently, in my Lie-algebra code, I've used the C++ Standard Template Library container "map". I wanted to see if some calculated root vector was already in some list of them, so I created a map object that contains the listed root vectors. Searching in the map object requires doing a compare operation, but I found a way to speed it up relative to an element-by-element compare.
When I add a root vector to the map object, I calculate and include a hash code for it. When it comes time to compare, I first compare the hash codes, then if they are equal, I do an element-by-element comparison. This strategy produces a rather noticeable speedup.
In summary, do a fast lookup or compare, and if that fails or returns equal, then do a slow but accurate one.
Are these reasonable strategies?