In this little series of posts (probably two three), I’m going to address a problem that has happened to me yesterday and was a real blocker in my algorithm : how to get a hash table (HashSet or Dictionary) to work with keys that change their hash code over time.
I really like hash tables.
Looking up inside them is fast, they guarantee that the items or keys (respectively in a HashSet and in a Dictionary) are a set of unique elements, and implementing GetHashCode() and Equals() makes classes feel complete, like they tell everyone “to identify me, this subset of attributes is all you need to know”.
But by design, they make the assumption that every key (or item in a HashSet) has a hash code that does not change over time. In other words, once you add a key to a Dictionary, you can’t modify what uniquely identifies that key, or the entry in the Dictionary becomes unreachable. And if you ask the Dictionary if it contains your object, if it can’t find it, it’ll answer “nope”.
This may sound like not much of a problem. In fact, the default implementation of GetHashCode() uses something like the address of the object reference, so it just can’t change during the lifetime of the object.
But when you override it, you’ll usually XOR the identifying attributes or associations of your class. And sometimes, those can change; so the next time GetHashCode() will be called, a different value will be returned and all the hash tables you’re part of will be confused.
Thing is, HashSets and Dictionaries have no way of knowing when its elements’ hash code changes.
But what if the objects themselves knew when their identifying attributes change, and thus that the hash code will change too? And what if they could tell the hash-based collections they’re part of, “please re-hash me”?
I actually have a working implementation of this.
I’ll make a proper test case with a scenario where the problem occurs, and post it in part deux… Stay tuned!
Edit : Part two is here.