Hash tables and mutable keys, final

I am really, really sick of seeing the preceding post and its cliffhanger on my blog’s front page, so here’s an abrupt end to it.

There is no sample/code, there won’t be any sample. But I do have one conclusion : If your algorithm needs a hash table with mutable keys, then don’t mutate their hash code.

There is absolutely no good reason for a hash set to have keys that get lost, and no point in hacking the data structure to trace all those swaps and recover from them.

Why do I say this after two lengthy posts about how to circumvent the problem? Simple. A hash set should be, first and foremost, a set. And sets are built upon the idea that all its elements are unique and there are no possible duplicates. But if an element can mutate enough to modify what uniquely identifies it, the operation potentially breaks this assumption. How can you know if your set has unique elements if the elements keep changing, and you can’t keep them from changing?

This leads me to the inevitable (yet predictable) conclusion that mutating keys is not a good idea. In my tests and situations, and I’ve tried for weeks to find a test case where sets mutable keys are absolutely needed, there is always a way around it, and it’s safer, it makes more sense.

So next time you see exceptions or problems of that nature (e.g. unreachable elements), think for a minute. Do you really need mutable keys? You probably don’t, but what if you do? Just don’t override GetHashCode() and move to something else.

There, problem solved. Moving on. :)

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.