-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
BUG : add __hash__ to AffineBase #2909
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
python3 requires that if a class defines __eq__ then in must define __hash__ (so that if a == b then hash(a) == hash(b)). Define 64bit hash by pulling the first 8 characters out of a sha1 hash of the matrix closes matplotlib#2828
# hash by pulling the first 8 characters out of a sha1 hash of the matrix | ||
# see issue #2828 | ||
def __hash__(self): | ||
return int(hashlib.sha1(self.get_matrix()).hexdigest()[:8], 16) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a correct approach. But I wonder if it's faster to do:
hash(self.get_matrix().tostring())
(Don't know if it is, just though the code would spend more time in C that way...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed
In [51]: tt = np.random.rand(3, 3)
In [52]: %timeit int(hashlib.sha1(tt).hexdigest()[:8], 16)
1000000 loops, best of 3: 1.58 µs per loop
In [53]: %timeit hash(tt.tostring())
1000000 loops, best of 3: 270 ns per loop
I think this is the right approach, but it is subtlely different than the old Python 2.x behavior -- which is that these objects were hashed on their identity (their pointer in memory essentially), and not on their content. I think this is probably more correct, but we should verify there aren't unintended consequences (though beyond running the test suite, I'm not sure how to anticipate any of those consequences). |
@pelson: The PS backend maintains a dictionary where the keys are On Python 2, objects are always hashable, based on their memory pointer. On Python 3, objects are hashable unless they customize |
However, this opens up an issue that (existed before) that this could be badly abused else where in the code I don't know the transform stack well, but skimming the code It might be better to fix the problem where it is and use |
I think with this PR merged, the net effect will be that |
However it leaves with
With in this context it's ok (as the transforms should probably not be changing on the way out to disk) but is probably not good. |
I'm on my mobile so can't see everything within this PR, but as the transform is mutable it should only implement hash based on the object id and not its actual content. |
@pelson That introduces an inconsistency a different way, in that |
Sorry, I'm not following your notation - if I'm reading this correctly then you are raising concern that an object which is equal has a different hash? If so, that is a perfectly valid definition - anything else would mean we are computing hashes which are mutable, which is asking for trouble. This is also the existing behaviour in python2. |
The inconsistency of having equal objects with different hashes is the reason that py3k requires objects that define |
I am closing this in favor of #2927 |
python3 requires that if a class defines
__eq__
then in must define__hash__
(so that if a == b then hash(a) == hash(b)). Define 64bithash by pulling the first 8 characters out of a sha1 hash of the matrix
closes #2828