I also question the logical premise offered here that data structures must be thread-safe. It's not something guaranteed by most languages (eg. Java), so why should Python guarantee it?
"All data structures would have to be thread-safe" is a pretty common argument used against native threads (as opposed to green threads) or against removing global locks in various VMs. While it's true that they don't have to be thread-safe from the user's point of view, they should be thread-safe in the sense that they cannot be corrupted by a race condition in user code in a way that could crash (or deadlock) the VM.
Every Python object has a backing dictionary, so if any object has to be shared between threads in a thread-safe fashion, then dictionaries need to be thread-safe.
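To make that concrete, here's a small sketch showing that ordinary attribute assignment is just a mutation of the object's `__dict__`, so threads setting attributes on a shared object are really mutating a shared dictionary (the `Point` class and attribute names are illustrative, not from the thread):

```python
import threading

class Point:
    pass

p = Point()
p.x = 1
# Attribute storage is just a dict: setting p.x mutated p.__dict__.
assert p.__dict__ == {"x": 1}

# Threads setting attributes on the same object are therefore
# performing concurrent mutations of the same backing dictionary.
def writer(name, value):
    setattr(p, name, value)

threads = [threading.Thread(target=writer, args=(f"a{i}", i)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert all(getattr(p, f"a{i}") == i for i in range(8))
```

In CPython today this is safe only because the GIL serializes those dict mutations, which is exactly why a GIL-free VM needs the backing dictionary itself to resist corruption.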
It's perfectly possible and even reasonable to call __hash__ before acquiring a lock on the dictionary (and to cache keys' hashes for data already in the dictionary).
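A minimal sketch of that idea, assuming a lock-per-dictionary design (the `LockedDict` class and its bucket layout are hypothetical, not any real CPython internals): the user-defined `__hash__` runs before the lock is taken, and each key's hash is cached in the bucket so lookups never recompute it under the lock.

```python
import threading

class LockedDict:
    """Hypothetical sketch: hash keys outside the lock, cache the hashes."""

    def __init__(self):
        self._lock = threading.Lock()
        self._buckets = {}  # cached hash -> list of (key, value) pairs

    def __setitem__(self, key, value):
        h = hash(key)  # user-defined __hash__ runs outside the lock
        with self._lock:
            bucket = self._buckets.setdefault(h, [])
            for i, (k, _) in enumerate(bucket):
                if k == key:  # __eq__ still runs under the lock here
                    bucket[i] = (key, value)
                    return
            bucket.append((key, value))

    def __getitem__(self, key):
        h = hash(key)  # again, computed before acquiring the lock
        with self._lock:
            for k, v in self._buckets.get(h, ()):
                if k == key:
                    return v
        raise KeyError(key)

d = LockedDict()
d["spam"] = 1
d["spam"] = 2
```

Note the remaining wrinkle: `__eq__` on keys still runs under the lock, so arbitrary user code can execute while the lock is held, which is its own hazard.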
On the other hand, when almost every object is backed by a dictionary, the locking has to be very fast, which seems to me like an almost unsolvable problem.
Right. I think the backing dictionary could probably be replaced with a more thread-friendly implementation, even if we have to offer reduced guarantees.
That's an argument for revamping the backing dictionary, really. If __hash__() is a problem, have access go through a thread-safe proxy that delegates to the object's dictionary.
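The proxy idea might look something like this minimal sketch (the `DictProxy` class is hypothetical, purely to illustrate the delegation): the object's real backing dict stays untouched, and the proxy serializes every access through one lock.

```python
import threading

class DictProxy:
    """Hypothetical sketch: serialize access to an object's backing dict."""

    def __init__(self, backing):
        self._lock = threading.Lock()
        self._backing = backing  # the object's real __dict__

    def __getitem__(self, key):
        with self._lock:
            return self._backing[key]

    def __setitem__(self, key, value):
        with self._lock:
            self._backing[key] = value

    def __delitem__(self, key):
        with self._lock:
            del self._backing[key]

obj_dict = {}
proxy = DictProxy(obj_dict)
proxy["x"] = 1  # writes land in the real backing dict
```

This sidesteps nothing about the per-access locking cost raised above, but it does keep the corruption-prevention logic in one place instead of inside every dict operation.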