> Non-temporal instructions don't have anything to do with correctness. They are for cache management; a non-temporal write is a hint to the cache system that you don't expect to read this data (well, address) back soon
I disagree with this statement (taken at face value, I don't necessarily agree with the wording in the OP either). Non-temporal instructions are unordered with respect to normal memory operations, so without a _mm_sfence() after doing your non-temporal writes you're going to get nasty hardware UB.
That is something I can agree with, but I can't in good faith just let "it's just a hint, they don't have anything to do with correctness" stand unchallenged.
You mean if you access it from a different core? I believe that within the same core, you still have the normal ordering, but indeed, non-temporal writes don't have an implicit write fence after them like x86 stores normally do.
In any case, if so they are potentially _less_ correct; they never help you.
Do you have any Intel references for it? I mean, Rust has its own memory model and it will not always give the same guarantees as when writing assembler.
“Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation implemented with
the SFENCE or MFENCE instruction should be used in conjunction with VMOVNTDQ instructions if multiple processors might use different memory types to read/write the destination memory locations”
I disagree with this statement (taken at face value, I don't necessarily agree with the wording in the OP either). Non-temporal instructions are unordered with respect to normal memory operations, so without a _mm_sfence() after doing your non-temporal writes you're going to get nasty hardware UB.