Are you sure the translators don't insert code necessary to maintain ordering? I would be shocked if most threaded code works when you throw out the x86 memory model. Managed runtimes like .NET definitely generate code for each target designed to maintain the correct memory model.
> You can also select multi-core settings, as shown here... These settings change the number of memory barriers used to synchronize memory accesses between cores in apps during emulation. Fast is the default mode, but the strict and very strict options will increase the number of barriers. This slows down the app, but reduces the risk of app errors. The single-core option removes all barriers but forces all app threads to run on a single core.
zamadatix interprets this as Microsoft saying that, by default, Windows on ARM runs x86 apps without emulating x86 TSO, and only turns on extra memory barriers via per-app compatibility settings. So if an app relies on TSO but isn't in Windows's compatibility database, it can crash or silently corrupt data.
Sure, but Windows on ARM has to run on many ARM processors, not one specific chip designed by MS. They could detect whether the processor has non-standard TSO support and use it when running x86 apps, but they still have to do something for x86 apps on a standard ARM processor.
An emulator would have zero issues; if it's a direct translation of the assembly (not an emulator), it'd need either hardware support - e.g. Apple's chips - or memory barriers.
The differences between ARM and x86 have been known for 15+ years; there is nothing new about this. Also, concurrency support is one of the major benefits of languages with a proper memory model - Java started it with the JMM [0].
The differences are very well known; it's just still an open problem how to run code built for a stronger memory model on systems with a weaker one at the highest possible performance without explicit hardware support (like Apple's choice of a TSO config bit). A binary compiled for TSO has erased any LoadLoad, LoadStore, and StoreStore barriers, and the emulator has to divine them. The heuristics there are still fraught with peril.
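To make the "erased barriers" point concrete, here is a minimal C++ sketch (my own illustration, not the output of any real emulator) of the classic message-passing idiom. Compiled for x86 it is just plain loads and stores, which TSO already keeps in order; the same sequence translated to ARM needs StoreStore/LoadLoad ordering that is no longer visible anywhere in the binary:

    #include <atomic>
    #include <cassert>
    #include <thread>

    // Shared state, written the way an x86 compiler would emit it:
    // plain (relaxed) stores and loads, because TSO already forbids the
    // StoreStore and LoadLoad reorderings that would break this idiom.
    std::atomic<int>  payload{0};
    std::atomic<bool> ready{false};

    void producer() {
        payload.store(42, std::memory_order_relaxed);  // plain x86 mov
        // TSO keeps these two stores in order; on ARM a StoreStore
        // barrier (dmb ishst) or a store-release would be needed here.
        ready.store(true, std::memory_order_relaxed);  // plain x86 mov
    }

    void consumer() {
        while (!ready.load(std::memory_order_relaxed)) {}  // plain x86 mov
        // TSO keeps these two loads in order; on ARM a LoadLoad barrier
        // (dmb ishld) or a load-acquire would be needed in between.
        int v = payload.load(std::memory_order_relaxed);   // plain x86 mov
        assert(v == 42);  // may fail on a weakly ordered target
    }

    int main() {
        std::thread t1(producer), t2(consumer);
        t1.join();
        t2.join();
    }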
The JVM absolutely did some great work walking this path, both in defining a memory model in the first place and in supporting that model on weak and strong hardware memory models. But the JMM was specifically designed to run cleanly on WMO platforms to begin with (early SPARC), so it doesn't face a lot of the same problems discussed here.
Any emulator that wants to be remotely performance-competitive will do dynamic translation (i.e. a JIT). In fact, ahead-of-time translation is not really feasible.
Memory models and the JVM are not really relevant when discussing how to run binaries built for a different architecture.
The memory models are relevant, as the translation/JIT/whatever has to take them into consideration. TSO is well known, and it's also well known how ARM's weaker memory model needs memory barriers to emulate TSO.
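For reference, the well-known conservative mapping looks roughly like this (a sketch of the general technique, not how any particular translator is implemented): treat every guest load as an acquire and every guest store as a release, which on ARMv8 becomes ldar/stlr and recovers TSO-equivalent ordering at the cost of extra stalls on hot paths:

    #include <atomic>

    std::atomic<int>  payload{0};
    std::atomic<bool> ready{false};

    // Conservative x86-to-ARM translation of a message-passing pair:
    // every store becomes a store-release and every load a load-acquire.
    // Store->load reordering is still allowed, which matches what TSO
    // itself permits.
    void producer_translated() {
        payload.store(42, std::memory_order_release);   // stlr
        ready.store(true, std::memory_order_release);   // stlr
    }

    int consumer_translated() {
        while (!ready.load(std::memory_order_acquire)) {}  // ldar
        return payload.load(std::memory_order_acquire);    // ldar
    }

A hardware TSO mode like Apple's exists precisely so the translator doesn't have to pay this cost on every memory access.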
If there is a JIT, I'd expect it to be able to add read barriers on memory locations allocated by another thread - including allocating bits in the pointers and masking them off on each dereference. If any block appears to be shared, the code that allocated it would need to be recompiled with StoreStore barriers; the reading side would need LoadLoad barriers, and so on. There are quite a few ways to deal with this case, aside from the obvious one - make the hardware compatible.
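A minimal sketch of that pointer-tagging idea (all names here, like kSharedBit and jit_load_u32, are hypothetical; it assumes allocations are at least 4-byte aligned so the low pointer bit is free to use as a flag):

    #include <atomic>
    #include <cstdint>

    // Low pointer bit marks "this allocation may be visible to another
    // thread", so the JIT only pays for an acquire load on tagged pointers.
    constexpr std::uintptr_t kSharedBit = 0x1;

    inline void* tag_shared(void* p) {
        return reinterpret_cast<void*>(
            reinterpret_cast<std::uintptr_t>(p) | kSharedBit);
    }

    inline bool is_shared(void* p) {
        return reinterpret_cast<std::uintptr_t>(p) & kSharedBit;
    }

    inline void* strip_tag(void* p) {
        return reinterpret_cast<void*>(
            reinterpret_cast<std::uintptr_t>(p) & ~kSharedBit);
    }

    // What a JIT-ed read of a 32-bit field might look like: a relaxed load
    // for (presumed) thread-local data, a load-acquire (LoadLoad/LoadStore
    // ordering) once the allocation has been flagged as shared.
    // C++20 std::atomic_ref performs an atomic access on plain storage.
    inline std::uint32_t jit_load_u32(void* tagged_ptr) {
        auto* slot = static_cast<std::uint32_t*>(strip_tag(tagged_ptr));
        std::atomic_ref<std::uint32_t> ref(*slot);
        return is_shared(tagged_ptr)
            ? ref.load(std::memory_order_acquire)
            : ref.load(std::memory_order_relaxed);
    }

The write side would be the mirror image: once an allocation gets tagged as shared, the code that writes to it is recompiled to use store-release (StoreStore ordering).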
Even if, in the end, it's not an easy feat to make up for the stronger memory model's guarantees, correctness should still be a prime goal of an 'emulator'.
It is relevant as an example of how to write a JIT for one memory model that will run on a different architecture with a different memory model. In other words, it's a known issue that has been dealt with successfully for quite a while now.
Ah well, back in the day, prior to the JMM, there was no notion of memory models at all, and no language had one; hence I referenced the original paper that started it all. The point was that this happened a long time ago, and there is nothing new about the current case.