This might be a wild conspiracy theory, but what if OpenAI has discovered a way to make these LLMs a lot cheaper than they used to be? Transformer hype started with the invention of self-attention - perhaps they have found something that beats it as thoroughly as GPTs beat Markov chains?
They cannot disclose anything, since it would become apparent that GPT-4 cannot have a parameter count that low, or that the gradients would have vanished in a network that deep, and so on.
They obviously don't want any competition, but consider their recent write-up on "mitigating disinformation risks", where they propose banning non-governmental consumers from having GPUs at all (as if a regular Joe could just run 100'000 A100s in his garage) - perhaps the real lower bound for inference and training is a lot lower than we have thought and assumed?
Or you could use CQRS projectors (read models), which solve exactly this - they aggregate data from lots of different eventually consistent sources, giving you a locally consistent view of only the events you might be interested in.
It will lag behind by some amount, roughly equal to the processing delay plus twice the network delay, but it can include arbitrary things that are part of your event model.
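To make that a bit more concrete, here is a minimal sketch of such a projector in TypeScript; the event types, field names, and the OrderSummaryProjector class are all made up for illustration, just to show the shape of the thing:

    // A minimal CQRS projector sketch; all event and type names are hypothetical.
    // It folds a stream of domain events into a local, query-friendly read model.

    type DomainEvent =
      | { type: "OrderPlaced"; orderId: string; customerId: string; total: number }
      | { type: "OrderShipped"; orderId: string; shippedAt: string }
      | { type: "CustomerRenamed"; customerId: string; name: string };

    // The read model: a denormalized view that answers one question cheaply,
    // e.g. "which orders does this customer have, and did they ship yet?"
    interface OrderSummary {
      orderId: string;
      customerId: string;
      total: number;
      shipped: boolean;
    }

    class OrderSummaryProjector {
      private byOrder = new Map<string, OrderSummary>();
      // Offset of the last event applied: lets the projector resume after a
      // restart and tells consumers how far behind the event log the view is.
      private lastOffset = -1;

      get lastAppliedOffset(): number {
        return this.lastOffset;
      }

      // Apply one event; event types this view doesn't care about are ignored,
      // which is what lets one projector listen to many different sources.
      apply(event: DomainEvent, offset: number): void {
        switch (event.type) {
          case "OrderPlaced":
            this.byOrder.set(event.orderId, {
              orderId: event.orderId,
              customerId: event.customerId,
              total: event.total,
              shipped: false,
            });
            break;
          case "OrderShipped": {
            const existing = this.byOrder.get(event.orderId);
            if (existing) existing.shipped = true;
            break;
          }
          // CustomerRenamed is irrelevant to this particular view.
        }
        this.lastOffset = offset;
      }

      // A locally consistent query over whatever has been projected so far.
      ordersFor(customerId: string): OrderSummary[] {
        return Array.from(this.byOrder.values()).filter(
          (o) => o.customerId === customerId
        );
      }
    }

    // Feed it events as they arrive from the (eventually consistent) log:
    const projector = new OrderSummaryProjector();
    projector.apply({ type: "OrderPlaced", orderId: "o1", customerId: "c1", total: 42 }, 0);
    projector.apply({ type: "OrderShipped", orderId: "o1", shippedAt: "2024-03-01" }, 1);
    console.log(projector.ordersFor("c1"));   // [{ orderId: "o1", ..., shipped: true }]
    console.log(projector.lastAppliedOffset); // 1 - how far the view has caught up

The view answers its one query cheaply and locally, and the offset it tracks is exactly the "how stale am I" number mentioned above.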
That said, it's not a silver bullet (distributed constraints are still a pain in the ass), and if the system wasn't designed as a DDD/CQRS system from the ground up, it is hard to migrate, especially because you can't take small steps toward it.