What I really wanted to know is whether OpenAI (and other labs, for that matter) actually use their own products — not just casually, but making LLMs a core part of how they operate. For example: using LLMs for production coding, training/fine-tuning internal models to stay aligned on the latest updates, finding answers, etc. Do they put their money where their mouth is — do LLMs help with their productivity? There is no mention of it in the article, so I guess they don't?
I don’t know, but I’d guess they are using them heavily, though in a piecemeal fashion.
As impressive as LLMs can be at one-shotting certain kinds of tasks, working in a sprawling production codebase like the one described — with tight performance constraints, subtle interdependencies, cross-cutting architectural concerns, etc. — still requires a human driving most of the time. LLMs help a lot with this kind of work, but the human is either carefully assimilating their output or carefully choosing spots where (with detailed prompts) they can generate usable code directly.
Again, just a guess, but this is my impression of how experienced engineers (including myself) are using LLMs in big/nontrivial codebases, and I’ve seen no indication that engineering processes at the labs are much different from the wider industry.