Hmm, but supposing the accelerated NVIDIA-specific inference data types were ava...

qeternity · 2025-08-23T16:25:30 1755966330

Second line of the post:

> The main objective is to learn writing attention in CUDA C++, since many features are not available in Triton, such as MXFP8 / NVFP4 MMA for sm120.

doctorpangloss · 2025-08-23T23:13:37 1755990817

Yes… I read it. If the feature is missing, why not contribute it instead?

almostgotcaught · 2025-08-23T23:16:08 1755990968

How many PRs have you landed in Triton that you can just blithely say "contribute it"?

saagarjha · 2025-08-24T02:35:47 1756002947

I mean, you can look at the most recent commit and see that the infrastructure is being built out for this right now (of course OpenAI doesn't care about sm_120, though).

almostgotcaught · 2025-08-24T02:42:38 1756003358

I don't know what this comment has to do with my point that OAI doesn't take commits from randoms, especially for infra code.

doctorpangloss · 2025-08-24T03:11:41 1756005101

By all means, the guy could have written the Triton fixes he needs and NOT sent them upstream. It would still make more sense to do that! He’s obviously an expert, and I was sincerely wondering: why bother with the C++ stuff if he already knew the better way and also has the chops to implement it?

almostgotcaught · 2025-08-24T04:00:02 1756008002

There's an enormous difference between writing kernels and writing compiler infra.

saagarjha · 2025-08-24T03:03:52 1756004632

Yeah, they do.