
But on the Ryzen, the VRAM allocation can be entirely dynamic. I saw a review showing excellent full GPU usage during inference on a very large model, with the BIOS VRAM allocation set to the minimum level. So it's not as simple as you describe (I used to think this was the case too).

Beyond that, it seems the 395 in practice smashes the DGX Spark in inference speed for most models. I haven't seen NVFP4 comparisons yet and would be very interested to see them.



Yes, you can set it, but only in the BIOS, not dynamically as you need it.

I don't think there are any models supporting NVFP4 yet, but we will probably start seeing them.


That's what I'm saying: in the review video I saw, they allocated as little memory as possible to the GPU in the BIOS, then used some kind of kernel-level dynamic control.



