
`model.safetensors` for Qwen3-0.6B is a single 1.5GB file.

Qwen3-235B-A22B has 118 `.safetensors` files at 4GB each.

There are a bunch of models and quants between those.



Does it run on 8x80GB? Or do the KV cache and other buffers push it over the edge?
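If the question is about the 235B checkpoint quoted above, a rough back-of-the-envelope check might look like the sketch below. It uses only the numbers from this thread (118 shards at roughly 4GB each, eight 80GB GPUs) and assumes the weights are loaded at their on-disk precision. The KV cache parameters (`num_layers`, `num_kv_heads`, `head_dim`, `tokens`) are placeholder values made up for illustration, not the model's real config; the real ones would come from the model's `config.json`.

```python
# Rough memory check. Figures for shards and GPUs come from the thread;
# everything passed to kv_cache_gb() below is a hypothetical placeholder.

def weights_gb(num_shards: int, gb_per_shard: float) -> float:
    """Total checkpoint size, assuming every shard is about the same size."""
    return num_shards * gb_per_shard


def kv_cache_gb(num_layers: int, num_kv_heads: int, head_dim: int,
                tokens: int, bytes_per_elem: int = 2) -> float:
    """Standard KV cache estimate: keys + values for every layer and token.

    bytes_per_elem=2 assumes a 16-bit cache (fp16/bf16).
    """
    total_bytes = 2 * num_layers * num_kv_heads * head_dim * tokens * bytes_per_elem
    return total_bytes / 1e9


weights = weights_gb(num_shards=118, gb_per_shard=4.0)
capacity = 8 * 80                 # nominal; usable memory per GPU is lower
headroom = capacity - weights     # what is left for KV cache, activations, buffers

# Placeholder architecture values, for illustration only.
kv = kv_cache_gb(num_layers=96, num_kv_heads=8, head_dim=128, tokens=32768)

print(f"weights  ~{weights:.0f} GB")
print(f"capacity ~{capacity} GB")
print(f"headroom ~{headroom:.0f} GB")
print(f"KV cache for one 32k-token sequence (placeholder config) ~{kv:.1f} GB")
```

With the quoted figures, the weights alone come to roughly 472GB against 640GB nominal, so whether it fits depends on how much of the remaining ~168GB the KV cache (which scales with batch size and context length), activations, and framework overhead consume.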



