I'm getting ~30 tok/s on the A3B model with my 3070 Ti and 32k context. > Do you... | Hacker News


bigyabai 9 days ago | on: Something is afoot in the land of Qwen

I'm getting ~30 tok/s on the A3B model with my 3070 Ti and 32k context.

> Do you feel you could replace the frontier models with it for everyday coding? Would/will you?

Probably not yet, but it's really good at composing shell commands. For scripting or one-liner generation, the A3B is really good. The web development skills are markedly better than Qwen's prior models in this parameter range, too.


jasonjmcghee 8 days ago

That seems oddly low, slower by a fair amount than I get on my M4. (I believe it was ~45 tok/s?)

What quant are you using? How much RAM does it have?
