We're not gonna see significant model shrinkage until the money tap dries up. Be...

lanthissa · 2025-12-22T20:26:56 1766435216

isn't gemini 3 flash already model shrinkage that does well in coding?

skippyboxedhero · 2025-12-23T02:29:26 1766456966

Xiaomi, Nvidia Nemotron, Minimax, lots of other smaller ones too. There are massive economic incentives to shrink models because they can be provided faster and at lower cost.

I think even with the money going in, there has to be some revenue supporting that development somewhere. And users are now looking at the cost. I have been using Anthropic Max for most of this year after checking out some of these other models, it is clearly overpriced (I would also say their moat of Claude Code has been breached). And Anthropic's API pricing is completely crazy when you use some of the paradigms that they suggest (agents/commands/etc) i.e. token usage is going up so efficient models are driving growth.

hedgehog · 2025-12-22T20:29:45 1766435385

Smaller open-weights models are also improving noticeably (like Qwen3 Coder 30B), the improvements are happening at all sizes.

cmrdporcupine · 2025-12-22T20:34:51 1766435691

Devstral Small 24b looks promising as something I want to try fine tuning on DSLs, etc. and then embedding in tooling.

hedgehog · 2025-12-22T23:43:04 1766446984

I haven't tried it yet, but yes. Qwen3 Next 80B works decently in my testing, and fast. I had mixed results with the new Nemotron, but it and the new Qwen models are both very fast to run.

mark_l_watson · 2025-12-23T12:51:18 1766494278

Same experience: on my old M2 Mac with just 32B of memory both Qwen 3 30B and the new Nemotron models are very useful for coding if I prepare a one-shot prompt with directions and relevant code. I don’t like them for agentic coding tools. I have mentioned this elsewhere: it is deeply satisfying to mix local model use with commercial APIs and services.

Imustaskforhelp · 2025-12-22T21:28:32 1766438912

How much billion parameter model is gemini 3 flash, I can't seem to find info about it online.

naasking · 2025-12-23T14:15:53 1766499353

> We're not gonna see significant model shrinkage until the money tap dries up.

I'm not sure about that. Microsoft has been doing great work on "1-bit" LLMs, and dropping the memory requirements would significantly cut down on operating costs for the frontier players.