Tried an English-to-Greek translation with the smaller one. Results were hideous. Mistral Small is leaps and bounds better. Also, I don't get why they ship 4-bit quantization by default. In my experience, anything below 8-bit and the model fails to follow long prompts. They gutted their own models.
They used quantization-aware training, so the quality loss should be negligible. Doing anything with this model's weights would be a different story, though.
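Roughly, QAT means the model is fine-tuned while simulating the 4-bit rounding in its own forward pass, so it learns to compensate for the quantization error before release. A minimal sketch of the core fake-quantization trick (illustrative only, not their actual recipe):

```python
import torch

def fake_quantize(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    # Round weights to a signed low-bit grid in the forward pass.
    qmax = 2 ** (bits - 1) - 1                    # e.g. 7 for signed 4-bit
    scale = w.abs().max() / qmax                  # per-tensor scale, simplest choice
    w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    # Straight-through estimator: forward uses the quantized weights,
    # backward treats the rounding as identity so gradients still flow.
    return w + (w_q - w).detach()
```

Because training "sees" the quantization error, the released 4-bit checkpoint loses far less quality than quantizing a full-precision model after the fact.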
The model is clearly heavily fine-tuned towards coding and math, and is borderline unusable for creative writing and translation in particular. It's not general-purpose, it's excessively filtered (refusal training and dataset lobotomy are probably major factors behind the lower-than-expected performance), and it shouldn't be compared with Qwen or o3 at all.