Hacker News

A couple of days ago, inspired by Simon and those discussions, I had Claude create 30 such tests. I posted a Show HN with the results from six models, but it didn’t get any traction. Here it is again:

https://news.ycombinator.com/item?id=45845717

https://gally.net/temp/20251107pelican-alternatives/index.ht...



Oh man, that’s hilarious. I dunno what Qwen is doing most of the time. Gemini seems to be either a perfect win or complete nonsense. Claude seems to lean towards “good enough”.



