

rednafi 11 months ago | on: Humanity's Last Exam

Such a dramatic name for such a boring set of tests. We need to test whether it can come up with a Nobel Prize-winning scientific breakthrough, a Booker/Pulitzer-worthy novel, Ken Thompson-level code that solves a real problem, or a proof for Fermat’s Last Theorem.

egillie 11 months ago

This makes me wonder if you could train an LLM without any references to Wiles’ work and see if it can prove Fermat’s Last Theorem.

vitiral 11 months ago

None of those are easily verifiable.

rednafi 11 months ago

Then they probably shouldn't have called it "Humanity's last exam." Kinda lame, if you think about it.
