> Surely GPT-4 would fail some exams. But when it comes to GPT-4's exam performance, only the positive results are reported.
The default is failing the exams. I'd be no less impressed if they came right out and said "This is a short list of the only exams it passes" simply because (IMO) it's remarkable that a machine could pass any of those exams in the first place. Just a couple years ago, it would have been outlandish for a machine to even have a double digit score (at best!).
If we've already found ourselves in a position where passing grades on some exams that qualify people for their careers is unremarkable, I'll honestly be a bit disappointed. 99th percentile on the GRE Verbal would make an NLP researcher from 2010 have a damn aneurysm; if we're now saying that's "not reasoning" then we're surely moving the goalposts for what that means.
The default is failing the exams. I'd be no less impressed if they came right out and said "This is a short list of the only exams it passes" simply because (IMO) it's remarkable that a machine could pass any of those exams in the first place. Just a couple years ago, it would have been outlandish for a machine to even have a double digit score (at best!).
If we've already found ourselves in a position where passing grades on some exams that qualify people for their careers is unremarkable, I'll honestly be a bit disappointed. 99th percentile on the GRE Verbal would make an NLP researcher from 2010 have a damn aneurysm; if we're now saying that's "not reasoning" then we're surely moving the goalposts for what that means.