It's not supposed to spit out truth. Meta may have overhyped it, but they were careful not to claim that it produces true statements, only that it compiles knowledge, and they included a disclaimer that it hallucinates facts. Anyone who has used language models knows that if you ask one about space bears, it won't reply that there are no space bears; instead it will try to construct a link between space and bears. It seems to me that people were deliberately feeding it impossible inputs so they could claim the model is dangerous for doing what a language model does. And their ultimate purpose was to defame the model and take it down. (And BTW, we've heard those "dangerousness" BS claims before.)
The usefulness of the model was in under-researched fields and questions, where it can provide actually useful directions.
My whole point is that said directions are completely useless if you can't trust the system to be grounded in factual principles.
It's easy to dismiss "bears in space" as not factual. It's harder to dismiss "the links between two under-researched fields in biology" without putting in an exorbitant amount of work. Work that is likely going to be wasted if the model is just as happy to spit out "bears in space".
And that was what I had already said. I asked you how you could possibly trust the links it provided. Because you can't. They may very well be novel links that have never been researched. Or they could be bears in space.
A scientific model which doesn't guarantee some degree of factuality in its responses is completely useless as a scientific model.
They may have had a disclaimer, but they also presented it as a tool to help one summarize research. It was clearly unreliable there. And who cares if it has some good results? How do you know whether it's spitting nonsense* or accurate information?
Yes, I do think there is value in engaging with the community for models that aren't perfect. But it needs more work before it can be framed as a useful tool for scientific research.
* I mean subtle nonsense. Space bears are easy to spot, but what if the errors aren't as "crazy"?