Yeah, I've had similar results. Even with GPT-o1, I find that almost all errors at this point come from the web search functionality and the model treating some random source as authoritative. Interestingly, my human intelligence in the process is most useful for hand-collecting the sources and data to analyze -- and, of course, for directing the process across multiple LLM queries.
