Hacker News | new | past | comments | ask | show | jobs | submit | cyclecycle's comments

Great to see this picked up.

Nick Morley from Grounded AI here (https://groundedai.company)

We ran the analysis in collaboration with Nature :)


Nick Morley from Grounded AI here (https://groundedai.company)

We collaborated with Nature to study the extent of fake and "Frankenstein" citations in the scholarly literature, drawn from the top five publishers (Springer, Elsevier, Wiley, Sage, Taylor & Francis).

We estimate that hundreds of thousands of papers published in 2025 are affected by hallucinated citations.

As part of the work we analysed 20k papers generated with the ChatGPT API to figure out which citation errors are characteristic of gen AI use, and used that to classify the errors we saw in the wild.
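To make the idea concrete, here's a toy sketch of that kind of classification, with a citation checked against an index of known works and bucketed by error type. The categories, matching rules, and data shapes are all illustrative assumptions, not our actual pipeline:

```python
# Hypothetical sketch: bucket a citation as valid, not found (likely
# fabricated), or "Frankenstein" (a real DOI stitched to the wrong title).
# The field names and categories are illustrative, not a real pipeline.

def classify_citation(citation, known_works):
    """Return 'valid', 'not_found', or 'frankenstein' for one citation.

    citation: dict with 'doi' and 'title' keys.
    known_works: dict mapping DOI -> {'title': ...} for works that exist.
    """
    record = known_works.get(citation["doi"])
    if record is None:
        # DOI resolves to nothing we know about: likely fabricated outright.
        return "not_found"
    if record["title"].strip().lower() != citation["title"].strip().lower():
        # Real identifier, wrong bibliographic details: a stitched-together cite.
        return "frankenstein"
    return "valid"

known = {"10.1000/real1": {"title": "A Real Paper"}}

print(classify_citation({"doi": "10.1000/real1", "title": "A Real Paper"}, known))
print(classify_citation({"doi": "10.1000/real1", "title": "Ghost Title"}, known))
print(classify_citation({"doi": "10.1000/fake", "title": "Ghost Paper"}, known))
```

In practice the "known works" lookup would be a metadata service rather than an in-memory dict, and real matching has to be fuzzier than exact title comparison, but the error taxonomy works the same way.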

The world's gone mad, publishing is in a nuts state, the training data is poisoned!


Yeah, this makes sense. We run a citation verification service and give publishers data along the lines of "hey, this citation could be fake", but we don't currently capture any "action" or "measured result", so I guess that's what we need to expand into next.


Trying to think about our best chance of making a lasting contribution (therefore actually surviving?) as a startup.

I'm a big fan of focusing on verification tools ("Verification is all you need"?) but who knows.

I'm sure I'm missing something in my perspective. Here to expand my aperture.


A classic case.

I work on Veracity (https://groundedai.company/veracity/), which does citation checking for academic publishers. I see stuff like this all the time in paper submissions. Publishers are inundated.


Don’t publishers ban authors who attempt such shenanigans?


That's basically what we're doing with app.studyrecon.ai.

What we've found is that vector similarity is often not the final solution. It is still only a crude proxy for the true goal of 'informativeness' or 'usefulness' in relation to the user's goal/query. It works okay, but we're definitely seeing a need for more rigorous LLM postprocessing to enrich the result set.

Which, yes, the time adds up quick!


Here's my attempt at a simple explanation of transformers. I would love feedback on whether I've got it right and how I could improve it. Cheers


Some personal reflections on the direction of my work.


We're working on this at Grounded AI (https://www.groundedai.company/contact-us). We'd love to help you if we can. Feel free to contact me; my email is on my profile page.


I agree, it's currently very superficial. It would be great to recognise and show more about the specific nature of the relationships.

What kind of thing would you hope to see here? A textual summary of the relationship? Or perhaps there is more that can be done with shapes and colours in this area?


This is a rich topic and a great question, one to which we should all try to offer answers.

In my experience, working with topic maps (extremely rich knowledge graphs, standardized as ISO 13250) and with NLP entails a lot more detail than just talking about relations: relations are first-class citizens (subjects) in the graph, not just labeled arcs. I spoke about them here [1]

[1] https://www.slideshare.net/jackpark/lbd-tm2

