
>There are on the order of 100 million papers [reference 2] published to date.

Does anyone else feel as if this (admittedly rough) estimate is off by an order of magnitude?



OpenAlex has 240M. https://docs.openalex.org/api-entities/works

CORE has 431M. https://core.ac.uk/data

Crossref has 165M. https://www.crossref.org/blog/2025-public-data-file-now-avai...

These datasets are all biased towards work published in the digital age, but it's important to note that work is coming out much faster now than it used to.


So indeed, order 10^9 rather than 10^8: CORE's 431M sits above sqrt(10)*10^8 ≈ 3.16*10^8, the geometric midpoint between the two, so it rounds up to 10^9 on a log scale.
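For what it's worth, a quick sketch of that rounding, taking CORE's figure as the single input (an assumption; the other datasets overlap heavily with it):

    import math

    core_works = 431_000_000  # CORE's reported count, treated here as the best single estimate
    nearest_order = round(math.log10(core_works))  # rounds on a log scale; cutoff is sqrt(10)*10^8
    print(f"~10^{nearest_order} papers")  # ~10^9 papers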


Is that because there is pressure to publish? I wouldn't say we've made advancements at a meaningfully different rate over the last two decades than over the 20 years prior to that.


If 1% of the last 10 billion people to live were academics and published on average 5 papers (many only had one, i.e. their dissertation/thesis, but a small fraction will have had dozens or hundreds), that comes to 500 million.
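A one-liner to check that arithmetic (the 1% share and the 5 papers per academic are the assumptions above, not sourced figures):

    people = 10_000_000_000   # roughly the last 10 billion people to have lived
    academic_fraction = 0.01  # assumed share who were academics
    papers_each = 5           # assumed average papers per academic
    print(int(people * academic_fraction * papers_each))  # 500000000, i.e. 500 million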

I'm curious, do you think it's an order of magnitude too low or too high?


I think it's too low.


MEDLINE (health / life science) has 37M papers.

IIRC the rate of publishing has been growing superlinearly, so the cumulative number of publications grows faster than quadratically (a linearly growing yearly rate would only give a quadratic cumulative total).
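A minimal sketch of that point, with an assumed (not sourced) growth rate: a yearly rate that grows linearly gives a roughly quadratic cumulative total, while an exponentially growing rate pulls well ahead of it over time:

    # 4%/year growth is purely illustrative.
    def cumulative(rate, years):
        return sum(rate(t) for t in range(years))

    linear_rate = lambda t: 1 + 0.04 * t  # rate grows linearly -> cumulative ~ quadratic
    exp_rate = lambda t: 1.04 ** t        # rate grows exponentially (superlinearly)

    for years in (50, 100, 200):
        print(years, round(cumulative(linear_rate, years)), round(cumulative(exp_rate, years)))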



