
The base rate argument seems like specious reasoning. For example, if you had a volcano that erupts roughly every 100 years, base rate reasoning using the past 99 years of data would suggest that the probability is 0 and using 1000 years of data would suggest it's ~10%, when in reality your base rate in the year following an eruption is 0, and with every passing year your probability of an eruption would increase & increase past 10% for every year past 100 that goes without explosion. Same goes for something like war where pressures build up and war becomes more likely rather than less. So getting judged that you're better at predicting by giving low probabilities for rare events isn't that insightful because you'd be outperformed by someone who predicts a black swan event because the magnitude of the event matters.

> The prediction got some press attention and earned rejoinders from nuclear experts like Peter Scoblic, who argued it significantly understated the risk of a nuclear exchange. It was a big moment for the group — but also an example of a prediction that’s very, very difficult to get right. The further you’re straying from the ordinary course of history (and a nuclear bomb going off in London would be straying very far), the harder this is.

Yup, the group got it right but predicting a rare event doesn't happen isn't that difficult, it's just notable because everyone was overly freaked out, particularly in the media due to self-repeated sensationalism. Peter Scoblic is correct that the risk is significantly understated because it's not correctly adjusting for the impact of the black swan event happening (e.g. if a nuclear explosion were to occur, you'd expect nuclear retaliations).



> base rate reasoning using the past 99 years of data would suggest that the probability is 0

Looking at edge cases is good for sanity checking, so it's a good habit, and I commend you.

In your example, though, we can also consider the base rate of an event which hasn't happened in 99 years as 1/101, per Laplace's rule of succession. https://en.m.wikipedia.org/wiki/Rule_of_succession
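As a minimal sketch of that arithmetic (the function names are made up for illustration, and treating each year as one independent trial is an assumption), the rule of succession estimates the next-trial probability as (s + 1) / (n + 2):

```python
from fractions import Fraction


def naive_rate(successes: int, trials: int) -> Fraction:
    """Plain frequency estimate: observed events divided by trials."""
    return Fraction(successes, trials)


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Laplace's estimate (s + 1) / (n + 2) for the next trial."""
    if trials < 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)


# 99 quiet years, zero eruptions observed.
print(naive_rate(0, 99))          # 0
print(rule_of_succession(0, 99))  # 1/101

# With 1000 years of data and 10 eruptions the two estimates nearly agree.
print(float(naive_rate(10, 1000)))
print(float(rule_of_succession(10, 1000)))
```

With a long record the correction barely matters; it only bites when the data contain few or no events.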


This is a great application of the Bayesian approach.


I think the article focused on base rates because they're a relatively unusual and legible "trick" for coming up with a forecast, but really they're only one element of a forecast; typically a forecaster will think about many different ways to "attack" a question and synthesize them (somehow!). The choice of denominator for your base rate is also very important and can radically change the answer you get.

The sites which host these forecasting competitions correct for the bias against rare events through what's called "proper scoring" rules -- there's some specific maths to it, but the short version is that you're exponentially rewarded for being a correct contrarian and exponentially punished for being confidently wrong.
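To make "proper scoring" concrete, here is a rough sketch using two well-known rules, the Brier score and the logarithmic score. Which rule a given site actually uses is not stated here, so treat the choice, the function names, and the 10% example probability as assumptions. A rule is "proper" when your expected score is best if you report what you really believe:

```python
import math


def brier_score(p: float, outcome: int) -> float:
    """Squared error between the forecast and the 0/1 outcome (lower is better)."""
    return (p - outcome) ** 2


def log_score(p: float, outcome: int) -> float:
    """Negative log of the probability given to what happened (lower is better)."""
    prob_of_what_happened = p if outcome == 1 else 1.0 - p
    return -math.log(prob_of_what_happened)


def expected_score(score, reported: float, true_p: float) -> float:
    """Average score for reporting `reported` when the event has chance `true_p`."""
    return true_p * score(reported, 1) + (1.0 - true_p) * score(reported, 0)


# Hypothetical event with a true 10% chance: which report scores best on average?
grid = [i / 100 for i in range(1, 100)]
true_p = 0.1
best_brier = min(grid, key=lambda q: expected_score(brier_score, q, true_p))
best_log = min(grid, key=lambda q: expected_score(log_score, q, true_p))
print(best_brier, best_log)  # 0.1 0.1

# Penalty when the event happens after a 1% forecast versus a 50% forecast.
print(log_score(0.01, 1), log_score(0.5, 1))
print(brier_score(0.01, 1), brier_score(0.5, 1))
```

Under the log score the penalty for a confident miss grows without bound as the stated probability approaches zero, while the Brier penalty is capped at 1.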

There are limits to that too, of course -- the folks in the article will "only" have made on the order of mid hundreds to low thousands of predictions, so roughly speaking, you can expect these people to be calibrated for 1% or 0.5% odds but probably not 0.1% odds.


Base rates work pretty well, at least for all-cause mortality... hurricane counts per year, and financial markets (over very short time periods). I was using the Good Judgment project as motivation to practice R programming for a while, until one day I saw that literally EVERY person forecasting the ending value of the Hang Seng index had tied with my probabilities. Therefore, EVERYONE was calculating base cases from historical market data and entering those results.


I think you mean ~1%, not ~10%. I’m not sure that the rest of what you describe is really what is meant by “base rate”. I think you’ve mixed a few concepts together:

- if you estimate the probability based on bad data, you get a bad answer.

- the base rate is a very simple model for the chance something happens – count similar events and divide by the number of potential similar events. One might describe it as an early prior before considering other information

- whether or not something (the eruption) happens in a year

- the probability one predicts for an event when considering more information. For example with the ‘pressure building’ model of the eruption you might decrease the probability immediately after an eruption and increase if it’s been a while (and increase a lot if smoke is coming out)

- sure it’s easy to be right predicting that unlikely events won’t happen. I think the claim of the OP is more that one may hope that the good prediction record transfers to the prediction of unlikely events
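The difference between the simple base rate and the "pressure building" model can be sketched in a few lines. This is only an illustration: the Weibull form, the shape value of 3, and the function names are assumptions chosen for the example, not anything a volcanologist would sign off on.

```python
import math


def base_rate(events: int, opportunities: int) -> float:
    """Count similar events and divide by the number of opportunities."""
    return events / opportunities


def next_year_prob(years_quiet: float, mean_interval: float, shape: float) -> float:
    """P(eruption within the next year | quiet for `years_quiet` years).

    Assumes Weibull-distributed gaps between eruptions. shape > 1 means the
    hazard rises with time since the last eruption; shape == 1 is the
    constant-hazard case.
    """
    scale = mean_interval / math.gamma(1.0 + 1.0 / shape)

    def cumulative_hazard(t: float) -> float:
        return (t / scale) ** shape

    added_hazard = cumulative_hazard(years_quiet + 1.0) - cumulative_hazard(years_quiet)
    return -math.expm1(-added_hazard)


print(base_rate(10, 1000))  # 0.01
for quiet in (1, 50, 100, 150):
    constant = next_year_prob(quiet, 100.0, shape=1.0)
    building = next_year_prob(quiet, 100.0, shape=3.0)
    print(quiet, round(constant, 4), round(building, 4))
```

With `shape=1.0` the yearly probability stays flat near the base rate; with the hypothetical `shape=3.0` it starts close to zero right after an eruption and keeps climbing the longer the volcano stays quiet.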


I think it's important to call out that 'base rate' means very different things to people who have taken an introductory Stochastics class. When you can bring in the memorylessness property of the exponential distribution, Poisson distributions and gamma distributions you can get some non-intuitive results.
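A small sketch of the kind of non-intuitive result this gives, assuming (purely for illustration) that eruptions follow a Poisson process at one per 100 years, which is exactly the constant-hazard assumption the volcano example argues against:

```python
import math


def exp_survival(t: float, rate: float) -> float:
    """P(waiting time > t) for an exponential waiting time."""
    return math.exp(-rate * t)


def poisson_pmf(k: int, mean: float) -> float:
    """P(exactly k events) when the expected count is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


rate = 1 / 100  # assumed: one eruption per 100 years on average

# Memorylessness: having already waited 99 quiet years changes nothing.
p_quiet_next_year_given_99 = exp_survival(100, rate) / exp_survival(99, rate)
p_quiet_first_year = exp_survival(1, rate)
print(p_quiet_next_year_given_99, p_quiet_first_year)

# Chance that a "once per century" volcano shows no eruption in 99 years.
p_no_eruption_99 = poisson_pmf(0, rate * 99)
print(round(p_no_eruption_99, 3))
```

Under these assumptions roughly a third of such volcanoes would show a clean 99-year record, so an empty record on its own says little about the underlying rate.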


A hermit lived by the volcano, and every day he would put up a sign for all the tourists hiking up to see it: "100% accurate prediction: Won't Erupt Today."

One day, 30 years later, the volcano's caldera began to spit magma and bubble with gas, so that morning the old hermit walked outside and changed his sign: "99.99% accurate"...


That is why it is important not only to consider the accuracy, but also the information gain of a prediction.
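One common way to put a number on this (an assumption about what "information gain" means here, not a definition given in the thread) is the log-likelihood improvement of a forecast over a reference forecast, measured in bits. The hermit is modelled below as someone who always states the long-run base rate; the second forecaster and the 0.6 figure are hypothetical.

```python
import math


def accuracy(forecasts, outcomes, threshold=0.5):
    """Share of days where the thresholded forecast matched the outcome."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(forecasts, outcomes))
    return hits / len(outcomes)


def total_log2_likelihood(forecasts, outcomes):
    """Sum of log2 of the probability each forecast gave to what happened."""
    return sum(math.log2(p if y else 1.0 - p) for p, y in zip(forecasts, outcomes))


def information_gain_bits(forecasts, reference, outcomes):
    """Total bits gained over the reference forecast (0 means no added information)."""
    return total_log2_likelihood(forecasts, outcomes) - total_log2_likelihood(reference, outcomes)


days = 30 * 365
outcomes = [0] * (days - 1) + [1]        # one eruption, on the last day
base = 1 / days                          # long-run daily base rate
hermit = [base] * days                   # never moves off the base rate
watcher = [base] * (days - 1) + [0.6]    # hypothetical: reacts to the bubbling caldera

print(accuracy(hermit, outcomes), accuracy(watcher, outcomes))
print(information_gain_bits(hermit, hermit, outcomes))
print(information_gain_bits(watcher, hermit, outcomes))
```

The two forecasters are nearly indistinguishable on accuracy, but only the one who moved off the base rate on the right day gains any bits over it.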


How are you calculating and using this information?


> base rate reasoning using the past 99 years of data

Like many things, if you do it badly it doesn’t work. If you had that little data, you’d look at the rate across many similar volcanoes.

Focusing on base rate makes you more effective than others because people tend to only focus on the delta from the base rate. Tensions are “elevated”. Ok, elevated from what? People don’t actually ask that. They pull a number out of thin air and double it.

If you intentionally consider “what is the base rate?” and “how is this different from the base case?” you empirically end up with better results.

You’re also not married to the base rate. If you think a factor makes the odds 100x higher, go for it. You just have to say that explicitly.


Isn’t nuclear war specifically a bad thing to try and make predictions about? The cheat is to always predict no.

It seems unlikely that the U.K. is getting nuked alone, they have a bunch of nuclear armed allies. Even if you don’t believe in quantum immortality, if you are predicting ‘yes,’ you only get to collect your points in the ‘yes, and I survive, and so does the betting market’ case.


Cute zeroth order approximation, but you can also shift consumption forward if you think a nuclear war is near, or devote more resources to preventing it if you think the probability is higher.


> if you had a volcano that erupts roughly every 100 years, base rate reasoning using the past 99 years of data would suggest that the probability is 0

Sparsity is a problem whenever you use data to predict or model something. Your example here is essentially subsampling only zeros from a sparse time series. The existence of sparsity isn't a new insight that invalidates everything. It's a challenge to be overcome by careful practice.

> when in reality your base rate in the year following an eruption is 0 with every passing year your probability of an eruption would increase & increase past 10% for every year past 100 that goes without explosion.

Sure, with large amounts of prior knowledge, you can do better than a naive base rate starting point. I'm sure that practitioners know this. Even in this contrived example, the base rate would have been a good first guess.

> predicting a rare event doesn't happen isn't that difficult

Doing it accurately (in the sense of having a low Brier score) is apparently difficult.


OT, but are you talking about a specific situation here?

> something like war where pressures build up and war becomes more likely rather than less.

The likelihood of war between the UK and US, for example, has not steadily increased.



