Hacker News | st-keller's comments

Happens a lot to highly reasonable, well-written texts without orthographic errors.

So countries with free education, universal healthcare, 24 days of mandatory vacation, and a retirement age of 63 are losers?


There is a social meme going around about Mississippi being wealthier than Bordeaux, France [1] that relates to the “Mississippi Question” (the question is whether a country is poorer than Mississippi, which is used as a benchmark for low income and wealth within the U.S.). Broadly speaking, it speaks to GDP and economic numbers being poor metrics for tracking happiness and quality of life [2] [3] [4]. Mississippi's GDP per capita is higher than that of Spain, Italy, the UK, and France; where would you rather live as a human?

Edit: I do like that this uses time as a measure versus fiat; fiat can be gamed, time consumed to meet a need cannot.

[1] https://www.threads.com/@sarahbesingrand/post/DJGCx47sqPI/th...

[2] https://mises.org/mises-wire/britain-france-and-spain-poorer...

[3] https://uk.finance.yahoo.com/news/poorest-us-state-rivals-ge...

[4] https://www.usnews.com/news/best-states/mississippi


Is Mississippi really that bad? The economy is indeed in bad shape somewhere like the UK.


I was born in Oklahoma. There is a phrase there.

"Thank God for Mississippi!"

Otherwise Oklahoma would be last in basically every single metric used to compare states. That was truer then than it is now. Mississippi has made major changes to its education system, and it has been paying off. I'm not too familiar with how they're doing on other metrics.


It also ignores quality of the goods.

But more importantly, it ignores what you can do with the extra hours you work. As with PPP, it ignores that an iPhone isn't cheaper (it may not be a basic necessity), but many globally traded goods that go beyond basic needs aren't affected by PPP.


Scope note: This index is a floor affordability ratio for non-tradables (rent, utilities, basic food, transit). It doesn’t rate product quality or luxury/tradable goods.

Quality: True—quality varies. We use a fixed, minimal basket to avoid hedonic debates; I can add sensitivity bands by quality/spec.

“What extra hours buy”: Good point. We’ll add Discretionary Hours = paid hours/month − hours to essentials to show room for non-essentials, saving, leisure.

iPhone / tradables: Different lens. Many tradables price similarly across countries; essentials are mostly local/non-tradable and drive this metric. We can add a companion “tradables-hours” (e.g., hours to buy an iPhone/streaming bundle).

Takeaway: Essentials-hours ≠ welfare. It’s one clean piece—time to cover basics—best paired with discretionary and tradables views.
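As a sketch, the two time-based measures described above can be written down directly; all function names and figures here are hypothetical illustrations, not taken from the actual index:

```python
def essentials_hours(basket_cost, hourly_wage):
    """Hours of work needed to cover the monthly essentials basket (price / wage)."""
    return basket_cost / hourly_wage

def discretionary_hours(paid_hours, basket_cost, hourly_wage):
    """Paid hours per month left over after covering essentials."""
    return paid_hours - essentials_hours(basket_cost, hourly_wage)

# Hypothetical: a 1200-unit essentials basket, a 15/h net wage, 160 paid hours/month.
eh = essentials_hours(1200, 15)          # 80.0 hours to cover basics
dh = discretionary_hours(160, 1200, 15)  # 80.0 hours left for everything else
```

The "tradables-hours" companion mentioned above would be the same ratio with a tradable good's local price in the numerator.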


Don't get me wrong: I think these metrics are interesting.

But not useful for comparing Bolivia and Sweden.

But comparing Nordic countries this way makes a bit of sense.

Comparing emerging economies this way might also make some sense.

But there are always lots of welfare aspects not covered by these metrics.


Not a welfare ranking. This measures one thing: hours of work to cover a small monthly essentials basket (rent, utilities, basic food, transport) = price ÷ wage. Healthcare/education/vacation are mostly outside that basket or already baked into hourly pay.


So, to put this another way, this is supposed to measure the average absolute buying power of a worker and compare it by country? Where I can see the disconnect is that a lot of European countries have their social programs baked into the wage (e.g., I expect to get paid less working in Sweden as opposed to the US because I'm getting some of that 'back' through social services); however, that isn't always the case for some countries. For example, I'm not deducting healthcare costs from my salary in the US, but I'm still paying for it after the fact, which decreases my spending power.

There are too many specific variables to account for here; I feel like the general comparison is hurtful at worst and doesn't tell the whole story at best.


More speech! The signal-to-noise ratio shifts, so access to information will become more difficult. More disinformation and outright nonsense will make it harder to get to the valuable stuff. OK, let's see how that works!


I'm not setting boundaries, and I think you will notice when to shift! When no one and no book seems to teach you anything new about a certain topic. When you read the latest book and think, "hmm, yes, but I think they omitted a and b and c, and in reality this isn't so nicely separated and is way more nuanced." Then it is definitely time to shift your attention to a different topic. I've been doing this for decades now, and I think lots of others have too. You will never learn everything about everything. But it's fun to try :-)


That's exactly why I like AI too. I even let them play roles like "junior dev", "product owner", or "devops engineer" and orchestrate them to play together as a team, with guidance from me (usually the "solution architect" or "investor")! This "team" achieves in weeks what we usually needed months for, at €2.40/h per role!


This sounds a bit like creating a Humanoid robot to do dishes instead of a machine specifically designed to do dishes.


I can't tell if you are being sarcastic, but this sounds absurd. Why let the AI be a junior? Why not an expert?

This persona-driven workflow is so weird to me. It feels like being stuck in old ways.


Avoiding context bloat and scoping the chain of thought


"This renders the meaning of significance-testing unclear; it is calculating precisely the odds of the data under scenarios known a priori to be false."

I cannot see the problem with that. To get to meaningful results we often calculate with simplified models, which are known to be false in a strict sense. We use Newton's laws; we analyze electric networks based on simplifications; a bank year used to be 360 days! It works well.

What did I miss?


The problem is basically that you can always buy a significant result with money (a large enough N always leads to a "significant" result). That's a serious issue if you see research as a pursuit of truth.
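The "large enough N" point can be made concrete: for a fixed sample correlation r, the usual t statistic grows like the square root of N, so any nonzero r eventually crosses the 5% threshold. A small illustrative sketch (function names are mine, not from any cited source):

```python
import math

def corr_t_stat(r, n):
    """t statistic for testing H0: rho = 0, given sample correlation r and sample size n."""
    return r * math.sqrt((n - 2) / (1 - r * r))

def n_for_significance(r, t_crit=1.96):
    """Smallest sample size at which a fixed correlation r crosses the critical value."""
    n = 4
    while corr_t_stat(r, n) < t_crit:
        n += 1
    return n

# A "negligible" r = 0.02 still becomes significant once N reaches the high thousands.
n_tiny = n_for_significance(0.02)
```

So significance alone says nothing about whether the effect is large enough to matter, which is the thread's point about effect sizes.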


Reporting effect size mitigates this problem. If the observed effect size is too small, its statistical significance isn't viewed as meaningful.


Sure (and of course). But did you see the effect size histogram in the OP?


Are you referring to the first figure, from Smith et al., 2007? If so, I couldn't evaluate whether gwern's claim makes sense without reading that paper to get an idea of, e.g., sample size and how they control for false positives. I don't think it's self-evident from that figure alone.

One rule of thumb for interpreting (presumably Pearson) correlation coefficients is given in [0] and states that correlations with magnitude 0.3 or less are negligible, in which case most of the bins in that histogram correspond to cases that aren't considered meaningful.

[0]: https://pmc.ncbi.nlm.nih.gov/articles/PMC3576830/table/T1/


I’m not arguing that there’s something fundamentally wrong with mathematics or the scientific method. I’m arguing that the social norms around how we do science in practice have some serious flaws. Gwern points out one of them. One that IMHO is quite interesting.

EDIT: I also get the feeling that you think it’s okay to do an incorrect hypothesis test (c > 0), as long as you also look at the effect size. I don’t think it is. You need to test the c > 0.3 hypothesis to get a mathematically sound hypothesis test. How many papers do that?
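For what it's worth, a test of c > 0.3 can be sketched with the standard Fisher z-transform; the construction is textbook, but the function and the sample numbers here are purely illustrative:

```python
import math

def fisher_z_test(r, n, rho0=0.3):
    """One-sided z statistic for H0: rho <= rho0 vs H1: rho > rho0.

    Uses the Fisher transform atanh(r), whose standard error is
    approximately 1/sqrt(n - 3) for a bivariate normal sample.
    """
    return (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)

# Hypothetical sample: r = 0.35 with n = 100.
z_vs_zero = fisher_z_test(0.35, 100, rho0=0.0)  # ~3.6: "significant" vs rho = 0
z_vs_03 = fisher_z_test(0.35, 100)              # ~0.55: not significant vs rho = 0.3
```

The same observed correlation sails past a zero null but fails the c > 0.3 null, which is the distinction being argued here.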


My opinion of Gwern's piece is that some of the arguments he makes don't require correlations. For example, A/B tests of differences in means using a zero difference null hypothesis will reject the null, given enough data.

In that A/B testing scenario, I think if someone wants to test whether the difference is zero, that's fine, but if the effect size is small, they shouldn't claim that there's any meaningful difference. I believe the pharma literature calls this scenario equivalence testing.

Assuming a positive difference in means is desirable, I think testing for a null hypothesis of a change of at least some positive value (e.g., +5% of control) is a better idea. I believe the pharma literature calls this scenario superiority testing.
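A minimal sketch of a one-sided test against a positive margin, in the spirit described above (the summary statistics and the margin are made up for illustration):

```python
def superiority_z(mean_t, mean_c, se_diff, margin):
    """z statistic for H0: (treatment - control) <= margin vs H1: difference > margin."""
    return (mean_t - mean_c - margin) / se_diff

# Hypothetical A/B summary: control 10.0% conversion, treatment 10.4%,
# standard error of the difference 0.2%, margin +0.5pp (5% of control).
z_vs_zero = superiority_z(0.104, 0.100, 0.002, 0.0)      # 2.0: "significant" vs a zero null
z_vs_margin = superiority_z(0.104, 0.100, 0.002, 0.005)  # -0.5: fails the margin test
```

The contrast shows why the margin matters: the same data clear a zero null comfortably but fall short of a modest practical-relevance bar.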

I believe superiority testing is preferable to equivalence testing, and in professional settings, I have made this case to managers. I have not succeeded in persuading them, and thus do the equivalence testing they request.

I don't think the idea of a zero null hypothesis is necessarily mathematically unsound. In cases like the difference in means, a zero null hypothesis is well-posed. However, I agree with you that there are better practices, like a null hypothesis incorporating a nonzero effect.

I don't entirely agree with the arguments Gwern puts forth in the Implications section because some of them seem at odds with one another. Betting on sparsity would imply neglecting some of the correlations he's arguing are so essential to capture. The bit about algorithmic bias strikes me as a bizarre proposition to include with little supporting evidence, especially when there are empirical examples of algorithmic bias.

What I find lacking about Gwern's piece is that it's a bit like lighting a match to widespread statistical practice and then walking away. Yes, I think null hypothesis statistical testing is widely overused, and statistical significance alone is not a good determinant of what constitutes a "discovery". I agree that modeling is hard, and that "everything is correlated" is, to an extent, true because the correlations are not literally or exactly zero. But if you're going to take the strong stance that null hypothesis statistical testing is meaningless, I believe you need to provide some kind of concrete alternative. I don't think Gwern's piece explicitly advocates an alternative, and it only hints that the alternative might be causal inference. Asking people who may not have much statistics training to leap from frequentist concepts taught in high school to causal inference would be a big ask. If Gwern isn't asking that, then I'd want to know what a suggested alternative would be. Notably, Gwern does not mention testing for nonzero positive effects (e.g., in the vein of the "c > 0.3" case above). If there isn't an alternative, I'm not sure what the argument is. Don't use statistics, perhaps? It's tough to say.


Thanks for the extensive answer.

> I don't think the idea of a zero null hypothesis is necessarily mathematically unsound. In cases like the difference in means, a zero null hypothesis is well-posed. However, I agree with you that there are better practices, like a null hypothesis incorporating a nonzero effect.

I don't think a zero null hypothesis is mathematically unsound, of course. But I think it is unsound to do one and then look at the effect size as a known quantity. It's not a known quantity; it's a point estimate with a lot of uncertainty. The real underlying correlation may well be a lot lower than the point estimate.

And of course it’s hard to get people in charge interested in better hypothesis testing. That testing will result in fewer conclusions being drawn / fewer papers being published. It’s just another symptom of the core issue: it’s quite convenient to be able to buy the conclusions you want with money.


Back when I wrote a loan repayment calculator, there were 47 different day-count conventions in common use (used in calculating payments for incomplete repayment periods; e.g., for monthly payments, what is the 1st-13th of Aug 2025 as a fraction of Aug 2025?).
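One of those conventions, the common 30/360 (Bond Basis) rule, can be sketched as follows; this is a generic illustration of the convention, not the calculator's actual code:

```python
def days_30_360(y1, m1, d1, y2, m2, d2):
    """30/360 (Bond Basis) day count: every month is treated as 30 days."""
    d1 = min(d1, 30)
    if d1 == 30:
        d2 = min(d2, 30)
    return 360 * (y2 - y1) + 30 * (m2 - m1) + (d2 - d1)

# 1st-13th of Aug 2025 counts as 12 "days", i.e. 12/30 of a 30-day month,
# even though Aug actually has 31 days.
frac_of_month = days_30_360(2025, 8, 1, 2025, 8, 13) / 30
```

Under an Actual/Actual convention the same period would instead be 12/31, which is exactly why the choice of convention matters for partial periods.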


There is a known maximum error introduced by those simplifications. Put the other way around, Einstein is a refinement of Newton. Special relativity converges towards Newtonian motion for low speeds.

You didn't really miss anything. The article is incomplete, and wrongly suggests that something like "false" even exists in statistics. Really, something is only false "with an x% probability of it actually being true nonetheless". Meaning that you have to "statistic harder" if you want to get x down. Usually the best way to do that is to increase the number of tries/samples N. What the article gets completely wrong is that for sufficiently large N, you don't have to care anymore, and might as well use false/true as absolutes, because you pass the threshold of "will happen once within the lifetime of a bazillion universes" or something.

The problem is, of course, that lots and lots of statistics are done with a low N. The social sciences, medicine, and economics are necessarily always in the very-low-N range, and therefore always have problematic statistics. They try to "statistic harder" without being able to increase N, thereby just massaging their numbers enough to prove a desired conclusion. Or they increase N a little, claiming to have escaped the low-N problem.


A frequentist interpretation of inference assumes parameters have fixed, but unknown values. In this paradigm, it is sensible to speak of the statement "this parameter's value is zero" as either true or false.

I do not think it is accurate to portray the author as someone who does not understand asymptotic statistics.


> it is sensible to speak of the statement "this parameter's value is zero" as either true or false.

Nope. The correct way is rather something like "the measurements/polls/statistics x ± ε are consistent with this parameter's true value being zero", where x is your measured value and ε is some measurement error, accuracy, or statistical deviation. x will never really be zero, but zero can be within the interval [x - ε; x + ε].


As you yourself point out, a consistent estimator of a parameter converges to that parameter's value in the infinite sample limit. That limit is zero or it's not.


It's a quantitative problem. How big is the error introduced by the simplification?


OK, is this about a reactive app with a local database automatically synced to a remote DB? All fully encrypted (at rest and in transit)? I thought this is what everyone does nowadays! We built an app like this in 2019; yes, it was a bit of a challenge with the encryption, but the "syncing data" part is what every little multiplayer game has had to deal with forever now. It seems I'm out of touch with the current state of affairs. Nice article, though!


The value is in the author’s experience with other tools that caused problems as his database grew, and in learning from his reasoning about the appropriate solution for his particular problem.


The availability of nutrient-rich soil to grow things is one thing; google some facts about fertilizers and the resources they are made from. Clean water is also not something you can take for granted.


What facts? There was a big scare about ten years ago about how we were going to run out of phosphorus within about five years. Look up 'peak phosphorus'.

Obviously that hasn't happened.


I also like showing up at places and not giving a shit what other people think. It's even more fun not being wealthy at all.


Wow, nice to know that this old story has survived for so long! I remember reading it a long time ago. Has this phenomenon been replicated by anyone, or has someone invented something because of it?

