We're working on a large Rust codebase, heavily assisted development with Claude and Codex, and one critical workflow is after you have written a spec, have the other LLM critique it thoroughly.
This back and forth will take quite a while, but the resulting implementation plan will be 10x better than the original.
You can automate this by giving Codex a goal, and a skill to call Claude to review the implementation spec until they both agree it's done.
Then, for critical code, have them both implement the spec in a worktree, then BOTH critique each other's implementation.
More often than not, Claude will say to take 2 or 3 pieces from its design over to Codex, but ship the Codex implementation.
I take this idea even further: After the LLMs have critiqued each other, I introduce a third critique and review it myself as a human. This third party review is most effective at highlighting problems that the LLMs miss, in my experience.
Jokes aside, I agree about having LLMs iterate. Bouncing between GPT and Opus is good in my experience, but even having the same LLM review its own output in a new session started fresh without context will surface a lot of problems.
This process takes a lot of tokens and a lot of time, which is fine because I’m reviewing and editing everything myself during that time.
as someone who is about as llm-forward as anyone out there, this is a brilliant analogy. was equally true of all the “prompt engineer” hype as well from a couple years ago (which i admit i still think does matter)… it kinda makes me feel like an audiophile / hi-fi person talking about how 24bit/192kHz is the one true encoding format and anything less is a willful (cynical, “Quality”-hating, satisficerist, etc.) compromise. which i freely admit to being one of those people as well.
and in both cases i both “know” that i can tell the difference and “know” that i cannot tell the difference. what anyone takes from that in terms of what it says about me, personally, is a bit of a Rorschach test, but Astrology is about as apt a description as there is… xD
For higher than audible frequency sample rates there's a good chance you can tell the difference. It often causes weird aliasing and harmonics in the more audible frequencies on "real" playback equipment. You can train yourself to recognize some of these and often pretty accurately identify the higher sample rate examples. You might even mentally associate those signs with "Higher Quality".
But it's arguably less accurate to the original recording.
Didn't multiple studies find the reasoning traces didn't have much to do with the final output? And even that outputting placeholder tokens during reasoning has a similar beneficial effect on benchmark scores?
(I don't think that's the full picture but, there's definitely something fishy going on there.)
reasoning itself just affords the model a ton of extra forward passes / "time to think"
the, como se dice, "misalignment" between the content of reasoning tokens and the actual output following the end of the reasoning is a separate problem, extensively studied by e.g. Anthropic
> Unless you can somehow provide some arguments against it,
We're year 4 into this discussion and camps have only gotten more bifurcated. There's no 1-1 discussion to have about this as of now, at least not before the crash.
Your only hope in such discourse is not trying to convince the other party how wrong they are, but appealing to an as of yet undecided party. Be it with reason, or simply pointing out how absurd some comments sound to the average person.
> Your only hope in such discourse is not trying to convince the other party how wrong they are
I don't care about convincing anyone, the ones I reply to or others, but if you take the time to leave a comment, at least make it something to read and think about instead of soundbites like "This is astrology for devs", it's plain boring to read and makes HN worse.
i legitimately cannot divine what you are saying at all with this. there are so many dangling antecedents and modifiers that it is completely impossible. and i say this out of a genuine desire to understand what your argument is, knowing full well that i likely disagree with it.
Alright, let me explain, hopefully simpler: GP told us about their experience working with LLMs, and gave some pointers to what they found to be working. The comment I replied to just says "This is astrology for devs", which is basically a cheap putdown without any reasoning or arguments for why the commenter believes so. My comment is urging them to actually participate in the discussion, not just post a soundbite they thought of in five seconds, so HN as a whole can remain good instead of devolving into reddit (which is a tale as old as HN, I know).
Hopefully it's understandable now, and hopefully you don't disagree :)
You can't be serious. It couldn't be more obvious what the poster was referring to: a drive-by put-down comment with no attempt to discuss anything seriously is more highly upvoted than an objection to such a comment.
What is this place for? Dang tells us, curious discussion. The guidelines explicitly state that certain comments are not in the spirit.
But the community seems to have decided otherwise, which is a shame.
Don't read too much into it, downvotes/upvotes are highly random here, saying the same thing twice will have different reactions depending on the time of day and the topic of the submission, seems certain crowds are drawn to certain topics, which isn't that surprising.
I don't mind the downvotes, the points aren't really the reason I'm here anyways, I just want fun and interesting discussions with people and to read others' perspectives, the points don't hinder that :)
This is precisely how I used to use Beads before I made GuardRails (I wanted something slightly simpler, but similar with more 'guard rails'). I braindump everything I want to build, I ask Claude to do market level research. I then ask Claude to ask clarifying questions, then I ask Claude to be critical of its conclusions, provide the top options, and justify them. I also question Claude and say it's okay to disagree with me, be critical, I just want to understand.
By the end you have piecemeal "tickets" for your coding agent, if you have multiple developers you can sync them all up into GitHub, and someone could take some locally, or you can just have Claude work on all of them with subagents. The key feature there is that because it's all piecemeal the context stays per task.
Then I run a /loop 15m If you're currently working ignore this. Start on the next task in gur if you have not. If you finished all work and cannot pass one gate, work on the next available task.
(Note: gur is my shorthand for GuardRails)
I also added a concept called "gates" so a task cannot complete without an attached gate, gates are arbitrary, they can be reused but when assigned to a task those specific assignments are unique per task. A task is basically anything you want it to be: unit test, try building it, or even seek human confirmation. At least when I was using Beads it did not have "gates" but I'm not sure if it has added anything like it since I stopped using Beads.
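To make the "gates" idea concrete, here is a minimal sketch. It is not GuardRails' actual data model: the `Gate` and `Task` classes and the `complete` method are hypothetical names invented for illustration, and the only rule taken from the comment is that a task cannot complete without an attached gate.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Gate:
    """A named check; hypothetical stand-in for a unit test run, a build, or a human sign-off."""
    name: str
    check: Callable[[], bool]


@dataclass
class Task:
    title: str
    gates: List[Gate] = field(default_factory=list)
    done: bool = False

    def complete(self) -> bool:
        """Mark the task done only if it has at least one gate and every gate passes."""
        if not self.gates:
            raise ValueError(f"task {self.title!r} has no gate attached")
        self.done = all(gate.check() for gate in self.gates)
        return self.done


# Example: one task gated on a (pretend) passing test run.
task = Task("add retry logic", gates=[Gate("unit tests pass", lambda: True)])
task.complete()
```

A failing gate leaves `done` as `False`, so an agent loop can move on to the next available task instead of declaring victory.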
Claude will ignore the loop if it's currently working, and when it's "out of work" it will review all available tasks.
If anyone's curious it's MIT Licensed and on GitHub:
I hate how seriously people take the output of LLMs or how reliable they think it is.
Have Claude produce that spec 10 times, use the same prompt and same context. Identical requests, but you'll get 10 unique answers that will contradict each other, with each response seeming extremely confident.
It's scary how confident you people are in these outputs.
If you ask 10 different humans to produce the spec with the same information (prompt and context) they will also produce 10 unique answers that will contradict each other and (depending on who you asked) may be just as confident.
There are real decisions to be made when going from a vague prompt to a spec. It's not surprising that an LLM would produce different specs for the same work on different runs. If the prompt already contained answers to all the decision points that come up when writing the spec then the prompt would already be the spec itself.
I didn’t claim that LLMs are people or that they reason.
If the behavior of the llm is the same as the behavior of reasonable people then the behavior of the llm is reasonable, regardless of how black of a box they generate tokens out of.
Reasonable people will generate divergent specs for the same prompt. Thus it is reasonable for an LLM to generate divergent specs out of the same prompt.
Edit: I use “reasonable” here in the legal sense of the “reasonable person” standard, not to imply any reasoning process.
Aren’t people pattern matching neural networks as well? Why does being a token generator mean something is unreliable?
Further, why does that mean “it doesn’t reason”? Logic can be encoded in language, symbols or code. If I say “all apples are red” -> “all fruit in the bowl are apples” -> “therefore all the fruit are red”. It doesn’t really matter if I understand the logic or what red is or fruit/apples are, the logic is contained in the structure of the syntax. If an LLM can output the conclusion reliably from predictive operations it is able to have the effect of reason and we don’t need to know or care about whether it “understands” the reasoning.
it's an analogy, it didn't fall on its face at all. it's just a comparison to highlight the point being made was nonsensical. example: you're just a next action generator controlled by trillions of cells and subconscious dna-based behavior. a black box.
moral agency and the ability to learn are implicit in the description you quoted. this isn't some special superpower, all animals have the ability to learn, and many have moral agency. these aren't human specific traits
why do people insist on claiming that they don’t reason, when they clearly, for all intents and purposes, do. you can be vague; you can express your idea a thousand different ways, and you will get a unique blend of <your input bits> x <hidden reasoning layer> => semi-smoothed output. this is like some Searle Chinese Room bullshit that needs to just die. it is beyond clear that llms can interact with abstract concepts in an extremely meaningful way. this is like the “thought leader” version of the stupid-ass “it’s just smart autocomplete” argument. if you think that, it is user error— either a failure of creativity or a failure of perception or both. just because llms are not a panacea and are problematic for society and “overhyped” and whatever does not make it intellectually honest to claim that there is zero reasoning/creativity/cognition within the box.
It appears they don't need to reason or be intelligent to be able to produce working solutions for code. But sure, let them run wild and unmonitored? I wrangle my LLMs like the code monkeys they are. They help materialize code and then you need to sculpt it (and build test harnesses of varying sorts).
It really can be useful. It's very different from old world programming.
They are token predictors that use statistical techniques to emit the randomly weighted next most likely token given the previous token list.
The result is a strange mimic of human reasoning, because the tokens it predicts are trained on strings that were produced by humans that were reasoning, but that's not the same thing.
Human cognition is complex and poorly understood, and the nature of the mind is an area of study almost as old as consciousness itself. We don't know exactly how it works, or what its exact relationship to the brain is, but we do know that it is not a simple token predictor.
LLMs, by their very nature are constrained to the concept of language and the relationship between existing words in a corpus. This is a box they can not escape.
Modern neuroscience suggests that the human brain is much more vast than that, and in many ways looks like it is constrained by language, but certainly not limited to it.
> They are token predictors that use statistical techniques to emit the randomly weighted next most likely token given the previous token list.
Sounds like an implementation detail. Now describe how human reasoning works and explain why that process of chemical and electrical signals results in "reasoning" whereas what LLMs do isn't.
The problem with being this reductive is you can do it to anything, including humans. You can’t be reductive about LLMs and refuse to be reductive about humans - that's poor reasoning, and an LLM would out-reason you on this point, further negating your case.
Human cognition is poorly understood and much more complex than it seems.
For an example, look at some of Julia Mossbridge's work.
If even a small part of her work is true and valid, it points to something far outside our current framework.
You don't need to go as far afield as Mossbridge, though - that's an extreme example. Pretty much any modern neuroscience will make you question a lot of assumptions, at least it did for me.
> For an example, look at some of Julia Mossbridge's work.
Never heard of her but I just spent about 5 minutes looking.
Her PhD is in communication sciences and disorders [1], but apparently she’s a quantum physicist now:
> AMELIA is built on the Causally Ambiguous Duration-Sorting (CADS) effect — a breakthrough discovery by Dr. Julia Mossbridge showing that light, under classical boundary conditions, behaves differently based on future temporal boundaries.
[2]
Filed under crank, not going to bother investigating further.
You have moved goalposts from reasoning to "human cognition". I won't tolerate that sort of slippery wordplay.
Reasoning is making analogies between logical patterns found in conceptual space, with a direction of time (statements precede conclusions). For example. A => B and B => C. You may now deduce A => C. For something fuzzier, A~D and B~E, you may now deduce that D~=>E. This is the sort of thing that higher layer attention mechanism is capable of doing.
> This is a box they can not escape.
Would you say that Helen Keller was less capable of abstract reasoning because she had more constrained access to sensory input?
Reasoning requires cognition, otherwise there's nothing to reason about, no context or value system to use as a basis for reason.
Decision making can be done by trained machines following rules, but that's different from reasoning. A thermostat isn't reasoning when it decides to turn on the air conditioner; to argue otherwise expands the definition of "reason" to be so broad that it becomes useless.
LLMs are trained on human knowledge and reasoning that results from human cognition, and they are excellent at stochastic mimicry - if the argument is that they are actually reasoning, then some sort of equivalent to human cognition must be present for that to be true. Lacking that, they are nothing more than "token extrusion machines" with some potentially useful characteristics.
Why does reasoning require cognition? Isn’t an if/else block or switch statement reasoning? Or a formal logic proof? If an LLM produces an output using formal logic or a python script why is that not reasoning? A human would offload the reasoning using similar methods. I know when I took the LSAT, I learned ways to diagram arguments and didn’t have to think/reason about it because the formal logic diagram did the “reasoning” for me.
Aren’t humans just “action potential” extrusion machines? What is unique about our neural pattern recognition to make our cognition different in nature rather than merely degree?
It seems clear at this point that the greatest insight that unlocked our current AI acceleration was scaling alone would unlock emergent properties and abilities.
The problem with that is LLMs can output words or symbols that seem like it used "reason" to produce. But for everything the core algorithm does, it's simply nothing like the wetware reasoning to get to the same answer. So he didn't move goalposts. He always meant the reasoning that stems from human cognition.
Technically if it has that, it'd be singularity no? So basically the premise is they are doing nothing of the sort. Probe any LLM enough and it really does show it has no qualms about contradicting itself or being bossed around. Has no belief / no orientation etc. It's truly mindless but tricks our mind and soul (or whatever) probably.
> Technically if it has that, it'd be singularity no?
reasoning is not black and white. It is possible to reason poorly. Most people cannot do basic math proofs, even math majors struggle with the hardest math proofs. Reasoning in humans is also context/token dependent. I just spent one HOUR trying to show my mom (who has mild dementia) how to use amazon fire (push DOWN until your channel shows up, push RIGHT until the channel becomes big) and she could not figure it out. Rewrote the instructions in japanese and she followed the logic relatively smoothly. Ironically, i'm pretty sure her english is better than her japanese, vocabulary wise.
> it's simply nothing like the wetware reasoning to get to the same answer.
but you don't know how wetware reasoning works, so you are incapable of making that proclamation. I'm pretty sure when I do math proofs (I'm not an amazing mathematician) sometimes I have to literally tick my way through each step of the proof, sometimes breaking it down to super-basic substeps, which to me feels an awful lot like what an LLM could be doing. For that matter we don't know how LLM reasoning works but my claim is that these LLMs are in principle capable of reasoning due to architecture.
If this doesn't make sense I suggest you look over the architecture of LLMs carefully and try to understand my point.
(BTW I'm not talking about "reasoning models" with "thinking turns", that's just marketing speak, I'm talking about ANY transformer-based model, even the "dumbest UX architecture" completion models)
The structure of language encodes logic in many ways. So the model's ability to reason may be an emergent property of the reasoning ability humanity has ejected and extracted from our neural networks and abstracted into language and symbols.
You are asking the wrong question. It's not about if you can do X which can be faked especially if you are given practically infinite tries and all failures are hidden.
The people who want to believe they actually reason just ignore all obvious evidence to the contrary and cherry pick the times reasoning was faked well enough.
The people who don't want to believe will just take a second to understand how they work and then come up with ways to reveal they were faking all along. Like asking how many letters there are in a word lol.
It's only the people who don't want to believe that count, because reality is what happens regardless of what you believe.
You seem to believe that something is only "reasoning" if it works in a particular way. That it's not enough for it to observationally display reasoning skills; it has to be using a particular method to do that so it's not "faking" it. Is that correct?
The issue is LLMs don't learn, despite the name. A human re-implementing a spec would strive to iterate towards what they feel is a better spec. They can take in their own input and self-correct. The work of implementing the spec gives insight into pain points and strengths, even if they never actually test the spec (they 100% should, but this is to emphasize that struggle for humans is in itself iteration, even before external feedback comes in).
An LLM isn't deterministic but also isn't iterative without an existing human. You give it the same spec 10 times and it produces 10 results that aren't far off from each other but are vastly different when you go into the weeds. And not different in a way of improvement.
An LLM should not "generate specs", a human should. The LLM can work from the specs. It can never infer meaning from a vague prompt. If so, it will start guessing. Every human that ever did functional specification or information analysis at some point knows this. Or has learned the hard way, something with assumptions and asses ;)
An LLM's guess for a vague prompt is better than your average developer's.
A prompt like "write these two files on disk" will very likely make the LLM do some sort of an atomic write/swap operation, unlike the average developer, who will just write the two files and maybe later encounter a race condition bug. You can argue the LLM output is overkill, but it will also be more robust on average.
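For readers who haven't seen the pattern, a minimal sketch of an atomic write/swap in Python follows. The helper name `atomic_write` is made up for illustration; the underlying calls (`tempfile.mkstemp`, `os.fsync`, `os.replace`) are standard library. Note this makes each file's update atomic on its own, it does not make the pair of writes one transaction.

```python
import os
import tempfile


def atomic_write(path: str, data: str) -> None:
    """Write data so that readers see either the old file or the new one, never a partial write."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before swapping
        os.replace(tmp_path, path)  # swap the new file into place
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The naive version is just `open(path, "w").write(data)`, which truncates the file first and leaves a window where a concurrent reader sees empty or half-written content.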
So what’s most important is knowing those parameters and the ranges of values, not having the final result. A human, after producing a spec, can then provide the mental model of how he created the spec: where the inflection points are and what the range of valid results is.
What has always mattered is how you decide the specs, not the specs in themselves.
> If you ask 10 different humans to produce the spec with the same information (prompt and context) they will also produce 10 unique answers
But they didn't ask humans, they asked a machine. We expect our machines to behave in predictable ways.
> If the prompt already contained answers to all the decision points that come up when writing the spec then the prompt would already be the spec itself.
This is one of the best arguments against using LLMs I've seen.
It reduces to the classic argument- at the point where you've described a problem and solution in sufficient detail to be confident in the results, you've invented a programming language.
> We expect our machines to behave in predictable ways.
I expect LLMs to produce randomly varying output. Maybe it's the thousands of hours I spent doing monte carlo simulations for my PhD.
> This is one of the best arguments against using LLMs I've seen.
> It reduces to the classic argument- at the point where you've described a problem and solution in sufficient detail to be confident in the results, you've invented a programming language.
I'm not an LLM true believer, but I use codex for various small tasks and it often (not always) does a thoroughly decent job. Yesterday I gave it a pretty vague request to set up a new Home Assistant dashboard and it handled it just fine--I told it what I wanted to see but it figured out itself which helper variables it would need to set up to realize that vision and wrote all the config for it.
I probably could have done it in 15 minutes if I was familiar with Home Assistant's yaml configuration schema and all, but I'm not so it probably would have taken me closer to an hour. Asking codex took me 30 seconds and it did just fine.
I am skeptical that LLMs are going to kill all white collar jobs or whatever anytime soon. Not being able to truly learn things is an issue. Reality has a surprising amount of detail[1], and while codex does well at things like writing Home Assistant configs and setting up a Minecraft server, where there are thousands of examples online of how to do it, when I've asked it to do some more esoteric things it has sometimes failed spectacularly. I don't think having the LLM keep notes and then read them back (filling up the context window) is a real solution here.
> It's not surprising that an LLM would produce different specs for the same work on different runs
This is what I don't understand: AI is a computer program with its own data. If we give the same input to that computer program every time, why does it produce different outputs every time? Or does the input include LLM data + our prompt + some random data that computer program picks from its Internet search?
LLMs have a temperature parameter. At zero temperature they are deterministic: they always choose the most likely next token at each step based on what came before and the model weights, and they will always generate the same output given the same input.
As you raise the temperature they will start (pseudo)randomly choosing tokens other than the single most likely token (though that one will still be the most likely to be chosen). It turns out this is almost always better than zero temperature, which has a tendency to get caught in repetitive loops. I imagine all the frontier labs have spent thousands (millions?) of CPU hours tuning the temperature parameters on their models for optimal performance.
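A toy sketch of what that looks like in code, assuming a made-up three-token vocabulary with made-up scores. The function name `sample_token` is hypothetical, and real inference stacks typically layer more on top (top-k, top-p, and so on), so treat this as an illustration of the idea rather than how any particular model is served.

```python
import math
import random


def sample_token(logits: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Pick one token from raw scores; temperature 0 means always take the top one."""
    if temperature == 0:
        return max(logits, key=logits.get)
    # Softmax over logits / temperature, shifted by the max for numerical stability.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())
    weights = {tok: math.exp(s - top) for tok, s in scaled.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]


logits = {"New York": 3.0, "Boston": 2.0, "Chicago": 1.0}  # made-up scores
rng = random.Random(0)
greedy = [sample_token(logits, 0, rng) for _ in range(5)]    # always the top token
warm = [sample_token(logits, 1.0, rng) for _ in range(1000)]  # mostly the top token, not always
```

At temperature 0 every call returns the same token; at temperature 1.0 the top token still dominates but the others show up too.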
> LLMs have a temperature parameter. At zero temperature they are deterministic: they always choose the most likely next token at each step based on what came before and the model weights, and they will always generate the same output given the same input.
"A value proportional to the reciprocal of β is sometimes referred to as the temperature: β = 1/kT, where k is typically 1 or the Boltzmann constant and T is the temperature. A higher temperature results in a more uniform output distribution (i.e. with higher entropy; it is "more random"), while a lower temperature results in a sharper output distribution, with one value dominating."
"Temperature" in the context of softmax does not change the "winning" token, it changes how much more probable (in the sense of the softmax distribution) the winning token will be. If the winning token is "New York", it will be the winner with a temperature close to 0 and with a temperature of 1e9.
The actual selection of the random token is done separately by using inputs outside of the softmax distribution, for example, by using a random number generator. I believe most LLM configs have a seed for the random number generator.
More than that, generation of code in most programming languages is done with more guardrails, such as beam search guided by schema, syntax and semantics.
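A small sketch of both claims, using made-up logits: the most probable index stays the same across temperatures while its probability shrinks, and the randomness comes from a separate generator that can be seeded. The helper names `softmax` and `draw` are illustrative only.

```python
import math
import random


def softmax(logits: list[float], temperature: float) -> list[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [3.0, 2.0, 1.0]  # index 0 plays the role of "New York"

# The argmax is index 0 at every temperature; only its margin changes.
for t in (0.1, 1.0, 100.0):
    probs = softmax(logits, t)
    print(t, probs.index(max(probs)), round(max(probs), 3))


def draw(seed: int, temperature: float, n: int = 20) -> list[int]:
    """Sample n indices; the RNG is separate from softmax and fixed by the seed."""
    rng = random.Random(seed)
    return rng.choices(range(len(logits)), weights=softmax(logits, temperature), k=n)
```

Calling `draw(42, 1.0)` twice gives the same sequence, which is the sense in which a fixed seed makes sampling repeatable in this toy setup. Whether a hosted model is repeatable end to end depends on more than the seed.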
But those differences fall within a band of generally accepted results don’t they? And the cost to throw the code away and reimplement is low now. So maybe it doesn’t really matter if the implementation is perfect or identical.
That being said I agree people trust AI too much. Especially people with less experience. It’s easy to forget the models are mirrors of who we are as the drivers of the input context, not mentors that will guide us to best practices reliably.
I strongly believe you don’t need to call another model for that. The same model can do the review just fine. Just not as part of the same context.
I mean that if you ask codex on gpt 5.5 to submit to a plan reviewer subagent that uses gpt5.5, this is enough to get a very good review and reassessment of the plan.
My hypothesis is that it’s even better than opus.
The reason submitting the product of one LLM to another for review works is that you need a fresh trajectory. The previous context might have “guided” the planner into some bias. Removing the context is enough to break free from that trajectory and start fresh.
>We're working on a large Rust codebase, heavily assisted development with Claude and Codex, and one critical workflow is after you have written a spec, have the other LLM critique it thoroughly.
I do this with other languages, too, not just Rust. Thing is, you have to put a hard stop at some point because the models will always find something to nitpick.
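A minimal sketch of that loop with the hard stop built in. Everything here is hypothetical: `review` and `revise` stand in for whatever calls your two models (or two fresh sessions of the same model), and `max_rounds` is an arbitrary cap, not a recommended value.

```python
from typing import Callable, List, Tuple


def review_until_agreed(
    spec: str,
    review: Callable[[str], List[str]],
    revise: Callable[[str, List[str]], str],
    max_rounds: int = 5,
) -> Tuple[str, bool]:
    """Alternate review and revision until no objections remain or the rounds run out."""
    for _ in range(max_rounds):
        objections = review(spec)
        if not objections:
            return spec, True  # reviewer signed off
        spec = revise(spec, objections)
    return spec, False  # hard stop: leftover nitpicks go to a human


# Stand-ins for real model calls, only here to make the sketch runnable.
def toy_review(spec: str) -> List[str]:
    return [] if "error handling" in spec else ["missing error handling"]


def toy_revise(spec: str, objections: List[str]) -> str:
    return spec + "\n- error handling"


final_spec, agreed = review_until_agreed("draft spec", toy_review, toy_revise)
```

Returning the `agreed` flag separately makes it obvious when the loop ended because of the cap rather than because the reviewer ran out of objections.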
I love these niche sites! my friend recently started this for Tinned Fish (absolutely and solely for the love of the fish and with no plans to monetize.) He loves that a few random people will rank hundreds of tins. http://tinventory.co/
i tried to stop using y'all when i got my first job at MSFT, having grown up in the South; then 10 years later I realized it's perfect for Corporate America given it's gender neutral
re: binary attestation: "Whether the server rejects that outright or just logs it is an open question"
...what we did at Snap was just wait for 8-24 hours before acting on a signal, so as not to provide an oracle to attackers. Much harder to figure out what you did that caused the system to eventually block your account if it doesn't happen in real-time.
(Snap's binary attestation is at least a decade ahead of this, fwiw)
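A rough sketch of the delayed-action idea, not Snap's system: the class name `DelayedEnforcer`, its methods, and the choice of a uniformly random delay inside the 8 to 24 hour window are all assumptions made for illustration.

```python
import heapq
import random
from typing import List, Tuple

HOUR = 3600.0


class DelayedEnforcer:
    """Queue enforcement actions so they fire hours after the triggering signal."""

    def __init__(self, rng: random.Random,
                 min_delay: float = 8 * HOUR, max_delay: float = 24 * HOUR) -> None:
        self._rng = rng
        self._min_delay = min_delay
        self._max_delay = max_delay
        self._queue: List[Tuple[float, str]] = []  # (due time in seconds, account id)

    def flag(self, account_id: str, now: float) -> float:
        """Record a signal; return the (randomised, assumed) time the action becomes due."""
        due = now + self._rng.uniform(self._min_delay, self._max_delay)
        heapq.heappush(self._queue, (due, account_id))
        return due

    def due_actions(self, now: float) -> List[str]:
        """Pop every account whose delay has elapsed."""
        ready = []
        while self._queue and self._queue[0][0] <= now:
            ready.append(heapq.heappop(self._queue)[1])
        return ready


enforcer = DelayedEnforcer(random.Random(1))
enforcer.flag("acct-1", now=0.0)
```

Because nothing happens at the moment of detection, an attacker probing the client cannot easily correlate one specific change with the eventual block.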
I'm literally guest lecturing at a Harvard class tomorrow on systemic failures in decision making, using the Columbia and Challenger disasters as case studies, and changed my slides last night to include Artemis II because it could literally happen again.
This broken safety culture has been around since the beginning of the Shuttle program.
In 1980, Gregg Easterbrook published "Goodbye, Columbia" in The Washington Monthly [1], warning that NASA's "success-oriented planning" and political pressure were creating the conditions for catastrophe. He essentially predicted Columbia's heat shield failures in the article 1 year before the first flight.
Challenger happened in 1986, and the Rogers Commission identified hierarchy, communication failures, and management overriding engineering judgment.
Then Columbia happened in 2003. The CAIB found NASA had not implemented the 1986 recommendations [2].
Now Charles Camarda (who flew the first shuttle mission after Columbia and is literally a heat shield expert!) is saying it's happening again.
> This broken safety culture has been around since the beginning of the Shuttle program.
It's broken everywhere. I have worked in some dysfunctional shops and the problem I see time and time again is the people who make it into management are often egoists who don't care about anything other than the financial compensation and clout the job title bestows upon them. That or they think management is the same as being a shotgun toting sheriff overseeing a chain gang working in the summer heat in the deep south.
I've worked with managers who would argue with you even if they knew they were wrong because they were incapable of accepting humiliation. I worked with managers who were wallflowers so afraid of confrontation or negative emotions that they covered up every issue they could in order to avoid any potential negative interaction with their superiors. One such manager was also bullied by other managers and even some employees.
A lot of it is ego along with a heavy dose of machismo depending. I've seen managers let safety go right down the tubes because "don't be such a pussy." It's a bad culture that has to go away.
A simplistic answer would be to ensure that incentives are aligned with safety and success. Then that leads to the evergreen problem of Goodhart’s Law (when a measure becomes a target, it ceases to be a good measure).
Even if it can't ever be truly fixed, at least recognizing the issues and shining daylight on decisions for some form of accountability should be a base-level approach.
Agreed. My point was that it will forever be some kind of moving target and to expect a policy framework to guarantee "good behavior" is a mistake.
I emphatically believe that understanding the incentives of all the players is paramount because that is what will ultimately determine their behavior.
It would be cool if there were a "Game Theory Toolkit" that could be plugged into an organization's communications and automate the defining and detecting of those unwanted behaviors.
Doesn't work. Remember the Titanic? Remember the British airship R101 (the "government" ship)? Both had their designers and higher ups perish in the subsequent maiden voyage disasters, right along with many/most of the innocent passengers.
> people who make it into management are often egoists
> they were incapable of accepting humiliation
I agree mostly but here is a different take on it: I think these are normal human feelings and behaviors - not the best of us, but not unusual either. If we want to get good things done, we need to work with and through human nature. Power corrupts everyone and shame is generally the most painful thing for humans.
Putting people in a position where they need to treat their power with absolute humility or accept humiliation (and a major blow to their careers) in order to do the right thing is going to fail 99% of the time. (I'm not saying people can't do those things and that we shouldn't work hard and aspire to them, but it's not going to happen reliably with any but a few people.) That expectation itself is a cultural, organizational, and managerial failure. If you see a system in which so many fail, then the problem is the system.
And when I say 'managerial' failure, I include leadership by everyone and also 'managing up'. We're all responsible for and agents of the team's results, and whatever our role we need to prevent those situations. One important tactic is to anticipate that problem and get ahead of it, putting the team in a position where the risk is proactively addressed and/or they have the flexibility to change course without 'humiliation'. We're all responsible for the team's culture.
I think many blaming others underestimate their own human nature, the effect of power on them and their willingness to endure things like humiliation. Rather than criticising others, I keep my attention on the one in the mirror and on strategies to avoid situations equally dangerous to my own character; otherwise I'll end up doing the same very human things.
EDIT: While I still agree with everything I wrote above, there is an exceptional cultural problem here, one which you'll recognize and which is common to many SV leaders, the Trump administration, and others you're familiar with (and which needs a name ...). From the document referenced in the OP by "heat shield expert and Shuttle astronaut Charles Camarda, the former Director of Engineering at Johnson Space Center."
"Instead, the meeting started with his [Jared Isaacman, the new NASA Administrator's] declaration that the decision was final. We would launch Artemis II with a crew, even though the uncrewed Artemis I mission around the Moon returned with a seriously damaged heat shield, a failure in my opinion. I was not going to be allowed to present my position on why the decision was flawed. Instead, the public would hear, through the two reporters allowed to attend, the Artemis Program narrative, only one side of the story. They would be bombarded with technical information which they would have very little time to understand ...
Jared could claim transparency because the only thermal protection expert and public dissenter, me, was present. ...
I was allowed only one-day to review some of the technical documents which were not open to the public and which were classified Controlled Unclassified Information/International Traffic and Arms Regulations (CUI/ITAR) prior to the Jan.8th meeting. ..."
> Putting people in a position where they need to treat their power with absolute humility or accept humiliation (and a major blow to their careers) in order to do the right thing is going to fail 99% of the time.
I don't know... we select those people. Usually not for their ability to treat their power with humility, though.
That's my argument in favour of quotas (e.g. for women): the way we select people in power now, we tend to have white old males who have the kind of relationship we know with power.
By deciding to select someone different (e.g. a woman), we may realise that not all humans are... well white old males. Not that we should select someone incompetent! But when we put someone in a position of power, I am convinced that many competitors are competent. We just tend to choose "the most competent" (with some definition of "the most"), which may not mean anything. For those positions, maybe it's more that either you are competent, or you are not.
Say from all the "competent" candidates, we systematically selected women for a while. We would end up with profiles that are not "white old males", and we may realise that it works just as well. Or even better. And that maybe some humans can treat power with humility.
And if that got us to accept that those are desirable traits for people in power, it may serve men as well: plenty of men are generally not selected for positions of power. Forcing us to realise this by having quotas of minorities (say women) may actually help "white old males who can treat their power with humility" get recognised eventually.
I think we already have quotas and affirmative action for white (Christian) males. Not long ago and maybe still true, more Fortune 500 CEOs were named John than were women. Though the policy is sometimes unstated (not always, especially in private, and the current administration is pretty clear about it), I think the data on the outcomes is overwhelming and undeniable.
I think also that gender or skin color doesn't make anyone more or less susceptible to these problems. We will find much better leaders by broadening our search beyond ~25-30% of the population, and we may find them better able to handle the challenges of power, but it won't be because of their gender or skin color.
I didn't mean that it was because of gender or skin colour. What I meant, really, is that we select the people who get power in some way. And then we complain that people in positions of power are like that.
My point was that there are probably a lot of "white old males" that just do not apply for positions of power, because they have learned all their life that they don't have the profile we usually select. And those may actually have qualities (like humility) that would make them better in those positions.
Now, it's difficult to say "this time, we will try to select a white old male who has humility, but first we have to convince him to apply even though he has learned his whole career that it's not worth applying". But saying "let's try to hire a woman instead" may be a proxy for that. Sure, some women can be exactly like those people we already select (maybe Margaret Thatcher had a profile of the typical "old white male" that usually got into a position of power).
But I do believe that most women or people from minorities have a profile different from the typical "old white males" who are selected. So it may be a good proxy for "trying a different profile". The idea being that by trying a different profile, we may realise that it actually makes better leaders, and eventually the white old males who do have humility may get selected as well.
I get it. There are a couple of things I think about:
First, things weren't like this even 10 years ago. Humility in power had long been a fundamental American moral before that: All are created equal, the rejection of aristocracy, and the foundation of freedom and self-determination; freedom of religion and speech - nobody else should tell someone what to say or their religion; George Washington refusing to accept more power or a third term; the humility of leaders like Lincoln and Eisenhower and King; the supremacy of civilians over the military; the early New England culture and Henry David Thoreau; the required public humility of almost every president before Trump - nobody talked or behaved like him. I read a ~10 year old New Yorker article recently about the public humility of many Wall Street leaders in the 1980s, at least, who wore more modest clothes, built their houses with low fences, etc. The pioneers of the Internet who believed in openness and end-user control. I read something old about SV - from the early 2000s I think - a conference of CEOs, etc, and someone asked who flew in their own jet; the speaker remarked how embarrassed many were to raise their hands.
The good news is, that moral existed for centuries and is part of the American fabric. We just need to be reminded of who we are and of what really made America great. (Yes, there were endless exceptions to it - in every person is good and bad, pride and humility - but today narcissism is embraced.)
> I didn't mean that it was because of gender or skin colour.
> I do believe that most women or people from minorities have a profile different from the typical "old white males" who are selected.
I don't see how those things reconcile. I think people in each group are, on average, just as likely to be corrupted by power, etc.
It's the power that does it; it's the most powerful drug, the Ring - until they have power, you don't know reliably how they will respond. Fewer non-hetero-males and minorities have power, so it may seem like they aren't corrupted by it.
'If you want to see someone's character, don't give them hardship, give them power.' The American elite are failing the country and the world.
For one, and once again: those people spend most of their life knowing they won't access a position of power. White males who don't have the profile of the "dominant white males" are in a different position: they don't grow up knowing it, they have to realise eventually that they are just not the kind of white males who get power. And if they do, the risk is that they fall back on a whole life in a society that did not actively tell them that it wasn't their place, so that's still different from women or minorities.
> Humility in power had long been a fundamental American moral before that
It's not only humility, I thought we were using it as a way to say "the qualities that would make a great leader for the people".
And #metoo showed us pretty clearly that the white males in power decades ago were so often abusive that the only thing we can say is "well but it was a different time".
Today, if I look around me, those who get in positions of power are more often than not toxic. What they are good at is winning against their competitors, not building much. Once they have the power, they can attribute to themselves whatever was built by the people "below" them.
Sounds weird to me. Is it a "just saying - my observation" kind of take, or is it statistics? It cannot be both, can it?
Obviously we don't have any statistics about that, first because we don't have any measure that can say who the "most competent" is when it comes to "being in a position of power". The only measure we have is that the person in power was competent enough to get in power, which doesn't mean they are not toxic (very often, they are).
> Sounds weird to me. Is it a "just saying - my observation" kind of take, or is it statistics? It cannot be both, can it?
No, these are not official statistics. It's personal observation, just like I can conclude that statistically most men are taller than women based on personal observation. (You can remove the word "statistically" if I sound confusing.)
(At this point I generally like to ask the person I am conversing with: what is your real world experience with complex technical projects? Or, alternatively, do you exchange notes with people who manage complex technical projects?)
You say that "white old males" are "statistically the most competent" (without statistics, just "because you see it"), and also "most humble".
And you seem to genuinely believe it. Well I don't, at all.
> what is your real world experience with complex technical projects?
My real world experience with complex technical projects shows completely different "gut feeling statistics". My real world experience with complex technical projects is that those white males who are particularly good at getting in positions of power are generally incompetent at doing anything other than getting in positions of power. More: they are often counter-productive, at least regularly toxic, sometimes downright dangerous. And they systematically believe that they are good people and that everybody loves them, even though my experience being part of "the people" is that it's usually very, very wrong.
So that makes at least one point where they are statistically (from what I see, no actual statistics) incompetent: they don't realise that what they see reflects their position of power (people act as if they respected them) and not reality (people act completely differently when they are safe to do so, e.g. when drinking beers in a safe environment).
> Calling a system problematic is, essentially saying no one is responsible.
That's a great and essential point.
I think if we deal with reality, the correlation between system and human behavior is inescapable. And of course leaders and managers have a strong influence on the people they lead/manage (and vice versa to a lesser degree), and peers have a strong influence on each other. Otherwise, leaders might as well not exist. We are social creatures.
At the same time, each of us is fully responsible for what we individually do.
It can be a hard circle to square, and it becomes a vivid issue at times: if the general orders something immoral or illegal, the colonel passes the order to the captain, the sergeant takes a squad to do it, and the private carries it out, who is responsible and how much?
All of them are responsible, of course. But how much? Do we hold the 18 year old private as responsible as their officer, the captain? Do we hold the young officer as responsible as a senior one?
My point is, that for the private, we do offload some responsibility to the system. For the general, much less so. (Or we should; often the general and others use their influence to get out of it and the captain or private is blamed.)
The most frustrating part of the whole thing is that when you read Charles Camarda’s thoughts after his meeting with NASA in January, it could have been written in 1986 or in 2003.
It’s pretty clear at this point that the shuttle was already broken at design. But seeing the same powder keg of safety/budget/immovable time constraints applied to a totally different platform decades in the future feels like sitting through a bad movie for the third time.
What strikes me is not the systemic failures, but the intense culture of secrecy.
Reports are heavily redacted. They aren't shared. Failures aren't acknowledged. Engineering models aren't released. That secrecy eventually causes what we see today.
It’s fundamentally a human coordination problem that cannot be solved
As an organization gets more populated and complex, it becomes impossible to maintain a singular value vector (get these people around the moon safely)
Everyone finds meta vectors (keep my job, reduce my own accountability) that maintain their own individual stability, such that if the whole thing fails they won’t feel liable
It can't be solved 100%, but it can be _mostly_ solved with systemic buy-in to the safety culture. Commercial aviation is a great example IMO.
We've spent the last several decades making sure that every single person trained to participate in commercial aviation (maintenance, pilots, attendants, ATC, ground crew) knows their role in the safety culture, and that each of them not only has the power but the _responsibility_ to act to prevent possible accidents.
The Swiss Cheese Model [1] does a great job of illustrating this principle and imparting the importance of each person's role in safety culture.
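To make the Swiss Cheese idea concrete, here is a toy sketch. The miss rates are made-up numbers for illustration, not real aviation data, and it assumes the layers fail independently, which real organizations often violate (one budget cut can punch holes in several layers at once).

```python
from math import prod

def breach_probability(miss_rates):
    """Chance a hazard slips through every layer, assuming independent layers."""
    return prod(miss_rates)

# Hypothetical miss rates for four defensive layers (illustrative only).
layers = [0.1, 0.1, 0.1, 0.1]

all_layers = breach_probability(layers)
one_layer_removed = breach_probability(layers[:-1])

print(f"four layers:  {all_layers:.4f}")
print(f"three layers: {one_layer_removed:.4f}")
```

Under these assumptions, dropping a single layer makes a breach ten times more likely, which is why each person's slice matters even when the other slices look solid.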
A big missing piece with manned space flight IMO is the lack of decision-making authority granted to lower staff. A junior pilot acting as first officer on their very first commercial flight with real passengers has the authority to call a go-around even if a seasoned Captain is flying the plane. AFAIK no such 'anyone can call a no-go' exists within NASA.
Safety culture requires the ability to learn from mistakes, the capability to ground planes (without that turning into a political problem), and someone to foot the bill. (Which did not always happen: see Boeing's MCAS, with a SPoF AoA sensor and no retraining, the result of a chain of cost-cutting decisions. And of course there were the usual problems with market-distorting subsidies to both Boeing and Airbus.)
NASA's missions are way too big, because the science payloads are unique, so they "can't do" launch early, launch often. And then things sit in storage for years, waiting for budget. (And manned flights are in an even worse situation of course, because they are two-way.)
And there's too much sequential dependency in the marquee projects (without enough slack to be able to absorb problems if some earlier dependent outcome is unfavorable), or in other words because of time and cost constraints the projects did not include enough proper development, testing, verification.
NASA is doing too many things, and too much of it is politics. It should be more like a grant organization, rewarding cost-efficient scientific (and engineering) progress, in a specific broad area ("spaaace!"), like the NIH (but hopefully not like the NIH).
But SpaceX launches manned missions, with a perfect safety record so far, plus a fantastic success rate for their unmanned Falcon flights. They "launch early, launch often" for their test flights.
The main reason NASA can't do that with Artemis is that every SLS launch costs at least $2 billion.
> without enough slack to be able to absorb problems if some earlier dependent outcome is unfavorable
It's strange because unmanned missions are heavily in "under promise and over deliver" territory. They may say something like "we are sending a car to Mars for a month", but everything is over-engineered to last for a year. Then it miraculously works for eleven months and it's a huge success.
I guess the conclusion is that the manned missions since the Moon landing were for Cold War reasons. (With that kind of mentality.) And when that ended they made even less sense.
For example when they had to go up to refill the wiper fluid on the Hubble in '93 it was no biggie, because as shitty as the shuttle was, it was at least reuse-minded, and there were regular missions (and budget for that). The ISS assembly coasted on the Clinton era budget surplus, but then it was evident that prancing in LEO is great for hijacking Soviet satellites, but not much else.
And compared to the Hubble the JWST was a classic Eminem mission (one shot, one opportunity ... no, wait! that's on Mars!), even if it took 5-10 more years than planned, it seems it was completely worth it.
No, CRM is a disaster; you clearly are not in aviation. The reliability in aviation came from incredibly strict regulation and engineering improvements, NOT from structural alignment of parties. They were forced to get safer by the government, if you can believe there was a time when the government did anything useful at all.
I could go off for literally hours on this topic, but suffice to say I’ve done an unbelievable amount of CRM as an officer in the United States Air Force who flew on and executed 100s of combat missions in Iraq.
My friends from Shell 77 are all dead because of CRM failures
Reagan speaks with grandfatherly warmth about the importance of finding a middle ground between reasonable safety regulations and progress. In the same clip, he mentioned not knowing of any group with as little influence on politics as business.
Dog convinces owner to let it off leash. The rhetoric that charmed Americans into letting down their guard, in miniature.
Yes and... NASA space programs (doing rare, unknown things) are different than commercial aviation (doing a frequent, known thing with high safety). Best be careful applying solutions from the latter to the former.
Layering additional safety layers on top of a fundamentally misaligned organization process also generally balloons costs and delivery timelines (see: NASA).
The smarter play is to better align all stakeholders' incentives, from the top (including the president and Congress) to the bottom, to the desired outcome.
Right now most parties are working towards very different goals.
100% agree, and I definitely see this in the tech industry, and it all begins and ends with psychological safety. Right now there’s job pressure in tech, which creates this toxic sense coming from management that they can fire anyone at any time because they don’t like you. It essentially fosters a culture of not rocking the boat or “pissing off the wrong person.” The result is, you keep your mouth shut or significantly risk being penalized on your annual performance review. Add inflation and the ever-rising cost of living. For an individual contributor or even front line management, the choice is very clear. This is obviously a recipe for catastrophe when you’re dealing with human lives.
When you’re a rocket scientist at NASA, you also have relatively few alternatives other than SpaceX or Boeing.
The problem is that it is treated as if people's jobs are to do X. What happens when someone says given the problem constraints it's impossible to safely do X? The naysayers get replaced, X gets done anyway.
Actual safety only comes when there is an external agency who monitors safety and accomplishing X is not part of their objective.
But in the 80s I guess there was pressure to one-up the Soviets, so everything had to be done yesterday. Artemis, on the other hand, has existed for most of my adult life at various levels of maturity (Orion and its predecessors certainly did), and by now it has taken more time than passed between that famous Kennedy speech and the actual Moon landing (where there was apparently no issue with safety culture).
Considering how much humanity has allegedly advanced since then, I don't understand what we are gaining that's caused us to have to abandon safety.
Although not especially “current,” Normal Accidents: Living with High-Risk Technologies is a 1984 book by Yale sociologist Charles Perrow, which analyses complex systems from a sociological perspective. Perrow argues that multiple and unexpected failures are built into society's complex and tightly coupled systems, and that accidents are unavoidable and cannot be designed around. Several historical disasters are analysed. I read a newer edition published in 1999, and the author had added a chapter on Chernobyl, which turned out to be a textbook example of some of Perrow’s theory, in particular that adding fail-safes also adds complexity, thus not necessarily making for any more safety. (The Chernobyl disaster was precipitated at least in part because they were on a tight schedule to test a fail-safe system.) The book is fascinating and a good page turner, hard to put down.
Perrow’s book is best combined with a reading of The Doomsday Machine: Confessions of a Nuclear War Planner, by Daniel Ellsberg.
I'm a retired neurosurgical anesthesiologist (38 years in practice). I read Perrow's book several years after it was published. I was struck by how relevant his points of failure were to the practice of anesthesiology, the concept of the danger of tight coupling. I referred to this book over subsequent decades in my presentations on Grand Rounds, but to my knowledge none of the residents or other attendings ever read it.
Other books I’ve much enjoyed, if your interest is in structural or other failures:
Why Buildings Fall Down: How Structures Fail by Matthys Levy and Mario Salvadori, a wide ranging history of structural failures of various kinds, and their causes.
Ignition!: An Informal History of Liquid Rocket Propellants by John Drury Clark, which is a personal memoir from a senior researcher with many decades experience developing rocket fuels - he is the proverbial Rocket Scientist. Most interesting, and amusing (in a morbid way), is the quite different culture of safety “back in the day” of this somewhat esoteric engineering/chemistry field.
Sure. Even a history of safety success contributes to this. We haven't had an accident in 3000 days, what was dangerous about this job again? Also what's this stupid policy for anyway, I've never seen anybody even come close to (non-dangerous-sounding fate) while working here.
But probably the policy is in place because it used to happen before the policy was in place. It's just not obvious to people who have never seen the consequences before.
Complacency kills! It's why it's usually the old farmers that die in stupid ways.
I'm also reminded of the Yale machine shop safety supervisor who died by getting herself wound around a lathe spindle. Working alone, late at night on powerful rotating machinery wearing loose clothing.
> Michele Dufault, a 22-year-old physics and astronomy major from Scituate, Massachusetts, was asphyxiated after her hair caught in a lathe in a Sterling Chemistry Laboratory machine shop, where she was working by herself in violation of the existing safety rules.
Hmmm. I was positive that at the time the report came out, they said that she was actually one of the people assigned to monitor machine shop users for safety issues. Maybe that got confused with her taking the advanced safety classes.
Inviting Disaster: Lessons From the Edge of Technology was one of the texts for an aerospace class I didn't take but friends did, but honestly you can just read the book.
There are lots of frameworks for teaching safety and programs for compliance and such but they are far too easy to cargo cult if you don't appreciate safety and the need for safety culture and UNDERSTAND what failures look like.
And when you really understand the need and how significant failures happened... "state of the art" tools and practices take a back seat, they can be useful but they're just tools. What you need is people developing the appropriate vision, and with that the right things tend to follow.
The STPA and CAST handbooks are available for free from the MIT Partnership for Systems Approaches to Safety and Security website. They are phenomenal.
The word "government" doesn't magically erase all the same individual & institutional incentives, ambitions, biases, & flaws that exist elsewhere.
And sometimes, the extant magical belief that "government" is different & immune lets those same human factors be ignored until they feed bigger, slower disasters that everyone is afraid to admit, because (ostensibly) "we all did this together".
The role of for-profit companies and 'shareholder' value in explaining corporate bad behavior is highly overstated. The only profit that matters is the one at the individual level (i.e., compensation, which is a form of profit, for the individual).
A government employee or a private corporation doesn't matter. To the actual humans, they are the same, in that each provides a particular compensation, tied to their decisions.
Is "Why not pay people to do their jobs correctly?" a way of voicing frustration with massive gov't incompetence? Or a way of saying that organizational incompetence is top-down?
Both, but it's also a genuine question. If you look at the Boeing 737 MAX, for example, they clearly did it with profit in mind. I can understand the reasoning (it is still criminal and people should have gone to jail imo).
Why you would pay for something like that with tax money is beyond my understanding.
Because the grifters are running the show. The point is not to fly to orbit/moon/mars/whatever, but shovel taxpayer money to politically well connected large aerospace contractors.
Those astronauts don’t have anyone that loves them at home because no way in hell would any of my loved ones let me be a sacrificial turkey in a fully automated oven.
They do, but they are not in a position to judge. The same was true of the Challenger crew, despite NASA and astronauts saying, "we would not fly what we did not believe to be safe enough".
It seems that in modern times, humans focus on safety almost to the exclusion of everything else. As much as the more traditional salutations "godspeed" or "have a nice day", we're even more likely to hear "drive safe" or "have a safe trip" or "be safe".
We're very nearly paralyzed by insisting that everything must be maximally safe. Surely you've heard the mantra "...if it saves just one life...".
The optimal amount of tragedy is not zero. It's correct that we should accept some risk. We just need to be up-front and recognize what the safety margins really are.
> We're very nearly paralyzed by insisting that everything must be maximally safe.
Are we? People saying "have a safe trip" is pretty weak evidence.
The counter evidence is just about everything else going on, at least in the US. Relaxed worker safety standards, weakened environmental protections, and generally moving as fast as possible.
We have 4 kids. Before we had our 3rd, we needed to buy a new vehicle solely because we couldn't fit 3 car seats into the back of our old car. And when traveling with kids, carrying 4 gigantic car seats plus your other luggage is not exactly as easy as you might think! It essentially rules out solo parent travel with all 4 kids. Transferring car seats between two cars, or installing car seats in a taxi, is a serious pain.
Furthermore, the evidence that car seats actually benefit safety is significantly less robust than you might think. The "mountains of evidence" that do exist for things like 70% reductions in fatalities, bizarrely enough, generally compare the rate of fatalities for car seats vs completely unrestrained kids. When you compare the rate of fatalities in car seats to kids wearing adult seat belts, the bulk of the evidence suggests essentially no difference. Fatalities happen when the forces involved are catastrophic and sadly a car seat doesn't help much for kids over 2.
Even a back-of-the-envelope comparison makes this extremely plausible: car crash fatalities for kids 9-12 declined by 72% from 1978 to 2017. If car seats and car seat laws save significant numbers of lives, you'd expect the fatality rate for kids 0-8, who are generally in car seats, to have decreased much more. But it hasn't; it declined by 73% over the same period.
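Spelling that arithmetic out, as a rough sketch only: it uses just the two percentages quoted above and treats the 9-12 group as a crude control for everything other than car seats, which is a strong assumption (it ignores differences in exposure, vehicle mix, and so on).

```python
# Declines in car crash fatalities, 1978-2017, as quoted in the comment above.
decline_9_12 = 0.72  # kids 9-12, mostly past car seat age
decline_0_8 = 0.73   # kids 0-8, generally in car seats

# Fraction of the 1978 fatality level still remaining in 2017.
remaining_9_12 = 1 - decline_9_12
remaining_0_8 = 1 - decline_0_8

# Extra decline for the car seat age group, in percentage points.
gap_points = (decline_0_8 - decline_9_12) * 100

# Reduction for 0-8 relative to what the 9-12 trend alone would predict
# (assumes the older group is a valid control, which is only a rough guess).
relative_extra = 1 - remaining_0_8 / remaining_9_12

print(f"gap: {gap_points:.1f} percentage points")
print(f"relative extra reduction: {relative_extra:.1%}")
```

On these numbers the car seat age group did better by about one percentage point, or roughly 3.6% relative to the older kids' trend, which is nowhere near the 70% figure that gets quoted.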
Now, car seats and boosters do seem to moderately reduce non-fatal injuries - huge spread of estimates here, most clustering around 10-25%. It's reasonable for most people to use car seats or boosters most of the time based on this alone, IMO, especially for young kids. But do they justify a mandate? IMO: no. Absolutely not.
Worth mentioning that mandates probably do succeed in one thing: they reduce the number of children born at all by at least 57x more than they prevent child fatalities. Roughly 8,000 kids per year, 145,000 kids since 1980. That's with the (unlikely, as discussed above) assumption that car seats do in fact save significant numbers of lives. But it's also entirely possible that they've prevented hundreds of thousands of kids from being born, somewhat reduced the nonfatal injury rate, and saved essentially no lives.
- https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3665046 (car seat mandates "led to a permanent reduction of approximately 8,000 births in the same year, and 145,000 fewer births since 1980, with 90% of this decline being since 2000")
Note that both the 45% and 59% estimate for injury reduction and the 28% estimate for fatality reduction all come from one research group using a proprietary data set. Everything that's independently reproducible points towards small or zero effect on fatalities and modest effects on injuries.
Yep. As I understand it, there's no statistical difference between a car seat and a seat belt over age two. We've known this for a very long time, too. But it's easy to make an emotional appeal to do something to make the world safer for children, even if what you're doing doesn't make any difference.
In the hypothetical scenario where car seats have only downsides, then of course I’m against a mandate.
There is a difference between cherry picking studies that back up your view point and how medical experts set policy though.
Experts review all of the data, and ignore outliers like a paper published in a law journal that suggests car seats are the primary reason families have shrunk from having three to two kids since the 80’s
You’ve funnily proven the point of how willing people are to put immense burdens on others in the name of safety.
There is a non-zero amount of deaths the car seat law would prevent. The burden will discourage larger families and will contribute to population decline far larger than the lives saved.
You’re not only arguing for it, you’re doing it in a way as if preventing death is such an obvious single dimension to optimize that you’re calling people irrational because they are against something that reduces fatalities.
Your same argument is what leads to prohibition and a long list of other things that suck the color out of life in the interest of “safety”.
In this conversation, you have repeatedly referred to "all of the data" and "mountains of data," yet you have posted none. Meanwhile I have posted every major study on both sides of the debate! Your argument seems to be that:
- the experts have told people to use car seats
- experts wisely base policy on "all of the data"
- therefore, "all of the data" must support the claim that car seats save lives
If we're going to discuss the question of whether experts have set policy well or poorly in a particular case, then such a strong prior on "experts always set policy well and based on the best available evidence" kind of assumes the conclusion, doesn't it?
Experts almost always set policy better than non-experts doing their own research. Especially on complex topics.
There is no point in two amateurs arguing over a topic they don't understand.
All I can do is refer to the publicly available reasoning and studies of experts, which have evidence and conclusions opposite of the amateur conclusion above.
Child car seat regulations are state laws passed by state politicians. They are not experts in any sense of the word, and generally don't bother with evidence or studies when creating said laws.
Risk tolerance is a value judgment, not an empirical fact waiting to be discovered.
Competent experts could tell you how much safer you would be if you wore a helmet to drive your car. They can't tell you how much you should value that extra bit of safety.
Personal risk is a value judgement. The government steps in when your decisions impact the life of an unconsenting third party, like a child.
Can you point to a single competent expert who recommends the average driver should wear a helmet while driving? Again, there is a difference between one study showing helmets reduce injury in crashes, and an expert reviewing the problem as a whole and making a recommendation.
The evidence that car seats save lives is significantly weaker than you probably believe, as I detailed in another comment in this thread. But look: even if car seats make sense for a typical 5 year old on a typical drive in their typical car (which is a higher evidentiary burden than you might think), a mandate imposes a huge logistical tax that makes many normal things completely infeasible or impractical:
- travel with many kids (nope, physically can't carry 4 car seats plus luggage)
- using a taxi, e.g. to go see a movie (nope, can't carry a car seat into the theater)
- carpooling with other families (I'll drive them, you pick up? Nope, we'd have to shuffle car seats around.)
- rides with grandparents or other family members (sorry, we'd have to deliver the car seat to them first)
- splitting kids between two vehicles for errands (let's spend 10m wrestling car seats from one car to the other first)
The whole texture of independent childhood is altered by car seat mandates! Everything gets filtered through "is there a car seat available?". If you haven't experienced this, it's hard to describe - and I think it's absolutely a case where tradeoffs like "how will this affect quality of life?" are completely overridden because "well, if it just saves one life..."
> Car seats and booster seats significantly reduce the risk of fatal injury in crashes by 71% for infants and 54% for toddlers (1-4 years old), saving over 11,000 lives in the US since 1975
> Booster seats reduce the risk of serious injury for children aged 4-8 by 45% compared to seatbelts alone.
It's from the AI summary because it was the most quotable but the articles I found say pretty much the same thing. Seems pretty solid to me.
> If you haven't experienced this, it's hard to describe - and I think it's absolutely a case where tradeoffs like "how will this affect quality of life?" are completely overridden because "well, if it just saves one life..."
If you haven't experienced your children dying unnecessarily because it was inconvenient to make them safe, it's hard to describe.
What articles did you find, exactly? What primary evidence are they basing their claims on? Many of the numbers you'll find with a google search are unclear about what they're comparing to - I believe both of the fatality numbers above (71% and 54%) are relative to completely unrestrained kids, which is not the relevant comparison.
The 45% number I specifically discuss in the other comment, but every independently reproducible study using publicly available data has found much smaller effects, around 10-25% for minor injuries and no statistically significant difference in severe injuries.
To be clear, I'm not saying "don't use car seats," I'm saying that the evidence doesn't support mandating them through age 8 (or 12!).
Our kids would be much safer if we drove everywhere at 15mph - less convenient, but it would prevent many unnecessary deaths. Unfortunately, it is impossible to do anything in the world without risk. So we're forced to balance convenience against safety every day, whether we want to admit it to ourselves or not.
It notes this, which might be pertinent to your comment regarding how the overall statistics don't show the trends you expect:
> A NHTSA study found that while most parents and caregivers believe they know how to correctly install their car seats, about half (46%) have installed their child’s car seat incorrectly.
> Children in booster seats in the back seat are 45% less likely to be injured in a crash than children *using a seat belt alone*.
That's about as much effort as I'm willing to put into this conversation. I'll finish off by saying I'm not American and these rules exist outside the US as well - I have a hard time believing so many countries would separately implement this (or similar) mandate if it was as unfounded as you claim.
But sure everything would be better if any moron was allowed to decide how to keep their own kids safe.
Yes, I think that we'd all be better off if every person was allowed to have their own personal values, deciding what's more important to themSELVES, rather than piling on and trying to force every one into a one-size-fits-all solution.
For my part, I'd much rather have people wishing me "have a rich and fulfilling life" rather than "be timid and careful to maximize your time even if it's boring and unrewarding".
Sure, you can disagree with my priorities, but that's the whole point. We should each be able to have our own priorities.
Do you think it’s okay for people to indoctrinate their own children with religion and other political views?
Far more harm comes from that than is eliminated by mandating car seats for ages 8 to 12, which only addresses tail risk.
Would you be willing to make all new parents submit to frequent breathalyzers during pregnancy and after birth? Drinking is a massive factor in infant mortality at birth and SIDS.
I don't see a reasonable way to avoid parents imposing their beliefs on their kids so this point you're trying to make is pretty weak man. You're comparing a problem with a very clear solution vs a problem with no clear solution. You wanna take the kids away from all religious people and all people with differing political views? Good luck with that.
Should all parents submit to frequent breathalyzers? Tell me, how many parents, as a fraction of all parents, drink irresponsibly to the point where it significantly endangers their children?
Now compare that number to the fraction of parents who drive their kids around in cars. You're grasping at straws comparing apples and oranges.
>Tell me, how many parents, as a fraction of all parents, drink irresponsibly to the point where it significantly endangers their children?
More than the ones that get harmed between 8 and 12 in cars.
Your whole argument seems to be “it would be okay to take kids from their parents if the enforcement was easy”. It’s clear that you’re in the camp of sacrificing pretty much any liberty in the name of safety.
Holy crap lol, in what universe is that my argument? That was my response to your posed problem. You suggested that it was bad that parents indoctrinate their children - I didn't say we should take their children away I pointed out that taking their children away is the only way to stop them from indoctrinating them. I made it extremely clear that I did not consider this a viable option lol.
Holy crap man how braindead are you? I'm out of this conversation there's clearly no point talking to you. Wow.
The evidence on car seats is extremely weak and they prevent only a handful of injuries. You'd be better off redesigning roads or having more collision protection systems in cars. As self-driving cars get better to the point where they can communicate and eliminate many human errors, there's probably no need for car seats at all. In many situations they make things more dangerous, not less.
Every safety measure faces a question of whether the resources allocated to it are an efficient means of achieving that reduction in risk.
To GP's point, we probably can't prevent people from crashing altogether, but we currently have a road system designed to sacrifice safety on the altar of throughput [0]. How many more or fewer kids (or just people) would die if governments allocated the resources to making roads safer that they currently mandate their citizens use on car seats?
> I don't need a guard on my table saw if I don't stick my thumb in it. Don't need a helmet if I don't fall off of my bike.
Do you think the guard on your table saw makes you safer than training and experience using the saw safely? There are always limited resources and multiple routes to safety, so we shouldn't assume any given safety measure is the best use of those resources (especially in consideration of second-order effects).
Thanks to risk compensation, making things "safer" doesn't necessarily improve safety. What are the odds that people drive their kids around more (increasing their risk) because having kids in car-seats reduces the perceived risk? How many of those people do you think can point at what the reduction in risk due to car seat use is [0], such that they compensate that risk "rationally"?
[0] Hint: As our sibling conversation shows, that's a non-trivial question.
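The risk compensation argument above can be made concrete with a little arithmetic. This is a minimal sketch under a deliberately simple assumption that total risk is just (number of trips) x (risk per trip); the function names and the 20% figure are hypothetical illustrations, not estimates from any study.

```python
def net_risk_ratio(per_trip_reduction: float, extra_trips: float) -> float:
    """Total risk with the safety measure, relative to total risk without it.

    per_trip_reduction: fractional drop in risk per trip (0.2 means 20% safer).
    extra_trips: fractional increase in trips taken because driving feels safer.
    Assumes total risk = trips * risk per trip (an illustrative simplification).
    """
    return (1.0 - per_trip_reduction) * (1.0 + extra_trips)


def break_even_extra_trips(per_trip_reduction: float) -> float:
    """Increase in trips that exactly cancels the per-trip safety gain."""
    return per_trip_reduction / (1.0 - per_trip_reduction)


# Hypothetical numbers: a measure that makes each trip 20% safer is fully
# offset if people respond by driving 25% more.
reduction = 0.20
print(break_even_extra_trips(reduction))   # extra driving needed to cancel the gain
print(net_risk_ratio(reduction, 0.10))     # below 1.0: still a net improvement
print(net_risk_ratio(reduction, 0.40))     # above 1.0: net risk went up
```

Under this toy model, whether a measure improves safety overall depends on how much behavior shifts in response, which is exactly the quantity most people cannot estimate.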
Considering that driving (at least in the US) is a relatively unsafe means of travel compared to the alternatives, I can understand imploring someone to drive safe.
Our internal emotional thinking doesn't work very well with probabilities, so trying to reduce a probability to zero, even when doing so is completely irrational, is a very common fallacy.
I feel like all the responses to your comment sort of prove its point.
As I was reading the post I was wondering along the same lines: is this different from before? Going to space is an inherently risky activity. It's always going to be easy to write the "this is not safe" think piece, where you can either say "I told you so" or "Whew, thankfully we made it this time!" afterwards. Things like this only happen when you accept some risk and people say "yes" and press forward.
All that said, not all risk is equal, and I'm trying to understand if NASA is uniquely dysfunctional now and taking needless, incidental risks.
America has been craving safety since 9/11, and it has made cowards of everybody, so in some sense I would agree.
But taking a risk regarding an unknown or to expand knowledge or actually accomplish something is one thing. Ignoring known and mitigable risks just to save money, save face, meet a deadline or please a bureaucrat is another.
Anyway these clowns even fail your criterion, because by covering up the results of the first launch/experiment, they are not being up front about a risk.
In my opinion this is a top-down, human hierarchy thing. CEOs and agency administrators create and set an organization's culture and expectations.
The irony is that a faulty heat shield is an engineering challenge that real engineers would love to tackle; all you have to do is turn them loose on the problem, let them fix it. They live for that. I find it actually aesthetically offensive that the organization and its culture has instead taught them venal, circumspect careerism, which is cowardice of a different kind.
Maybe not so much "oblivious to safety" as "oblivious to probable risk." We worry too much about low risk events (like airline flights) and don't worry enough about higher risk events (like trips-and-falls, driving a car, poor diet...)
I wouldn’t say humans are oblivious to safety. The Apollo program was very successful as long as you’re not related to Gus Grissom, Ed White or Roger Chaffee. But those three (preventable) deaths aside, Apollo was quite successful and figured out some huge problems.
If you’re interested in a heck of a good read, the Columbia Accident Investigation Report is a good place to start:
It looks at the safety culture in NASA and at how that safety culture ran into budget issues, time pressure and a culture of ‘it’s always been okay’. But people were aware of the problems.
There’s a really frustrating example from Columbia where engineers on the ground badly wanted to inspect the shuttle’s left wing using ground-based telescopes or any other available assets. There’s footage available, and an email circulated in which an engineer all but begged anyone to take a look with anything. That request was not approved - they never looked.
Realistically there’s a point to be made that NASA wasn’t capable of saving those astronauts at that point. But they had a shuttle almost ready to go; they could have jettisoned its science load and possibly had a rescue of some sort available. They never looked, though, even as alarm bells were ringing.
It’s more accurate to say people are highly aware of safety but when you get a bunch of us together, add in cognitive biases and promotion bands we can get stuck in unsafe ruts.
I'd say it's more accurate to say the people who are actually smart work as engineers. Leadership is generally engineers who were better at office politics than engineering, or just business majors etc.
So you have a group of really talented people using their talents to do awesome things, and then you have some useless idiots who are good at kissing the right asses, running the show and taking most of the credit. And that's how you end up killing astronauts, because the useless assholes in charge aren't even competent enough to recognize when they should listen to the brains of their operation. All they care about is looking good to their superiors and hitting some arbitrary deadline they've decided to set for no damn reason etc.
If you're looking for programs where mistakes were not made, Apollo is not the program to choose. I highly recommend visiting Kennedy Space Center some time where they go in-depth on how close it came to never happening after Apollo I. https://en.wikipedia.org/wiki/Apollo_1
That being said, I'm a big proponent of "you can't make ICBMs carrying humans 100% safe", but you sure can try your best.
We humans do have difficulty with safety. Sometimes we are able to overcome that problem to an extent. Here are some of the few examples where humans have done well with safety: FAA commercial airlines, Soyuz, ISS, Shinkansen trains, US nuclear power post Three Mile Island, vaccines, and the Falcon 9.
i built my own version of this called 'threethings' (per pmarca's essay on the subject of personal productivity). i gave an ec2 claude instance access to a folder that is synced with gdrive so it's easy to get local files to the instance, and gsuite access. i had claude build a flutter app one hour when i couldn't sleep, and gave it a telegram bot account. i talk to it via telegram and it keeps tabs on personal and work emails. it does 'deep work' late at night and sends me a 7am summary of my day. my wife is asking for it now, because it will notice urgent emails first thing in the morning and alert me.
i don't have time to open source it, but it's low key revolutionary having a pretty smart AI looking at my life every day and helping me track the three most important things to do.
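For anyone curious what the "three things" step might look like in code, here is a minimal sketch. It is not the actual implementation (which isn't public): the `Item` type, the 0-10 `urgency` score, and the summary format are all hypothetical, and the sketch assumes something upstream (for example the model reading the inbox) has already assigned the scores.

```python
from dataclasses import dataclass
import heapq


@dataclass
class Item:
    title: str
    source: str   # hypothetical label, e.g. "work email" or "personal email"
    urgency: int  # hypothetical 0-10 score assigned upstream


def top_three(items: list[Item]) -> list[Item]:
    """Return the three most urgent items, most urgent first."""
    return heapq.nlargest(3, items, key=lambda item: item.urgency)


def morning_summary(items: list[Item]) -> str:
    """Format the top three items as a short plain-text message."""
    lines = ["Three things for today:"]
    for n, item in enumerate(top_three(items), start=1):
        lines.append(f"{n}. {item.title} ({item.source})")
    return "\n".join(lines)


inbox = [
    Item("Renew passport", "personal email", 4),
    Item("Reply to contract redlines", "work email", 9),
    Item("Book dentist", "personal email", 2),
    Item("Prep board deck", "work email", 7),
]
print(morning_summary(inbox))
```

The resulting string is what a bot would send as the morning message; the hard part in practice is the scoring, which this sketch takes as given.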
Sitch App | Founding Product Manager | NYC-area HYBRID (2-3 days/week in NYC near Union Square) | Full-time
Sitch is a new type of concierge dating app. We're using AI + Humans to ask you questions, get to know you, then help you find love.
We're an AI powered concierge matchmaker for serious daters. Sitch is live in 5 cities and expanding nationally in 2026.
We're philosophically opposed to the dopamine-fueled mini-game doomswiping of 'traditional' dating apps which cynically monetizes desperation. We're monetizing intent!
* We're top-tier VC funded (a16z Speedrun, M13, early Snap angels)
* Looking for a PM who loves building for consumers, loves relationships, is a bit unusual, and is obsessed with building unique experiences only made possible with the new era of LLMs
* 2-3 days/week in person, in Manhattan near Union Square
How are you dodging the cross-concerns of 'people who use the app more are more likely to dump money into it'? At present, customers who pay are more desperate than customers who don't, so income is directly tied to customers who can't find a partner.
probably a longer answer is warranted, but we keep it simple - everyone pays - which essentially eliminates ghosting and self-selects for people who are serious about a search for a partner.
This is because the older films were co-written with Owen Wilson. Once they stopped collaborating, Anderson's films became unbalanced - they have the whimsical aesthetic, but are too sweet without the bitter piercing wit and clarity of Wilson's writing to make them less cloying (IMHO).
He was going through some major depression and understandably pulled back from the industry. But he brought something very personable and authentic to comedy, and his absence has been palpable.
Many other comedians of the era were too slapstick and over the top for me. I still can't watch a Will Ferrell comedy with any interest.
Owen Wilson has been a fascinating character with a unique yet consistent approach.
Ferrell... Massive comedic turn off for me. He seems like the guy that jumps into a room, interrupts and yells out a joke out of context, then keeps repeating it louder and louder until some polite fake laughter occurs. I feel bad about being this negative about a fellow human being, but his comedic approach sets off a Bully vibe / response in me in anything I've seen him in except Stranger Than Fiction.
In Paris, 90% of the metro transportation system isn't accessible to wheelchairs and strollers. Buses are overcrowded and slow. Who doesn't enjoy seeing someone cough on their newborn while fighting for space for their stroller?
Buses are perfectly accessible in Paris. They are crowded but acceptably so for a city of 10 million. It’s not fair to expect the community to accept the externalities of cars so rich people can avoid some slight discomfort.
Paris is not a city of 10 million; it has only two million inhabitants. And cars are not reserved for rich people, why would they be? I grew up in Paris, my parents weren't rich, we were living in public housing and we had a car.
Families are not second-tier citizens, and currently public transport is not suited for them. On top of the other problems, such as the pleasure of having to deal with crackheads and various homeless people in the metro when you have a baby.
When it comes to traffic and urban planning, Paris is best understood as a city of 10+ million people. The administrative subdivision called Paris has only ~2 million people, but the city doesn't end at its borders.
Yes, however there is little urban planning for the whole metro, and the administrative level we are talking about here is the intra-muros one. When the mayor decided to reduce the speed limit on the outer loop, she neither notified nor consulted the rest of the metro, for instance. And the measures discussed in the article are specific to Paris.
> Yes, however there is little urban planning for the whole metro
Paris's biggest infrastructure project of the past 20 years is called "Grand Paris" and revolves entirely around the whole metro. Actually there is literally no urban planning that doesn't involve the whole metro. And yes, lowering the speed limit involved multiple consultations with the prefect and the region because it impacts the whole metro.
Considering Paris without its metropole doesn’t make sense. Paris intra-muros is ridiculously small, one eightieth of London, 80% of San Francisco.
You can consider as much as you want, it is not unified. The result is that Paris has an anti-car policy, but the neighboring towns are very pro-car, creating a system where Parisians can't own one, but have to bear their neighbor's who use them to get into the city.
The article you quote says the opposite of what you claim. I invite you to read the paragraph "Une décision sous le contrôle de l’Etat, rappelle le ministère". It explains in detail that the mayor could do nothing without the state agreeing to it.
> You can consider as much as you want, it is not unified.
It is unified. The city and its suburbs are one economic unit. Where do you think all the service workers live? You have to be an extremely narrow-minded inner city dweller to fail to see this.
> creating a system where Parisians can't own one, but have to bear their neighbor's who use them to get into the city.
I kindly invite you to check the average salary of Paris inhabitants vs the one in the suburbs then take a minute to think about what you just wrote.
Sitch App | Engineer | NYC-area HYBRID (2 days/week in Brooklyn) | Full-time | Python + Django + Flutter + Rust
Sitch is a new type of concierge dating app. We're using AI + Humans to ask you questions, get to know you, then help you find love. We're a concierge matchmaker but we don't call ourselves that and we don't charge the obscene prices that matchmakers charge.
One of our metrics is how little time you spend in the app. We're explicitly opposed to the dopamine-fueled mini-game doomswiping of 'traditional' dating apps which cynically monetizes desperation. We're monetizing intent!
* We're top-tier VC funded, small team of 2 cofounders (Snapchat, Bumble), a CTO from Snap/Clubhouse, and 3 engineers, based in NYC.
* Stack is Django, OpenAI, Flutter, Rust
* Looking for a Mid-level Engineer who is an all-arounder. Backend is more important than front-end code, but if you can also work on client code (or want to learn), that would be ideal.
* 1-2 days/week in person, in Brooklyn, and the rest remote. We strongly prefer NYC area candidates.