You're confusing yourself w/ fancy words like "proof space". The LLM is not doing any kind of traversal in any meaningful sense of the word b/c the "proof" is often just grammatically coherent gibberish whereas an actual traversal in an actual space of proofs would never land on incorrect proofs.
My reading of their comment is that a proof space is a concept where a human guesses that a proof of some form q exists, and the AI searches a space S(q) where most points may not be valid proofs, but if a valid proof exists, it will hopefully be found.
So it is not a space of proofs in the sense that everything in a vector space is a vector. More like a space of sequences of statements, which have some particular pattern, and one of which might be a proof.
So it's not a proof space then. It's some computable graph where the edges are defined by standard autoregressive LLM single step execution & some of the vertices can be interpreted by theorem provers like Lean, Agda, Isabelle/HOL, Rocq, etc. That's still not any kind of space of proofs. Actually specifying the real logic of what is going on is much less confusing & does not lead readers astray w/ vague terms like proof spaces.
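To spell that logic out, here's a minimal sketch of the search being described, assuming two hypothetical placeholders: `llm_next_steps` for one autoregressive decoding step and `check_with_prover` for handing a candidate to something like Lean. Nothing here is a real API; it's just the structure of the graph walk.

```python
from collections import deque

def llm_next_steps(partial_script: str) -> list[str]:
    """Hypothetical placeholder: one autoregressive LLM step proposing
    candidate continuations of a partial proof script."""
    raise NotImplementedError

def check_with_prover(candidate: str) -> bool:
    """Hypothetical placeholder: ask a checker (Lean, Isabelle/HOL, Rocq, ...)
    whether the candidate actually proves the target statement."""
    raise NotImplementedError

def search(root: str, max_nodes: int = 10_000) -> str | None:
    """Walk the graph whose edges are single LLM steps. Most vertices are
    not proofs of anything; only the ones the checker accepts count."""
    frontier = deque([root])
    visited = 0
    while frontier and visited < max_nodes:
        node = frontier.popleft()
        visited += 1
        if check_with_prover(node):
            return node  # a vertex the theorem prover accepts
        frontier.extend(llm_next_steps(node))
    return None  # budget exhausted without reaching a checkable proof
```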
I still don't get how achieving 96% on some benchmark means it's a super genius but that last 4% is somehow still out of reach. The people who constantly compare robots to people should really ponder how a person who manages to achieve 90% on some advanced math benchmark still misses that last 10% somehow.
This feels like a maybe interesting position, but I don’t really follow what you mean. Is it possible to just state it directly? Asking us to ponder is sort of vague.
These math LLMs seem very different from humans. A person has a specialty. If an LLM were only as skilled as, say, a middling PhD recipient (not superhuman), but were that skilled in literally every field, maybe somebody could argue that's superhuman ("smarter" than any one human). By this standard a room full of people or an academic journal could also be seen as superhuman. Which is not unreasonable; communication is our superpower.
Yeah - it's interesting where the edge is. In theory, an LLM trained on everything should be more ready to make cross-field connections. But doing that well requires a certain kind of translation and problem-selection work which is hard even for humans. (I would even say beyond PhD level: knowing which problems are worth throwing PhD students at is the domain of professors... and many of them are bad at it, as well.)
On the human side, mathematical silos reduce our ability to notice opportunities for cross-silo applications. There should be lots of opportunity available.
LLMs are good at search, but plagiarism is not "AI".
Leonhard Euler discovered many things by simply trying proofs everyone knew were impossible at the time. Additionally, folks like Isaac Newton and Gottfried Leibniz simply invented new approaches to solve general problems.
The folks that assume LLMs are "AI"... are also biased to turn a blind eye to clear isomorphic plagiarism in the models. Note too, LLM activation capping only reduces aberrant offshoots from the expected reasoning model's behavioral vector (it can never be trusted.) Thus, they will spew nonsense when faced with an unknown domain's search space.
Most exams do not have ambiguous or unknown contexts in the answer key, and a machine should score 100% by matching documented solutions without fail. However, LLMs would also require >75% of our galaxy's energy output to reach human-level intelligence error rates in general.
YC has too many true believers with "AI" hype, and it is really disturbing. =3
In general, "any conceivable LLM" was the metric based on current energy usage trends within the known data-centers peak loads (likely much higher due to municipal NDA.) A straw-man argument on whether it is asymptotic or not is irrelevant with numbers that large. For example, 75% of a our galaxy energy output... now only needing 40% total output... does not correct a core model design problem.
LLM are not "AI", and unlikely ever will be due to that cost... but Neuromorphic computing is a more interesting area of study. =3
They target those ads by ingesting as many signals as possible from as many input devices & sensors as they can possibly convince people to use. They make a lot of money from advertising b/c they have managed to convince the greatest number of people to give them as many behavioral signals as possible & they will continue to do so. They kill products only when the signal is not valuable enough to improve their advertising business but that's clearly not the case w/ AI.
It's more intellectually lazy to think boolean logic at a sufficient scale crosses some event horizon wherein its execution on mechanical gadgets called computers somehow adds up to intelligence beyond human understanding.
It is intellectually lazy to proclaim something to be impossible in the absence of evidence or proof. In the case of the statement made here, it is provably true that Boolean logic at sufficient scale can replicate "intelligence" of any arbitrary degree. It is also easy to show that this can be perceived as an "event horizon" since the measurements of model quality that humans typically like to use are so nonlinear that they are virtually step function-like.
Doesn't seem like you have proof of anything but it does appear that you have something that is very much like religious faith in an unforeseeable inevitability. Which is fine as far as religion is concerned but it's better to not pretend it's anything other than blind faith.
But if you really do have concrete proof of something then you'll have to spell it out better & explain how exactly it adds up to intelligence of such magnitude & scope that no one can make sense of it.
> "religious faith in an unforeseeable inevitability"
For reference, I work in academia, and my job is to find theoretical limitations of neural nets. If there were so much as a modicum of evidence to support the argument that "intelligence" cannot arise from sufficiently large systems, my colleagues and I would be utterly delighted and would be all over it.
Here are a couple of standard elements without getting into details:
1. Any "intelligent" agent can be modelled as a random map from environmental input to actions.
2. Any random map can be suitably well-approximated by a generative transformer. This is the universal approximation theorem. Universal approximation does not mean that models of a given class can be trained using data to achieve an arbitrary level of accuracy, however...
3. The neural scaling laws (first empirical, now more theoretically established under NTK-type assumptions), as a refinement of the double descent curve, assert that a neural network class can get arbitrarily close to an "entropy level" given sufficient scale. This theoretical floor is far below the error rates that humans can reach. Whether "sufficiently large" is outside of the range that is physically possible is a much longer discussion, but bets are that human levels are not out of reach (I don't like this, to be clear).
4. The nonlinearity of accuracy metrics comes from the fact that they are constructed from the intersection of a large number of weakly independent events. Think the CDF of a Beta random variable with parameters tending to infinity (a toy illustration is sketched below).
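To make point 4 concrete, here's my own toy sketch with made-up numbers, not taken from any paper: if an answer only scores as correct when all n weakly independent sub-steps succeed, accuracy as a function of per-step reliability p behaves like p**n, which is nearly flat and then nearly vertical for large n.

```python
# Toy illustration of point 4: an answer counts as correct only if all n
# sub-steps succeed, so accuracy as a function of per-step reliability p
# is roughly p**n -- almost flat, then almost vertical, for large n.
for n in (1, 10, 100):
    row = {p: round(p ** n, 3) for p in (0.90, 0.95, 0.99, 0.999)}
    print(f"n = {n:>3}: {row}")
```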
Look, I understand the scepticism, but from where I am, reality isn't leaning that way at the moment. I can't afford to think it isn't possible. I don't think you should either.
As I said previously, you are welcome to believe whatever you find most profitable for your circumstances but I don't find your heuristics convincing. If you do come up with or stumble upon a concrete constructive proof that 100 trillion transistors in some suitable configuration will be sufficiently complex to be past the aforementioned event horizon then I'll admit your faith was not misplaced & I will reevaluate my reasons for remaining skeptical of Boolean arithmetic adding up to an incomprehensible kind of intelligence beyond anyone's understanding.
Which part was heuristic? This format doesn't lend itself to providing proofs, it isn't exactly a LaTeX environment. Also why does the proof need to be constructive? That seems like an arbitrarily high bar to me. It suggests that you are not even remotely open to the possibility of evidence either.
I also don't think you understand my point of view, and you mistake me for a grifter. Keeping the possibility open is not profitable for me, and it would be much more beneficial to believe what you do.
I didn't think you were a grifter but you only presented heuristics so if you have formal references then you can share them & people can decide on their own what to believe based on the evidence presented.
Fine, that's fair. I believe the statement that you made is countered by my claim, which is:
Theorem. For any tolerance epsilon > 0, there exists a transformer neural network of sufficient size that follows, up to the factor epsilon, the policy that most optimally achieves arbitrary goals in arbitrary stochastic environments.
Proof (sketch). For any stochastic environment with a given goal, there exists a model that maximizes expected return under this goal (not necessarily unique, but it exists). From Solomonoff's convergence theorem (Theorem 3.19 in [1]), Bayes-optimal predictors under the universal Kolmogorov prior converge with increasing context to this model. Consequently, there exists an agent (called the AIXI agent) that is Pareto-optimal for arbitrary goals (Theorem 5.23 in [1]). This agent is a sequence-to-sequence map with some mild regularity, and satisfies the conditions of Theorem 3 in [2]. From this universal approximation theorem (itself proven in Appendices B and C in [2]), there exists a transformer neural network of a sufficient size that replicates the AIXI agent up to the factor epsilon.
This is effectively the argument made in [3], although I'm not fond of their presentation. Now, practitioners still cry foul because existence doesn't guarantee a procedure to find this particular architecture (this is the constructive bit). This is where the neural scaling law comes in. The trick is to work with a linearization of the network, called the neural tangent kernel; its existence is guaranteed by Theorem 7.2 of [4]. The NTK predictors are also universal and are a subset of the random feature models treated in [5], which derives the neural scaling laws for these models. Extrapolating these laws out as per [6] for specific tasks shows that the "floor" is always below human error rates, but this is still empirical because it works with the ill-defined definition of superintelligence that is "better than humans in all contexts".
[1] Hutter, M. (2005). Universal artificial intelligence: Sequential decisions based on algorithmic probability. Springer Science & Business Media.
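As for what "extrapolating these laws out" looks like in practice, here's a minimal sketch with entirely made-up loss numbers: fit L(N) = a*N^(-b) + c and read off the floor c. It only shows the shape of the argument, not the actual analysis in [6].

```python
import numpy as np
from scipy.optimize import curve_fit

# Sketch of a scaling-law extrapolation: fit L(N) = a * N**(-b) + c to
# (synthetic, purely illustrative) loss measurements and read off c,
# the irreducible "floor" referred to above.
def scaling_law(N, a, b, c):
    return a * N ** (-b) + c

N = np.array([1.0, 10.0, 100.0, 1_000.0, 10_000.0])  # params, in millions (made up)
L = np.array([3.10, 2.45, 1.98, 1.62, 1.41])          # measured loss (made up)

(a, b, c), _ = curve_fit(scaling_law, N, L, p0=[2.0, 0.2, 1.0], maxfev=10_000)
print(f"fitted exponent b ~ {b:.3f}, extrapolated floor c ~ {c:.3f}")
```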
Good question. It's because we don't need to be completely optimal in practice, only epsilon-close to it. Optimality is undecidable, but epsilon-close is not, and that's what the claim says NNs can provide.
That doesn't address what I asked. The paper I linked proves undecidability for a much larger class of problems* which includes the case you're talking about of asymptotic optimality. In any case, I am certain you are unfamiliar w/ what I linked b/c I was also unaware of it until recently & was convinced by the standard arguments people use to convince themselves they can solve any & all problems w/ the proper policy optimization algorithm. Moreover, there is also the problem of catastrophic state avoidance even for asymptotically optimal agents: https://arxiv.org/abs/2006.03357v2.
* - Corollary 3.4. For any fixed ε, 0 < ε < 1, the following problem is undecidable: Given is a PFA M for which one of the two cases hold:
(1) the PFA accepts some string with probability greater than 1 − ε, or
(2) the PFA accepts no string with probability greater than ε.
Oh yes, that's one of the more recent papers from Hutter's group!
I don't believe there is a contradiction. AIXI is not computable and optimality is undecidable, this is true. "Asymptotic optimality" refers to behaviour for infinite time horizons. It does not refer to closeness to an optimal agent on a fixed time horizon. Naturally the claim that I made will break down in the infinite regime because the approximation rates do not scale with time well enough to guarantee closeness for all time under any suitable metric. Personally, I'm not interested in infinite time horizons and do not think it is an important criterion for "superintelligence" (we don't live in an infinite time horizon world after all) but that's a matter of philosophy, so feel free to disagree. I was admittedly sloppy with not explicitly stating that time horizons are considered finite, but that just comes from the choice of metric in the universal approximation which I have continued to be vague about. That also covers Corollary 3.4, which is technically infinite time horizon (if I'm not mistaken) since the length of the string can be arbitrary.
Mining rigs have a finite lifespan & the places that make them in large enough quantities will stop making new ones if a more profitable product line, e.g. AI accelerators, becomes available. I'm sure making mining rigs will remain profitable for a while longer but the memory shortages are making it obvious that most production capacity is now going towards AI data centers & if that trend continues then hashing capacity will continue diminishing b/c the electricity cost & hardware replenishment will outpace mining rewards.
Bitcoin was always a dead end. It might survive for a while longer but its demise is inevitable.
Because they encode statistical properties of the training corpus. You might not know why they work but plenty of people know why they work & understand the mechanics of approximating probability distributions w/ parametrized functions well enough to sell it as a panacea for stupidity & the path to an automated & luxurious communist utopia.
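For what it's worth, a minimal sketch of what "approximating probability distributions w/ parametrized functions" means, stripped of everything LLM-specific; the corpus is a made-up toy and the "model" is just a vector of logits trained by gradient descent on cross-entropy.

```python
import math

# Toy version of the claim above: the "model" is a set of parameters (logits)
# over a vocabulary, trained by gradient descent so that its softmax matches
# the empirical word frequencies of a (made-up) corpus. LLMs do the same
# thing conditioned on context, at vastly larger scale.
corpus = ["the"] * 50 + ["cat"] * 30 + ["sat"] * 20
vocab = sorted(set(corpus))
target = {w: corpus.count(w) / len(corpus) for w in vocab}

logits = {w: 0.0 for w in vocab}   # the parameters
lr = 0.5
for _ in range(200):
    z = sum(math.exp(v) for v in logits.values())
    probs = {w: math.exp(v) / z for w, v in logits.items()}
    for w in vocab:                # gradient of cross-entropy: probs - target
        logits[w] -= lr * (probs[w] - target[w])

z = sum(math.exp(v) for v in logits.values())
print({w: round(math.exp(v) / z, 3) for w, v in logits.items()})
```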
Yes, yes, no one understands how anything works. Calculus is magic, derivatives are pixie dust, gradient descent is some kind of alien technology. It's amazing hairless apes have managed to get this far w/ automated boolean algebra handed to us from our long forgotten godly ancestors, so on & so forth.
No, this is false. No one understands. Using big words doesn't change the fact that you cannot explain, for any given input-output pair, how the LLM arrived at the answer.
Every single academic expert who knows what they are talking about can confirm that we do not understand LLMs. We understand atoms and we know the human brain is made 100 percent out of atoms. We may know how atoms interact and bond and how a neuron works, but none of this allows us to understand the brain. In the same way, we do not understand LLMs.
Characterizing ML as some statistical approximation or best fit curve is just using an analogy to cover up something we don’t understand. Heck the human brain can practically be characterized by the same analogies. We. Do. Not. Understand. LLMs. Stop pretending that you do.
I'm not pretending. Unlike you I do not have any issues making sense of function approximation w/ gradient descent. I learned this stuff when I was an undergrad so I understand exactly what's going on. You might be confused but that's a personal problem you should work to rectify by learning the basics.
omfg the hard part of ML is proving back-propagation from first principles and that's not even that hard. Basic calculus and application of the chain rule, that's it. Anyone can understand ML; not everyone can understand something like quantum physics.
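To back that up, here's the chain-rule derivation as a runnable toy, for a one-hidden-unit network and a single made-up training example; this is roughly the whole "first principles" part.

```python
import math

# Backprop written out as the chain rule for y_hat = w2 * tanh(w1 * x),
# with squared-error loss L = (y_hat - y)**2 / 2.
x, y = 1.5, 0.8          # one made-up training example
w1, w2 = 0.3, -0.2       # initial parameters
lr = 0.1

for _ in range(200):
    h = math.tanh(w1 * x)                     # forward pass
    y_hat = w2 * h
    dL_dyhat = y_hat - y                      # dL/dy_hat
    dL_dw2 = dL_dyhat * h                     # chain rule: * dy_hat/dw2
    dL_dh = dL_dyhat * w2                     # chain rule: * dy_hat/dh
    dL_dw1 = dL_dh * (1 - h ** 2) * x         # tanh'(u) = 1 - tanh(u)**2
    w1 -= lr * dL_dw1                         # gradient descent step
    w2 -= lr * dL_dw2

print(round(w2 * math.tanh(w1 * x), 4), "vs target", y)
```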
Anyone can understand the "learning algorithm", but the sheer complexity of the output of the "learning algorithm" is way too high, such that we cannot at all characterize how an LLM arrived at the answer to even the most basic query.
This isn't just me saying this. ANYONE who knows what they are talking about knows we don't understand LLMs. Geoffrey Hinton: https://www.youtube.com/shorts/zKM-msksXq0. Geoffrey, if you are unaware, is the person who started the whole machine learning craze over a decade ago. The godfather of ML.
Understand?
There's no confusion. Just people who don't know what they are talking about (you).
I don't see how telling me I don't understand anything is going to fix your confusion. If you're confused then take it up w/ the people who keep telling you they don't know how anything works. I have no such problem so I recommend you stop projecting your confusion onto strangers in online forums.
The only thing that needs to be fixed here is your ignorance. Why so hostile? I'm helping you. You don't know what you're talking about and I have rectified that problem by passing the relevant information to you so next time you won't say things like that. You should thank me.
I don't see how you interpreted it that way so I recommend you make fewer assumptions about online content instead of asserting your interpretation as the one & only truth. It's generally better to assume as little as possible & ask for clarifications when uncertain.