Hacker News — krystofee's comments

Does anyone know when 1M context windows might arrive for at least the MAX x20 subscription for Claude Code? I would even pay x50 if it allowed that. API usage is too expensive.

I don't know when it will be included as part of the subscription in Claude Code, but at least it's a paid add-on in the MAX plan now. That's a decent alternative for situations where the extra space is valuable, especially without having to set up and maintain API billing separately.

Based on their API pricing, a 1M-context plan should be roughly 2x the price.

My bet is it's more the increased hardware demand that they don't want to deal with currently.


Funny that this page was probably written by AI, hence the "how to write modern css" is completely irrelevant.

ctrl+o ?

I disagree with the "confidence trick" framing completely. My belief in this tech isn't based on marketing hype or someone telling me it's good – it's based on cold reality of what I'm shipping daily. The productivity gains I'm seeing right now are unprecedented. Even a year ago this wouldn't have been possible, it really feels like an inflection point.

I'm seeing legitimate 10x gains because I'm not writing code anymore – I'm thinking about code and reading code. The AI facilitates both. For context: I'm maintaining a well-structured enterprise codebase (100k+ lines Django). The reality is my input is still critically valuable. My insights guide the LLM, my code review is the guardrail. The AI doesn't replace the engineer, it amplifies the intent.

Using Claude Code Opus 4.5 right now and it's insane. I love it. It's like being a writer after Gutenberg invented the printing press rather than the monk copying books by hand before it.


Even assuming all of what you said is true, none of it disproves the arguments in the article. You're talking about the technology, the article is about the marketing of the technology.

The LLM marketing exploits fear and sympathy. It pressures people into urgency. Those things can be shown and have been shown. Whether or not the actual LLM based tools genuinely help you has nothing to do with that.


The point of the article is to paint LLMs as a confidence trick, the keyword being trick. If LLMs do actually deliver very real, tangible benefits then can you say there is really a trick? If a street performer was doing the cup and ball scam, but I actually won and left with more money than I started with then I'd say that's a pretty bad trick!

Of course it is a little more nuanced than this and I would agree that some of the marketing hype around AI is overblown, but I think it is inarguable that AI can provide concrete benefits for many people.


The marketing hype is economy defining at this point, so calling it overblown is an understatement.

Simplifying the hype into 2 threads, the first is that AI is an existential risk and the second is the promise of “reliable intelligence”.

The second is the bugbear, and the analogy I use is factories and assembly lines vs power tools.

LLMs are power tools. They are being hyped as factories of thoughts.

String the right tool calls, agents, and code together and you have an assembly line that manufactures research reports, gives advice, or whatever white collar work you need. No Holidays, HR, work hours, overhead etc.

I personally want everyone who can see why this second analogy does not work, to do their part in disabusing people of this notion.

LLMs are power tools, and impressive ones at that. In the right hands, they can do much. Power tools are wildly useful. But power tools do not automatically make someone a carpenter. They don't ensure you've built a house to spec. Nor is a planer saw going to evolve into a robot.

The hype needs to be taken to task, preferably clinically, so that we know what we are working with, and can use them effectively.


> If LLMs do actually deliver very real, tangible benefits then can you say there is really a trick?

Yes, yes you can. As I’ve mentioned elsewhere on this thread:

> When a con man sells you a cheap watch at a high price, what you get is still useful—a watch that tells the time—but you were also still conned, because what you paid for is not what was advertised. You overpaid because you were tricked about what you were buying.

LLMs are being sold as miracle technology that does way more than it actually can.


And at a cost I'm not sure most fully understand. We've allowed these companies to externalise all the negative outcomes. Now we're seeing consumer electronics stock dry up, huge swaths of raw resources used, massive invasions of privacy, all so this one guy can do his corpo job 10x faster? Nah, I'm good.


A huge amount of tech is a confidence trick. Not one aimed at the under-50 crowd, but at innumerate, STEM-ignorant political leaders.

It's not LLMs they care about, it's datacenter ownership. US political norms empower owners. If you think of a DC as a megachurch and the remote users as the disciples, it makes the desired network effect obvious. That is leveraged to sway Congress and states.

These tech projects are not intended for users. They're designed to gain confidence of politicians, preferential political support.

Gen pop is not the market. DC is.

Most people's individual data-crunching problems can be resolved with a TI graphing calculator.

Big Tech convinced Congress that a culture of helpless consumers of their data center outputs is simpler and will lead humanity to a forever-growth future!... never mind that they will all be dead, unable to verify.

A con trick that worked great on older, more religious leaning Americans. One that's not working so well on the younger generation who know how these systems work.


But saying it's a confidence trick is saying it's a con. That they're trying to sell someone something that doesn't work. The OP is saying it makes them 10x more productive, so how is that a con?


The marketing says it does more than that. This isn't a problem unique to LLMs, either. We have laws about false advertising for a reason. It's going on all the time. In this case the tech is new, so the lines are blurry. But to the technically inclined, it's very obvious where they are.

LLMs are artificial, but they are not literally intelligent. Calling them "AI" is a scam. I hope it's only a matter of time until that definition is clarified and we can stop the bullshit.

The longer it goes, the worse it will be when the bubble bursts. Not to be overly dramatic, but economic downturns have real physical consequences. People somewhere will literally starve to death. That number of deaths depends on how well the marketers lied. Better lies lead to bigger bubbles, which when burst lead to more deaths. These are facts. (Just ask ChatGPT, it will surely agree with me, if it's intelligent. ;p)


How does one go about competing at the IMO without "intelligence", exactly? At a minimum it seems we are forced to admit that the machines are smarter than the test authors.


"LLM" as a marketing term seems rational. "Machine learning" also does. We can describe the technology honestly without using a science fiction lexicon. Just because a calculator can do math faster than Isaac Newton doesn't mean it's intelligent. I wouldn't expect it to invent a new way of doing math like Isaac Newton, at least.


> Just because a calculator can do math faster than Isaac Newton doesn't mean it's intelligent.

Exactly, and that's the whole point. If you lack genuine mathematical reasoning skill, a calculator won't help you at the IMO. You might as well bring a house plant or a teddy bear.

But if you bring a GPT5-class LLM, you can walk away with a gold medal without having any idea what you're doing.

Consequently, analogies involving calculators are not valid. The burden of proof rests firmly on the shoulders of those who claim that an LLM couldn't invent new mathematical techniques in response to a problem that requires it.

In fact, that appears to have just happened (https://news.ycombinator.com/item?id=46664631), where an out-of-distribution proof for an older problem was found. (Meta: also note the vehement arguments in that thread regarding whether or not someone is using an LLM to post comments. That doesn't happen without intelligence, either.)


That doesn't appear to be what happened. But the marketing sure has a lot of people working quick to presume so.

I would guess it's only a matter of days before that proof, or one very similar, is found in the training data, if that hasn't happened already, just as has been the case every time.

No fundamental change in how the LLM functions has been made that would lead us to expect otherwise.

Similar "discoveries" occurred all the time with the dawn of the internet connecting the dots on a lot of existing knowledge. Many people found that someone had already solved many problems they were working on. We used to be able to search the web, if you can believe that.

The LLMs are bringing that back in a different way. It's functional internet search with an uncanny language model, that sadly obfuscates the underlying data while making guesswork to summarize it (which makes it harder to tell which of its findings are valuable, and which are not).

It's useful for some things, but that's not remotely what intelligence is. It doesn't literally understand.

> if you bring a GPT5-class LLM, you can walk away with a gold medal without having any idea what you're doing.

My money won't be betting on your GPT5-class business advice unless you have a really good idea what you're doing.

It requires some (a lot of) intelligence and experience to usefully operate an LLM in virtually every real world scenario. Think about what that implies. (It implies that it's not by itself intelligent.)


You need to read the IMO papers, seriously. Your outlook on what happened there is grossly misinformed. No searching or tool use was involved.

You cannot bluff, trick, or "market" your way through a test like that.


I didn't say anything about cheating. In fact, if it did cheat, that would make for a much stronger argument in your favor.

If scoring highly on an exam implies intelligence then certainly I'm not intelligent and the Super Nintendo from the 90s is more sentient than myself, given I'm terrible at chess.

I personally don't agree with that definition, nor does any dictionary I'm familiar with, nor do any software engineers with whom I'm familiar, nor any LLM specialists, including the forefront developers at OpenAI, xAI, Google, etc. as far as I'm aware.

But for some reason (it's a very obvious reason $$$), marketers, against the engineers' protest, appear to be claiming otherwise.

This is what you're up against and what you'll find the courts, and lawyers, will go by when this comparison comes to a head.

In my opinion, I can't wait for this to happen.

Thrilled to know if I shouldn't wait for that. If you're directly involved with some credible research to the contrary, I would love to hear more.

But IMO, in this case at least, has nothing to do with intelligence. It's performing a search against its own training data, and piecing together a response in line with that data, while including the context of the search term (aka the question). This is run through a series of linear regressions, and a response is produced. There is nothing really groundbreaking here, as best I can tell.


These arguments usually seem to come down to disagreements about definitions, as you suggest. You've talked about what you don't consider evidence of intelligence, but you haven't said anything about the criteria you would apply. What evidence of intelligent reasoning would change your mind?

It is unsupportable to claim that ML researchers at leading labs share your opinion. Since roughly 2022, they understand that they are working with systems capable of reasoning: https://arxiv.org/abs/2205.11916


Based on an English dictionary definition, I would expect an intelligence exhibits understanding, don't you? I would hope people are reading the dictionary before they market a multibillion dollar product set to reach the masses. It seems irresponsible not to.

The article you linked discusses reasoning. That's really cool. But consider that we can say a chess-game computer opponent is reasoning: it uses a preprogrammed set of instructions to predict some number of possible moves ahead and chooses the most reasonable one. Essentially a calculator, and yet it is, in fact, reasoning. But that doesn't have much to do with intelligence. As we read in the dictionary, intelligence implies understanding, and we certainly can't say that the Chess Master opponent from the Super Nintendo literally understands me, right?

More to the point, I don't see that any LLM has thus far exhibited remotely any inkling of understanding, nor can it. It's a linear regression calculator. Much like a lot of TI84 graphing calculators running linear algebraic functions on a grand scale. It's impressive that basic math can achieve results across word archives that sound like a person, but it's still not understanding what it outputs, and really, not what it inputs beyond graphing it algebraically either.

It doesn't literally understand. So, it is not literally intelligent, and it will require some huge breakthroughs to change that. I very much doubt that such a discovery will happen in our lifetime.

It might be more likely that the marketers will succeed in revising the dictionary. We've seen often times that if you use words wrong enough, it becomes right. But so far at least, that hasn't happened with this word.


OK, now let's talk about what it means to "understand" something.

Let's say a kid who's not unusually gifted/talented at math somehow ends up at the International Math Olympiad. Smart-enough kid, regularly gets 4.0+ grades in normal high school classes, but today Timmy got on the wrong bus. He does have a great calculator in his backpack -- heck, we'll give him a laptop with Mathematica installed -- so he figures, why not, I'll take the test and see how it goes. Spoiler: he doesn't do so well. He has the tools, but he lacks understanding of how and when to apply them.

At the same time, the kid at the next desk also doesn't understand what's going on. She's a bright kid from a talented family -- in fact Alice's old man works for OpenAI -- but she's a bit absent-minded. Alice not only took the wrong bus this morning, but she grabbed the wrong laptop on the way out the door. She shrugs, types in the problems, and copies down what she sees on the screen. She finishes up, turns in the paper, and they give her a gold medal.

My point: any definition of "understanding" you can provide is worthless unless it can somehow account for the two kids' different experiences. One of them has a calculator that does math, the other has a calculator that understands math.

> I very much doubt that such a discovery will happen in our lifetime.

So did I, and then AlphaGo happened, and IMO a few years later. At that point I realized I wasn't very good at predicting what was and was not going to be possible, so I stopped trying.


Calculators do not understand math, while both kids understand each other and the world around them. The calculator relies on an external intelligence.

Don't stop trying. Predictability is an indicator of how well a theory describes the universe. That's what science is.

The engineers have long predicted this stuff. LLM tech isn't really new. The size and speed of the machines is new. The more you understand about a topic, the better your predictions.


> The more you understand about a topic, the better your predictions.

Indeed.


I'm not sure what your level of expertise is with software but I got a lot out of some free tutorials on developing your own LLM and on ML. These are even available, free, directly from Google among many other sources.

I feel that my expectations surrounding "AI" are much more realistic than they were before building the tools.

If you haven't already, it's very much worth giving them a run through.


Exactly. It’s like if someone claimed to be selling magical fruit that cures cancer, and they’re just regular apples. Then people like your parent commenter say “that’s not a con, I eat apples and they’re both healthy and tasty”. Yes, apples do have great things about them, but not the exaggerations they were being sold as. Being conned doesn’t mean you get nothing, it means you don’t get what was advertised.


The claims being made that are cited are not really in that camp, though.

It may be extremely dangerous to release. True. Even search engines had the potential to be deemed too dangerous in the nuclear Pandora's-box arguments of modern times. Then there are high-speed phishing opportunities, etc.

It may be an essential failure to miss the boat. True. If calculators had been upgraded, produced, and disseminated at modern internet speeds, someone who did accounting by hand and refused to learn would have been fired within a few years.

Its communication builds an unhealthy relationship that is parasitic. True. But the Internet and the way content is critiqued is a source of this even if it is not intentionally added.

I don't like many of the people involved, and I don't think they will be financially successful on merit alone, given that anyone can create an LLM. But LLM technology is being sold by the same organic "con" by which all technology, such as calculators, ends up spreading for individuals to evaluate and adopt. A technology everyone is primarily brutally honest about is a technology that has died, because no one bothers to check whether the brutal honesty has anything to do with their own possible uses.


> The claims being made that are cited are not really in that camp though..

They literally are. Sam Altman has literally said multiple times this tech will cure cancer.


Such claims are not cited in the article. It may be possible to write a good article on the topic, but this article could just as well be about the organic uptake of the PC and how most wealthy nontechnical people adopted a need for a PC through "cons" that preceded their ability to get more worth than trouble out of it.


Yeah, but it should have been in the title; otherwise the article itself uses a centuries-old trick.


> The productivity gains I'm seeing right now are unprecedented.

My company just released a year-long productivity chart covering our shift to Claude Code, and overall, developer productivity has plummeted despite the self-reported productivity survey conveying developers felt it had shot through the roof.


I'd like to see a neutral productivity measure. Whether you tell me it went way up or way down, I tend to be suspicious of productivity measures actually being neutral to perception changes that affect expectations, non-paradoxical, etc.


It makes a lot of intuitive sense: people feel more productive because they're twiddling switches but they're spending so much time on tooling it doesn't actually increase output (this is more or less what the MIT study found: 20% perception of productivity, 20% lower actual output).


Sure but increased output would mean code. I don't think generating a lot of code is itself developer productivity. Some people could be using it to stop themselves from creating bad code which is developer productivity. While I find it a bit unlikely people are using it in this way (in terms of the average) I would most certainly have made this argument if code quantity was up from LLMs so I can't claim to know a quantitative measure.


My hypothesis for why our developers have reduced productivity is that LLM assisted coding has made reviews much more difficult. The words that are written are subtly more complex for a human to understand compared to what our engineers would have previously written themselves. Sort of an uncanny valley effect.

Couple that with engineers across the board mentioning that they feel like they're losing proficiency in an understanding of the codebase and where things are.


The model that does make sense to me is (and the only actual success stories I've seen) is people saying "it let me quickly produce a piece of software that otherwise wouldn't have been worth the time to create". That is definitely an increase in productivity, but "software people aren't actually willing to pay for can now be made much more cheaply" is a much different claim than the marketing is making (which I read to be TFA's point).


I don't really see the influence of things like LLMs (or StackOverflow or improved search engines) as simple productivity. People do what they can with very complex value estimates and comfort levels. If they are less productive in a careful measure it may mean they are doing a lot of high value low hanging fruit across areas they were afraid to touch.

The trouble with highly productive specialists is that they produce a ton of high-quality results where the demand is not really there and has to be artificially made. Even if you find enough work for them, it often means the incremental cases are things you wouldn't have bothered with. A specialist branching out to work slowly in progressively further related areas is a lot more valuable, and can work with an oracle so flawed that it barely beats chance.

With juniors it is much more complex, but they have always been a useless consideration in productivity. Not having them has always been highly productive in the short term but has long term consequences.


It's fine for a Django app that doesn't innovate and just follows the same patterns for the 100 solved problems that it solves.

The line becomes a lot blurrier when you work on non trivial issues.

A Django app is not particularly hard software, it's hardly software but a conduit from database to screens and vice-versa; which is basic software since the days of terminals. I'm not judging your job, if you get paid well for doing that, all power to you. I had a well paying Laravel job at some point.

What I'm raising though is the fact that AI is not that useful for applications that aren't solving what has been solved 100 times before. Maybe it will be, some day, reasoning that well that it will anticipate and solve problems that don't exist yet. But it will always be an inference on current problems solved.

Glad to hear you're enjoying it, personally, I enjoy solving problems, not the end result as much.


I think the 'novelty' goalpost is being moved here. This notion that agentic LLMs can't handle novel or non-trivial problems needs to die. They don't merely derive solutions from the training data, but synthesize a solution path based on the context that is being built up in the agentic loop. You could make up some obscure DSL whole cloth, that has therefore never been in the training data, feed it the docs and it will happily use it to create output in said DSL.

Also, almost all problems are composite problems where each part is either prior art or in itself somewhat trivial. If you can onboard the LLM onto the problem domain and help it decompose then it can tackle a whole lot more than what it has seen during pre- and post-training.
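The loop described above can be sketched in a few lines of Python. This is a rough illustration only, not any vendor's API: `call_model`, the tool names, and the `frobnicate()` DSL builtin are all made up, and the "model" is a stub standing in for a real LLM call. The point is just the cycle of propose action, execute tool, feed the observation back into context.

```python
# Minimal sketch of an agentic loop. `call_model` is a hypothetical
# stand-in for a real LLM call; a real harness would parse richer output.

def call_model(context: str) -> str:
    # Toy "model": asks to read the docs first, then answers using
    # information that only entered the context during the loop.
    if "DOCS:" not in context:
        return "TOOL read_docs"
    return "ANSWER use the frobnicate() builtin from the DSL docs"

def run_agent(task: str, tools: dict, max_steps: int = 5) -> str:
    context = f"TASK: {task}"
    for _ in range(max_steps):
        reply = call_model(context)
        if reply.startswith("ANSWER"):
            return reply.removeprefix("ANSWER ").strip()
        _, tool_name = reply.split(" ", 1)
        observation = tools[tool_name]()   # execute the requested tool
        context += f"\n{observation}"      # grow the working context
    raise RuntimeError("agent did not converge")

# A hypothetical tool exposing docs for a DSL never seen in training.
tools = {"read_docs": lambda: "DOCS: the DSL has a frobnicate() builtin"}
result = run_agent("emit code in the made-up DSL", tools)
print(result)  # the answer draws on context, not training data
```

The toy model "knows" about `frobnicate()` only after the tool call injects the docs into context, which is the parent comment's point about synthesizing from the agentic loop rather than recalling training data.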


> You could make up some obscure DSL whole cloth, that has therefore never been in the training data, feed it the docs and it will happily use it to create output in said DSL.

I have two stories, which I will attempt to tie together coherently in response.

I'm making a compiler right now. ChatGPT 4 was helpful in the early phases. Even back then, its capabilities with reading and modifying the grammar and writing boilerplate for a parser was a real surprise. Today 5.2-Codex is iterating on the implementation and specification as I extend the language and fill in gaps in the compiler.

Don't get me wrong, it isn't a "10x" productivity gain. Not even close. And the model makes decisions that I would not. I spent the last few days completely rewriting the module system that it spit out in an hour. Yeah, it worked, but it's not what I wanted. The downsides are circumstantial.

25 years ago, I was involved in a group whose shared hobby was "ROM hacking". In other words, unofficial modification of classic NES and SNES games. There was a running joke in our group that went something like this: Someone would join IRC and ask for an outlandish feature in some level editor that seemed hopelessly impossible at the time. Like generating a new level with new graphics.

We would extrapolate the request to adding a button labeled "Do My Hack For Me". Good times! Now this feature request seems within reach. It may forever be a pipe dream, who knows. But it went from "unequivocally impossible" to "ya know, with the right foundation and guidance, that might just be crazy enough to work!" Almost entirely all within the last 10 years.

I think the novelty or creativity criticism of AI is missing the point. Using these tools in novel or creative ways is where I would put my money in the coming decade. It is mind boggling that today's models can even appear to make sense of my completely made up language and compiler. But the job postings for adding those "Do My Hack For Me" buttons are the ones to watch for.


I feel as though the majority of programmers do the same thing; they apply well known solutions to business programs. I agree that LLM are not yet making programs like ffmpeg, mpv, or BLAS but only a small amount of programmers are working on projects like that anyway.


> It's like being a writer after Gutenberg invented the printing press rather than the monk copying books by hand before it.

That's not how book printing works, and I'd argue the monk can far more easily create new text and devise new interpretations. And they did, in the margins of books. It takes a long time to prepare one print, but hardly any longer to print 100, which is where the good of the printing press comes from. It's not the ease of changing or producing large sums of text, it's the ease of reproducing it, and since copy/paste exists, it is a very poor analogue in my opinion.

I'd also argue the 10x is subject to observer bias, since subject and observer are the same person. My experience at this point is that boilerplate is fine with LLMs, and if that's all you do, good for you; otherwise it will hardly speed up anything, as the code is the easy part.


This. By now I don’t understand how anyone can still argue in the abstract while it’s trivial to simply give it a try and collect cold, hard facts.

It’s like arguing that the piano in the room is out of tune and not bothering to walk over to the piano and hit its keys.


The downside is that a lot of those who argue try out some stuff in ChatGPT or another chat interface without digging any further, expecting "general AI" and asking general questions, where LLMs are most prone to hallucinations. The other part is cheaped-out setups using the same subscription for multiple people, whose histories get polluted.

They don't have time to check more stuff as they are busy with their life.

People who did check the stuff don't have time in life to prove it to the ones who argue, "in exactly whatever way the person arguing would find useful".

Personally, about a year ago I was the person who tried out some ChatGPT and didn't have time to dabble, because all the hype was off-putting and of course I was finding more important and interesting things to do in my life besides chatting with some silly bot that I could trick easily with trick questions, or dismiss as not useful because it hallucinated something I wanted in a script.

I did take the plunge for a really deep dive into AI around April last year, and I saw with my own eyes ... and only that convinced me. Using the API, I built my own agent loop: getting details from images and PDF files, iterating on code, turning unstructured "human" input into structured output I can handle in my programs.

*Data classification is easy for an LLM. Data transformation is a bit harder, but still great. Creating new data is hard, so for things like answering questions where it has to generate stuff from thin air, it will hallucinate like a madman.*

With data classification like "is it a cat, answer with yes or no", it is hard to get even the latest models to start hallucinating.
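The classification case is robust partly because a closed label set makes bad output easy to detect and reject. A minimal sketch, with `ask_llm` as a hypothetical stand-in for a real model call:

```python
# Sketch: constrain the model to a closed label set, validate the reply,
# and refuse rather than accept anything outside it.
# `ask_llm` is a hypothetical stub standing in for a real LLM call.

VALID = {"yes", "no"}

def ask_llm(prompt: str) -> str:
    # Stand-in for a real model; real models may add punctuation or casing.
    return "Yes."

def classify(question: str, retries: int = 3):
    prompt = f"{question} Answer with exactly one word: yes or no."
    for _ in range(retries):
        raw = ask_llm(prompt).strip().lower().rstrip(".")
        if raw in VALID:
            return raw   # reply is within the allowed label set
    return None          # refuse instead of accepting a hallucination

print(classify("Is the animal in this description a cat?"))
```

Generation has no such cheap validity check, which is one way to see why "create data from thin air" prompts hallucinate where yes/no classification rarely does.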


So I tried it, and it is worse than having a random dude from Fiverr write your code — it is actively malicious and goes out of its way to deceive and to subtly sabotage existing working code.

Do I now get the right to talk badly about all LLM coding, or is there another exercise I need to take?


Hey, serious question that I ask in good faith: would you be open to a screensharing session, where we compare approaches and experiences?


It's like arguing that the piano goes out of tune randomly and that even if you get through 1, 2, or even 10 songs without that happening, I'm not interested in playing that piano on stage.


I am hitting the keys, and I call bullshit.

Yes, the technology is interesting and useful. No, it is not a “10x” miracle.


I call "AGI" or a "100x miracle" bullshit, but the stuff that already exists definitely is a "10x miracle".


This is a known sales trick, called door-in-the face. First you introduce your victim to an outrageous claim, and then follow with a more modest and more reasonable sounding claim.

In truth neither claim is reasonable, but because of the door in the face, the victim is more susceptible to the latter claim. Without the more outrageous claim, it is unlikely the victim would have believed the latter one.

In reality, both "AGI" and "100x miracle" AND the "10x miracle" are all outrageous claims, and I call bullshit on all of them.


I am more concerned by the bait and switch that is coming: people will get used to the convenience for $100 a year or $100 a month, and after 10 years the companies will 5x the price. What are people going to do then?


> The productivity gains I'm seeing right now are unprecedented.

How long have you been in the industry?

This does not seem a revolution compared with database standardization, abandonment of assembly for most coding, introduction of game engines, etc.

I see a lot of hype for LLMs from people that do not have the experience to compare them to anything else.


I've been doing development for over 25 years, and I completely agree with what they are saying. It's similar to going from punch cards to terminals. We were using assembly, COBOL, and Fortran in the 1990s and into the early 2000s. Zork even had a game engine. These were not the revolutions you think, and there was no hard cutoff when the change happened.


Are you actually reading the code? I have noticed most of the gains go away when you are reading the code outputted by the machine. And sometimes I do have to fix it by hand and then the agent is like "Oh you changed that file, let me fix it"


> My belief in this tech isn't based on marketing hype or someone telling me it's good – it's based on cold reality of what I'm shipping daily

Then why are half of the big tech companies using Microsoft Teams and sending mails with .docx files embedded in them?

Of course marketing matters.

And of course the hard facts also matter, and I don't think anybody is saying that AI agents are purely marketing hype. But regardless, it is still interesting to take a step back and observe what marketing pressures we are subject to.


> I'm seeing legitimate 10x gains...

Self-reports on this have been remarkably unreliable.


0.05x to 0.5x


You are speculating. You don’t know. You are not testing this technology— you are trusting it.

How do I know? Because I am testing it, and I see a lot of problems that you are not mentioning.

I don’t know if you’ve been conned or you are doing the conning. It’s at least one of those.


The best way to describe AI agents (coding agents here) I heard on some presentation, I think it was from Andrej Karpathy.

It was something like this:

"We think we are building Ultron but really we are building the Iron Man suit. It will be a technology to amplify humans, not replace them"



The monk analogy is perfect


And not only do you use the em dash once -- you use it twice


> I'm maintaining a well-structured enterprise codebase (100k+ lines Django)

How do you avoid this turning into spaghetti? Do you understand/read all the output?


haha enterprise python/django! that was good


"My belief in this tech isn't based on marketing hype or someone telling me it's good - it's based on cold reality of what I'm shipping daily."

This may be true. The commenter may "believe in this tech" based on his experimentation with it

But the majority of sentences following this statement ironically appear to be "marketing hype" or "someone telling [us] it's good":

1. "The productivity gains I'm seeing right now are unprecedented."

2. "Even a year ago this wouldn't have been possible, it really feels like an inflection point."

3. "I'm seeing legitimate 10x gains because I'm not writing code anymore - I'm thinking about code and reading code."

4. "Using Claude Code Opus 4.5 right now and it's insane."

5. "It's like being a writer after Gutenberg invented the printing press rather than the monk copying books by hand before it."

The "framing" in this blog post is not focused on whether "this tech" actually saves anyone any time or money

It is focused on _hype_, namely how "this tech" is promoted. That promotion could be intentional or unintentional

N.B. I am not "agreeing" with the blog post author or "disagreeing" with the HN commenter, or vice versa. The point I'm making is that one is focused on whether "this tech" works for them and the other is focused on how "this tech" is being promoted. Those are two different things, as other replies have also noted. Additionally, the comment appears to be an example of the promotion (hype) that its author claims is not the basis for his "belief in this tech"

I think the use of the term "belief" is interesting

That term normally implies a lack of personal knowledge:

151 "Belief" gcide "The Collaborative International Dictionary of English v.0.48"

Belief \Be*lief"\, n. [OE. bileafe, bileve; cf. AS. gele['a]fa. See {Believe}.]

1. Assent to a proposition or affirmation, or the acceptance of a fact, opinion, or assertion as real or true, without immediate personal knowledge; reliance upon word or testimony; partial or full assurance without positive knowledge or absolute certainty; persuasion; conviction; confidence; as, belief of a witness; the belief of our senses. [1913 Webster]

Belief admits of all degrees, from the slightest suspicion to the fullest assurance. --Reid. [1913 Webster]

2. (Theol.) A persuasion of the truths of religion; faith. [1913 Webster]

No man can attain [to] belief by the bare contemplation of heaven and earth. --Hooker. [1913 Webster]

4. A tenet, or the body of tenets, held by the advocates of any class of views; doctrine; creed. [1913 Webster]

In the heat of persecution to which Christian belief was subject upon its first promulgation. --Hooker. [1913 Webster]

{Ultimate belief}, a first principle incapable of proof; an intuitive truth; an intuition. --Sir W. Hamilton. [1913 Webster]

Syn: Credence; trust; reliance; assurance; opinion. [1913 Webster]

151 "belief" wn "WordNet (r) 3.0 (2006)"

belief

n 1: any cognitive content held as true [ant: {disbelief}, {unbelief}]

2: a vague idea in which some confidence is placed; "his impression of her was favorable"; "what are your feelings about the crisis?"; "it strengthened my belief in his sincerity"; "I had a feeling that she was lying" [syn: {impression}, {feeling}, {belief}, {notion}, {opinion}]

151 "BELIEF" bouvier "Bouvier's Law Dictionary, Revised 6th Ed (1856)"

BELIEF. The conviction of the mind, arising from evidence received, or from information derived, not from actual perception by our senses, but from. the relation or information of others who have had the means of acquiring actual knowledge of the facts and in whose qualifications for acquiring that knowledge, and retaining it, and afterwards in communicating it, we can place confidence. " Without recurring to the books of metaphysicians' "says Chief Justice Tilghman, 4 Serg. & Rawle, 137, "let any man of plain common sense, examine the operations of, his own mind, he will assuredly find that on different subjects his belief is different. I have a firm belief that, the moon revolves round the earth. I may believe, too, that there are mountains and valleys in the moon; but this belief is not so strong, because the evidence is weaker." Vide 1 Stark. Ev. 41; 2 Pow. Mortg. 555; 1 Ves. 95; 12 Ves. 80; 1 P. A. Browne's R 258; 1 Stark. Ev. 127; Dyer, 53; 2 Hawk. c. 46, s. 167; 3 Wil. 1, s. 427; 2 Bl. R. 881; Leach, 270; 8 Watts, R. 406; 1 Greenl. Ev. Sec. 7-13, a.


Have you seen the 2025 METR report on AI coding productivity?

TLDR: everyone thought AI made people faster, including those who did the task, both before and after doing it. However, AI made people slower at doing the task.


300%


Is there any easy way to implement this pattern in AWS RDS deployments where we need to deploy multiple times a day and need it done in a few minutes?


In my experience, this process typically spans multiple deploys. The key insight I have taken away from decades of applying this approach is that data migrations need to be done in an __eventually consistent__ fashion, rather than as an all-or-nothing, stop-the-world, global transaction or transformation.

Indeed, this pattern is extremely useful in environments where you are trying to make changes to one part of a system while multiple deploys are happening across the entire system, or where a change requires updating a large number of clients that you don't directly control or that operate in a loosely-connected fashion.

So, regardless of AWS RDS as your underlying database technology, plan to break these steps up into individual deployment steps. I have, in fact, done this with systems deployed over AWS RDS, but also with systems deployed to on-prem SQL Server and Oracle, to nosql systems (this is especially helpful in those environments), to IoT and mobile systems, to data warehouse and analysis pipelines, and on and on.
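To make the eventually-consistent shape concrete, here is a minimal sketch of the expand/backfill/contract sequence, using an in-memory SQLite table as a stand-in for RDS (the table and column names are made up for illustration). Each comment marks a step that would be its own deploy; the backfill runs in small batches so no single transaction holds locks for long.

```python
import sqlite3

# Stand-in for the production DB; the pattern is the same on RDS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT)")
conn.executemany("INSERT INTO users (full_name) VALUES (?)",
                 [(f"user {i}",) for i in range(1000)])

# Deploy 1 (expand): add the new column, nullable, so old code keeps working.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")

# Deploy 2: new code dual-writes both columns. Meanwhile, backfill old rows
# in small batches -- each batch is its own short transaction.
BATCH = 100
while True:
    rows = conn.execute(
        "SELECT id, full_name FROM users "
        "WHERE display_name IS NULL LIMIT ?", (BATCH,)
    ).fetchall()
    if not rows:
        break
    with conn:  # one short transaction per batch
        conn.executemany(
            "UPDATE users SET display_name = ? WHERE id = ?",
            [(name, row_id) for row_id, name in rows],
        )

# Deploy 3 (contract): once nothing is left to backfill, reads switch to
# display_name, and a later migration drops full_name.
remaining = conn.execute(
    "SELECT COUNT(*) FROM users WHERE display_name IS NULL").fetchone()[0]
print(remaining)  # 0
```

Because each step is independently deployable and safe to pause between, the per-deploy time stays in minutes even if the full migration takes days to converge.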


Their docs show throughput limits (e.g., 4 CPU = 60 errors/sec), but what happens during error spikes?

If my app crashes and blasts hundreds of errors in seconds, does Telebugs have built-in rate limiting or backpressure? Or do I need to overprovision hardware/implement throttling myself?

With SaaS tools, spike protection is their problem. With self-hosted, I’m worried about overwhelming my own infrastructure without adding complexity.

Anyone running this in production?


Hey, Telebugs creator here. Great questions! Right now, Telebugs doesn’t have built-in throttling, so during error spikes, you’d either need to handle it manually or overprovision. I do plan to add throttling in the future, similar to what Sentry does, to protect your infrastructure automatically.

Curious: for those running self-hosted error trackers in production, how do you currently handle sudden error spikes? Any clever tricks or patterns you swear by?


The company I work for runs self hosted sentry. Sentry has something that tells you that events are being dropped due to pressure. I think every engineer in the company knows that this is happening but no one fixes it because no one has the time to look into it.


Thanks for your answer! Would you mind sharing your error volume? I’m also curious, how often do dropped events happen, and how does it impact your workflow? Any workarounds you’ve tried, or features you wish were available? This will help me make sure the feature is implemented in a way that’s actually useful.


That seems like insanely low throughput. What takes it so long?


It uses SQLite as its database.


SQLite is really fast. There's no way that's the bottleneck.


What’s your experience with SQLite? It’s a bit hard to talk about performance without sharing code.


I've used it a fair bit. My biggest use was for a computer processing system that recorded gigabytes of data. If it was limited to 60 inserts per second it would have taken months to run!

I do recall having to change some settings to make it really fast, but it wasn't 60/second slow.

See the "update" in this answer.

https://www.sqlite.org/faq.html#q19
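The FAQ entry linked above points at the usual culprit: SQLite can do tens of thousands of inserts per second, but only a few dozen *transactions* per second on a real disk, so committing after every insert is what produces "60/second slow." A rough sketch of the difference (in-memory DB here, so the gap is even larger with real fsyncs):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
rows = [(f"error {i}",) for i in range(10_000)]

# One transaction per insert: every commit is a separate journal write.
t0 = time.perf_counter()
for r in rows:
    conn.execute("INSERT INTO events (payload) VALUES (?)", r)
    conn.commit()
per_insert = time.perf_counter() - t0

# One transaction for the whole batch.
t0 = time.perf_counter()
with conn:
    conn.executemany("INSERT INTO events (payload) VALUES (?)", rows)
batched = time.perf_counter() - t0

print(f"batched was about {per_insert / batched:.0f}x faster")
```

If a framework's ORM is implicitly opening a transaction per event, batching ingestion (or relaxing `synchronous`/enabling WAL) is usually the first fix to try.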


Appreciate the answer! You’ve probably worked with raw SQLite drivers. I’m using a framework, which likely runs more transactions by default. I’m fairly confident that with a bit of digging, I can improve the ingestion speed. Good to know and thanks for sharing your experience!


Isn't it a huge deal that this 30B model can match and even surpass huge closed models?


try /context in Claude Code


A very crude tool. A good start maybe, but it does not give us any information about the message part of the context, the one that matters.

We can't really do much with the information that x amount is reserved for MCP, tool calling or the system prompt.


> We can't really do much with the information that x amount is reserved for MCP, tool calling or the system prompt.

I actually think this is pretty useful information. It helps you evaluate whether an MCP server is worth the context cost, and gives you a feel for how much context certain tool calls use up. I believe there's also a way to change the system prompt, so the breakdown helps you evaluate whether what you've got there is worth it too.


Sure, it's useful, once.

What we need is a way to manage the dynamic part of the context without just starting from zero each time.


My theory is that you will never get this from a frontier model provider because, as alluded to in the sibling thread, context window management is actually a good hunk of the secret sauce that makes these things effective, and companies do not want to give that up


I’m experiencing something similar. We have a codebase of about 150k lines of backend code. On one hand, I feel significantly more productive - perhaps 400% more efficient when it comes to actually writing code. I can iterate on the same feature multiple times, refining it until it’s perfect.

However, the challenge has shifted to code review. I now spend the vast majority of my time reading code rather than writing it. You really need to build strong code-reading muscles. My process has become: read, scrap it, rewrite it, read again… and repeat until it’s done. This approach produces good results for me.

The issue is that not everyone has the same discipline to produce well-crafted code when using AI assistance. Many developers are satisfied once the code simply works. Since I review everything manually, I often discover issues the AI never flagged. During reviews, I try to visualize the entire codebase and internalize everything to maintain a comprehensive understanding of the system’s scope.


I'm very surprised you find this workflow more efficient than just writing the code. I find constructing the mental model of the solution and how it fits into existing system and codebase to be 90% of effort, then actually writing the code is 10%. Admittedly, I don't have to write any boilerplate due to the problem domain and tech choices. Coding agents definitely help with the last 10% and also all the adjacent work - one-off scripts where I don't care about code quality.


I doubt it actually is. All the extra effort it takes to make the AI do something useful on non trivial tasks is going to end up being a wash in terms of productivity, if not a net negative. But it feels more productive because of how fast the AI can iterate.

And you get to pay some big corporation for the privilege.


> Many developers are satisfied once the code simply works.

In the general case, the only way to convince oneself that the code truly works is to reason through it, as testing only tests particular data points for particular properties. Hence, “simply works” is more like “appears to work for the cases I tried out”.

