Hacker News

Look, man, and I'm saying this not to you but to everyone who is in this boat: you've got to understand that after a while, the novelty wears off. We get it. It's miraculous that some gigabytes of matrices can possibly interpret and generate text, images, and sound. It's fascinating, it really is. Sometimes, it's borderline terrifying.

But, if you spend too much time fawning over how impressive these things are, you might forget that something being impressive doesn't translate into something being useful.

Well, are they useful? ... Yeah, of course LLMs are useful, but we need to remain somewhat grounded in reality. How useful are LLMs? Well, they can dump out a boilerplate React frontend to a CRUD API, so I can imagine it could very well be harmful to a lot of software jobs, but I hope it doesn't bruise too many egos to point out that dumping out yet another UI that does the same thing we've done 1,000,000 times before isn't exactly novel. So it's useful for some software engineering tasks. Can it debug a complex crash? So far I'm around zero for ten, and believe me, I'm trying. From Claude 3.7 to Gemini 2.5, Cursor to Claude Code, it's really hard to get these things to work through a problem the way anyone above the junior dev level can. Almost universally, they just keep digging themselves deeper until they eventually give up and try to null out the code so that the buggy code path doesn't execute.

So when Sabine says they're useless for interpreting scientific publications, I have zero trouble believing that. Scoring high on some shitty benchmarks whose solutions are in the training set is not akin to generalized knowledge. And these huge context windows sound impressive, but dump a moderately large document into them and it's often a challenge to get them to actually pay attention to the details that matter. Your best shot, by far, is if the document you need it to reference was already in the training data.

What LLMs can do is very cool, and even useful to some degree, but just scoring a few more points on some benchmarks is simply not going to fix the problems current AI architectures have. There is only one Internet, and we literally lit it on fire to try to make these models score a few more points. The sooner the market catches up to the fact that they ran out of Internet to scrape and we're still nowhere near the singularity, the better.



100% this. I think we should start producing independent evaluations of these tools for their usefulness, not for whatever made-up or convoluted evaluation index OpenAI, Google, or Anthropic throw at us.


> the novelty wears off.

Hardly. I've been using LLMs at least weekly (most of the time daily) since GPT-3.5. I am still amazed. It's really, really hard for me not to be bullish.

It kinda reminds me of the days when I learned the Unix-like command line. At least once a week, I shouted to myself: "What? There is a one-liner that does that? People use awk/sed/xargs this way??" That's how I feel about LLMs so far.


I tried LLMs for generating shell snippets. Mixed bag for me. They seem to have a hard time producing portable awk/sed commands. They also really overcomplicate things; you really don't need to break out awk for most simple file renaming tasks. With lesser-used utilities, all bets are off.
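To make the overcomplication point concrete, here's the kind of task I mean, sketched with hypothetical extensions (.txt to .md, purely for illustration). No awk or sed needed, just POSIX parameter expansion:

```shell
# Rename every *.txt to *.md in the current directory using plain
# shell parameter expansion -- portable, and no awk/sed pipeline.
for f in *.txt; do
    [ -e "$f" ] || continue        # glob matched nothing; skip
    mv -- "$f" "${f%.txt}.md"      # strip the .txt suffix, add .md
done
```

The `${f%.txt}` expansion is specified by POSIX, so this works in any Bourne-compatible shell, which is exactly the portability the awk-heavy suggestions tend to lose.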

Yesterday Gemini 2.5 Pro suggested running "ps aux | grep filename.exe" to find a Wine process (pgrep is the much better way to go for that, but it's still wrong here) and get the PID, then pass that into "winedbg --attach" which is wrong in two different ways, because there is no --attach argument and the PID you pass into winedbg needs to be the Win32 one not the UNIX one. Not an impressive showing. (I already knew how to do all of this, but I was curious if it had any insights I didn't.)
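For what it's worth, the pgrep half of this is easy to demonstrate; `sleep 300` below just stands in for any long-running process, and this still only gives you the UNIX PID, not the Win32 one that winedbg actually wants:

```shell
# pgrep -f matches against the full command line and never reports
# itself, so it avoids the classic "grep shows its own grep" noise
# you get from ps aux | grep. Anchoring the pattern keeps wrapper
# shells whose command line mentions "sleep 300" from matching too.
sleep 300 &
pid=$!
pgrep -f '^sleep 300$'    # prints the UNIX PID of the sleep process
kill "$pid"               # clean up the background process
```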

For people with less experience I can see how getting e.g. tailored FFmpeg commands generated is immensely useful. On the other hand, I spent a decent amount of effort learning how to use a lot of these tools and for most of the ways I use them it would be horrific overkill to ask an LLM for something that I don't even need to look anything up to write myself.

Will people in the future simply not learn to write CLI commands? Very possible. However, I've come to a different, related conclusion: I think the areas where LLMs really succeed are areas where we're doing a lot of needless work and requiring too much arcane knowledge. That goes for CLI usage and web development for sure. What we actually want to do should be significantly less complex to do. The LLM sort of solves this problem, to the extent that it works, but it's a horrible kludge of a solution.

Literally converting video files and performing basic operations on them should not require Googling reference material and Q&A websites for fifteen minutes. We've built a vastly overcomplicated computing environment, and there is a real chance that the primary user of many of these interfaces will eventually not even be human. If the interface to the computer becomes the LLM, it's mostly going to be wasted if we keep using the same crappy underlying interfaces that got us into the "how do I extract tar file" problem in the first place.
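(To be fair to tar, the canonical "how do I extract tar file" case is a one-liner on any modern GNU or BSD tar, which autodetects compression on extraction; the file names here are just for illustration:)

```shell
# Create an archive, delete the original, and extract it back out.
# Modern tar autodetects compression when extracting, so -xf covers
# .tar, .tar.gz, .tar.xz, and friends alike.
mkdir -p demo && echo hello > demo/a.txt
tar -cf archive.tar demo    # c = create, f = archive file name
rm -rf demo
tar -xf archive.tar         # x = extract
cat demo/a.txt              # prints: hello
```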


> dumping out yet another UI that does the same thing we've done 1,000,000 times before isn't exactly novel

And yet that's exactly what people get paid to do every day. And if it saves them time, they won't exactly get bored of that feature.


They really don’t. People say this all the time, but you give any project a little time and it evolves into a special unique snowflake every single time.

That’s why every low code solution and boilerplate generator for the last 30 years failed to deliver on the promises they made.


I agree some will evolve into more, but lots of them won't. That's why Shopify, WordPress, and others exist - most commercial websites are just online business cards or small shops. Designers and devs are hired to work on them all the time.


If you’re hiring a dev to work on your Shopify site, it’s most likely because you want to do something non-standard. By the time the dev gets done with it, it will be a special unique snowflake.

If your site has users, it will evolve. I’ve seen users take what was a simple trucking job posting form and repurpose an unused “trailer type” field to track the status of the job req.

Every single app that starts out as a low code/no code solution given enough time and users will evolve beyond that low code solution. They may keep using it, but they’ll move beyond being able to maintain it exclusively through a low code interface.


And most software engineering principles are about dealing with this evolution.

- Architecture (making it easy to adjust parts of the codebase and to understand it)

- Testing (making sure the current version works and future versions won't break it)

- Requirements (describing the current version and the planned changes)

- ...

If a project were just a clone, I'm sure people would just buy the existing version and be done with it. And sometimes they do; then a unique requirement comes along and the whole process comes back into play.


If your website is so basic that you can just take a template and put your specific details into it, what exactly do you need an LLM for?


If your job can be hollowed out into >90% entering prompts into AI text editors, you won't have to worry about continuing to be paid to do it every day for very long.


> Well, are they useful? ... Yeah, of course LLMs are useful, but we need to remain somewhat grounded in reality. How useful are LLMs?

They are useful enough that they can passably replace (much more expensive) humans in a lot of noncritical jobs, thus being a tangible tool for securing enterprise bottom lines.


Which jobs? I haven't seen LLMs successfully replace more expensive humans in noncritical roles.


From what I've seen in my own job and observing what my wife does (she's been working with the things on very LLM-centric processes and products in a variety of roles for about three years) not a lot of people are able to use them to even get a small productivity boost. Anyone less than very-capable trying to use them just makes a ton more work for someone more expensive than they are.

They're still useful, but they're not going to make cheap employees wildly more productive, and outside maybe a rare, perfect niche, they're not going to increase expensive employees' productivity so much that you can lay off a bunch of the cheap ones. Like, they're not even close to that, and haven't really been getting much closer despite improvements.


>they can dump out a boilerplate react frontend to a CRUD API

This is so clearly biased that it borders on parody. You can only get out what you put in. The real use case of current LLMs is that any project that would previously require collaboration can now be done solo with a much faster turnaround. Of course in 20 years when compute finally catches up they will just be super intelligent AGI


Complete hyperbole.


I have Cursor running on my machine right now. I am even paying for it. This is in part because no matter what happens, people keep professing, basically every single time a new model is released, that it has finally happened: programmers are finally obsolete.

Despite the ridiculous hype, though, I have found that these things have crossed into usefulness. I imagine for people with less experience, these tools are a godsend, enabling them to do things they definitely couldn't do on their own before. Cool.

Beyond that? I definitely struggle to find things I can do with these tools that I couldn't do better without. The main advantage so far is that these tools can do these things very fast and relatively cheaply. Personally, I would love to have a tool that I can describe what I want in detailed but plain English and have it be done. It would probably ruin my career, but it would be amazing for building software. It'd be like having an army of developers on your desktop computer.

But, alas, a lot of the cool shit I'd love to do with LLMs doesn't seem to pan out. They're really good at TypeScript and web stuff, but their proficiency definitely tapers off as you veer out. It seems to work best when you can find tasks that basically amount to translation, like converting between programming languages in a fuzzy way (e.g. trying to translate idioms). What's troubling me the most is that they can generate shitloads of code but basically can't really debug the code they write beyond the most entry-level problem-solving. Reverse engineering also seems like an amazing use case, but the implementations I've seen so far definitely are not scratching the itch.

> Of course in 20 years when compute finally catches up they will just be super intelligent AGI

I am betting against this. Not the "20 years" part, it could be months for all we know; but the "compute finally catches up" part. Our brains don't burn kilowatts of power to do what they do, yet given basically unbounded time and compute, current AI architectures are simply unable to do things that humans can, and there aren't many benchmarks that are demonstrating how absolutely cataclysmically wide the gap is.

I'm certain there's nothing magical about the meat brain, as much as that is existentially challenging. I'm not sure that this follows through to the idea that you could replicate it on a cluster of graphics cards, but I'm also not personally betting against that idea, either. On the other hand, getting the absurd results we have gotten out of AI models today didn't involve modest increases. It involved explosive investment in every dimension. You can only explode those dimensions out so far before you start to run up against the limitations of... well, physics.

Maybe understanding what LLMs are fundamentally doing to replicate what looks to us like intelligence will help us understand the true nature of the brain or of human intelligence, hell if I know, but what I feel most strongly about is this: I do not believe LLMs are replicating some portion of human intelligence. They are very obviously neither a subset nor a superset of it, nor particularly close to either. They are some weird entity that overlaps with it in ways we don't fully comprehend yet.



