You personally wouldn't use live captions and dubbing, so there's no point building it for the millions of people who need it as an accessibility feature?

They can use addons, but it shouldn't be built into the browser. Not all that complicated.

Because of what?

Why must it be an addon? Because AI has negative connotations?


Bloat? Security? Privacy? Larger codebase to maintain? Lack of focus by a browser company? Speed?

Rufus is a Claude Haiku, yes.


I wonder if that sentence will have any discernible meaning 100 years from now.


I've seen a few of this type of thing pop up in search results ("DeepWiki" by Cognition). I'm not a fan. It is just LLM content slop, basically. Actual wikis written by humans are made of actual insight from developers and consumers: "We intend you to use it in X way", "If you encounter Y issue, do Z", etc. Look at the Arch Wiki. Peak wiki-style documentation, which LLMs could never recreate. Well, maybe with a future iteration of the technology they can be useful. But for now, you do not gain much by essentially restating code, API interfaces, and tests in prose. They take space away from legitimate documentation and developer instruction in search results.


True. The Arch Wiki is one of the best documentation systems I have ever seen, which is also why I always choose Arch-derived OSes.


> LLMs could never recreate. Well, maybe with a future iteration of the technology they can be useful

Releasing a product like DeepWiki is the first step towards creating a data flywheel that yields useful information.


I think this wound up being close enough to true; it's just that it actually says less than what people assumed at the time.

It's basically the Jevons paradox for code. The price of a line of code (in human engineer-hours) has decreased a lot, so a bunch of code is now economically justifiable that wouldn't have been written before. For example, I can prompt an ad-hoc benchmarking script into existence in 1-2 minutes while troubleshooting an issue, where each script might have taken me 10-20 minutes to write myself, which lets me investigate many more performance angles. Not everything gets committed to source control.
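
The kind of throwaway script I mean looks roughly like this (purely illustrative; parse_row is a made-up stand-in for whatever hot path is under suspicion):

    import statistics
    import time

    # Made-up micro-benchmark: time a suspect function over repeated runs
    # and report summary stats instead of a single noisy measurement.
    def parse_row(line):
        return [int(x) for x in line.split(",")]

    line = ",".join(str(i) for i in range(50))
    samples = []
    for _ in range(20):
        start = time.perf_counter()
        for _ in range(10_000):
            parse_row(line)
        samples.append(time.perf_counter() - start)

    print(f"median {statistics.median(samples) * 1000:.1f} ms, "
          f"min {min(samples) * 1000:.1f} ms over {len(samples)} runs")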

Put another way, at least in my workflow and at my workplace, the volume of code has increased, and most of that increase is new code that would not have been written if not for AI; a smaller portion is code I would have written before AI but now let the AI write so I can focus on harder tasks. Of course, penetration is uneven: AI helps more with tasks that are well represented in the training set (webapps, data science, Linux admin...) than with, e.g., issues arising from quirky internal architecture, Rust, etc.


That's ridiculous. No, it isn't even close.


At an individual level, I think it is for some people. Opus/Sonnet 4.5 can tackle pretty much any ticket I throw at it on a system I've worked on for nearly a decade. Struggles quite a bit with design, but I'm shit at that anyway.

It's much faster for me to just start with an agent, and I often don't have to write a line of code. YMMV.

Sonnet 3.7 wasn't quite at this level, but we are now. You still have to know what you're doing, mind you, and there's a lot of ceremony in tweaking workflows, much as there has always been for editors. It's not much different from instructing juniors.


Roughly, this is the Electronic Frontier Foundation (and comparable lobbying orgs in other countries). However, an org like this doesn't have much power to compel individuals to give them $1.


LLM argumentative essays tend to have this "gish-gallop" energy: they say a bunch of tenuously related and vaguely supported things, leaving the reader wondering whether it was the author who failed to connect the dots, or them.


Yes, so do human ones (just not the ones that filter through to you). The output is like this because the training data is like this.


Maybe it's my engineer-brain talking, but "lab-grown" actually biases me towards the diamonds. Feels precise and futuristic.


My wife wanted a sapphire, and we met during PhD research. It's straight up not possible to pay more than, like, a dollar for a synthetic sapphire, so that's what's in her ring.


If you wanted, you could even DIY your own sapphires and rubies. It isn't a complicated process, but I'm sure it's finicky to get everything perfect.


I like my scintillator crystals... they're purpose-built to be very fluorescent.


Where do you buy them at that price?


I don't know if it matters. Even if the best we can do is get really good at interpolating between solutions to cognitive tasks on the data manifold, the only economically useful human labor left asymptotes toward frontier work: work that only a single-digit percentage of people can actually perform.


My guess is that they did RLVR post-training for SWE tasks, and a smaller model can undergo more RL steps for the same amount of computation.
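
Back-of-envelope version of that trade-off (all numbers hypothetical):

    # If a model is ~4x smaller, a fixed RL compute budget buys ~4x the steps.
    budget = 1e23                        # total post-training FLOPs (made up)
    step_cost_big = 4e18                 # per-RL-step cost, big model (made up)
    step_cost_small = step_cost_big / 4  # crude scaling assumption
    print(budget / step_cost_big)        # 25000.0 steps
    print(budget / step_cost_small)      # 100000.0 steps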


Because if the agent and governor are trained together, the shared reward function will corrupt the governor.


The shared reward function from pre-training is like primary school for an LLM. Maybe RLHF is like secondary school. The governor can be differentiated from the workers with different system and user prompts, fine tuning, etc., which might be similar to medical school or law school for a human.

Certainly human judges, attorneys for defense and prosecution, and members of the jury can still perform their jobs well even if they attended the same primary and secondary schools.


I see what you are getting at. My point is that if you train an agent and a verifier/governor together based on rewards from e.g. RLVR, the system (agent + governor) is what will reward hack. OpenAI demonstrated this in their "Learning to Reason with CoT" blog post, where they showed that using a model to detect and punish strings associated with reward hacking in the CoT just led the model to reward hack in ways that were harder to detect. Stacking higher- and higher-order verifiers maybe buys you time, but it also increases false negative rates, and reward hacking is a stable attractor for the system.
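
You can see the basic failure mode in a toy setup. A minimal sketch (the two-arm framing and all numbers are invented for illustration): a gradient-bandit "agent" learns to prefer hardcoding test outputs, because a governor that only checks whether tests pass pays more for the hack than for honest work.

    import math
    import random

    random.seed(0)

    # Two strategies: solve the task honestly, or hardcode expected outputs.
    # The governor only checks test results, so hacking usually pays 1.0.
    def verifier_reward(action):
        if action == "honest":
            return random.random()                    # passes a varying fraction
        return 0.0 if random.random() < 0.1 else 1.0  # caught 10% of the time

    true_quality = {"honest": 0.5, "hack": 0.0}  # what we actually care about

    prefs = {"honest": 0.0, "hack": 0.0}         # softmax policy preferences
    lr = 0.1

    def policy():
        z = sum(math.exp(v) for v in prefs.values())
        return {a: math.exp(v) / z for a, v in prefs.items()}

    for _ in range(2000):                        # REINFORCE-style updates
        probs = policy()
        action = random.choices(list(probs), weights=list(probs.values()))[0]
        r = verifier_reward(action)
        for a in prefs:
            grad = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += lr * r * grad

    probs = policy()
    print(f"P(hack) = {probs['hack']:.2f}")      # -> ~1.00
    print(f"true quality = {probs['honest'] * true_quality['honest']:.2f}")

Swapping the fixed 10% catch rate for a learned detector is exactly where the obfuscation result shows up: the same optimization pressure then applies to the detector, too.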

