Hacker News | tobyhinloopen's comments

Way too expensive, I'll wait for a free/open source browser optimized to be used by agents.

Our approach is actually very cost-effective compared to alternatives. Our browser uses a token-efficient LLM-friendly representation of the webpage that keeps context size low, while also allowing small and efficient models to handle the low-level navigation. This means agents like Claude can work at a higher abstraction level rather than burning tokens on every click and scroll, which would be far more expensive

If a potential user says it is too expensive, better to ask why than to tell them they are wrong. You likely have assumptions you have not validated

Definitely! Making Smooth as cost-effective as possible has been a core goal for us, so we'd really love to hear your thoughts on this

We'll continue to make Smooth more affordable and accessible as this is a core principle of our work (https://www.smooth.sh/images/comparison.gif)


are your evals / comparisons publicly/3rd party reproducible?

If it's "trust me, I did a fair comparison", that's not going to fly today. There's too much lying in society, trusting people trying to sell you something to be telling the truth is not the default anymore, skepticism is


That's a great point, we'll publish everything on our docs as soon as possible

Same! If I put the skill's instructions in the general AGENTS.md, it works just fine.

ln -s to the rescue!

That doesn't work very well if your developers are on Windows (and most are). Uneven Git support for symbolic links across platforms is going to end up causing more problems than it solves.

Win developers aren't using WSL?

It's why I wrapped my tiny skills repo [1] with a script that softlinks the skills into whichever folder is your skills folder, defaulting to Claude's, but it could be any other.

I treat my skills the same as I would write tiny bash scripts and fish functions in the days gone to simplify my life by writing 2 words instead of 2 sentences. Tiny improvement that only makes sense for a programmer at heart.

[1] https://github.com/flurdy/agent-skills
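For illustration, a minimal sketch of such a linking script. This is not the code from the linked repo; the function name `link_skills` and the repo layout (one subdirectory per skill) are assumptions made for the example, and the target folder is passed in by the caller rather than hard-coded.

```python
from __future__ import annotations

from pathlib import Path


def link_skills(repo_dir: Path, skills_dir: Path) -> list[Path]:
    """Symlink every skill directory in repo_dir into skills_dir.

    Hypothetical helper: assumes each skill is a top-level
    subdirectory of repo_dir. Entries that already exist in
    skills_dir are left alone, so re-running is harmless.
    """
    skills_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for skill in sorted(repo_dir.iterdir()):
        # Skip plain files and hidden directories such as .git
        if not skill.is_dir() or skill.name.startswith("."):
            continue
        link = skills_dir / skill.name
        if link.exists() or link.is_symlink():
            continue  # never overwrite something already there
        # Directories cannot be hard-linked, so use a symbolic link
        link.symlink_to(skill.resolve(), target_is_directory=True)
        created.append(link)
    return created
```

Calling `link_skills(Path("agent-skills"), Path.home() / ".claude" / "skills")` would link into a Claude-style folder; that path is an assumption, so substitute whatever folder your agent reads. On Windows, creating symlinks may need extra privileges, which is the portability concern raised above.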


The root cause should be fixed.

Why not hardlinks?

You can't hardlink a directory.

I had to read it twice as well, I was so confused hah. I’m still confused


They probably organize individual accounts the same as organization accounts for larger groups of users at the same company internally since it all rolls up to one billing. That's my first pass guess at least.


So you were generating and evaluating the performance of your CLAUDE.md files? And you got banned for it?


I think it's more likely that their account was disabled for other reasons, but they blamed the last thing they were doing before the account was closed.


And why wouldn't you? It's the only information available to you.


It reads like he had a circular prompt process running, where multiple instances of Claude were solving problems, feeding results to each other, and possibly updating each other's control files?


They were trying to optimize a CLAUDE.md file which belonged to a project template. The outer Claude instance iterated on the file. To test the result, the human in the loop instantiated a new project from the template, launched an inner Claude instance along with the new project, assessed whether inner Claude worked as expected with the CLAUDE.md in the freshly generated project. They then gave the feedback back to outer Claude.

So, no circular prompt feeding at all. Just a normal iterate-test-repeat loop that happened to involve two agents.


What would be bad in that?

Writing the best possible specs for these agents seems the most productive goal they could achieve.


I think the idea is fine, but what might end up happening is that one agent gets unhinged and "asks" another agent to do more and more crazy stuff, and they get in a loop where everything gets flagged. Remember that "bots configured to add a book at +0.01$ on amazon, reached 1M$ for the book" a while ago. Kinda like that, but with prompts.


I still don't get it. Make your models better at handling this far-fetched case; don't ban users for a legitimate use case.


Nothing necessarily or obviously bad about it, just trying to think through what went wrong.


Could anyone explain to me what the problem is with this? I thought I was fairly up to date on these things, but this was a surprise to me. I see the sibling comment getting downvoted but I promise I'm asking this in good faith, even if it might seem like a silly question (?) for some reason.


From what I'm reading in other comments, the problem was Claude1 got increasingly "frustrated" with Claude2's inability to do whatever the human was asking, and started breaking its own rules (using ALL CAPS).

Sort of like MS's old chatbot that turned into a Nazi overnight, but this time with one agent simply getting tired of the other agent's lack of progress (for some definition of progress - I'm still not entirely sure what the author was feeding into Claude1 alongside errors from Claude2).


How about running Claude as a different user with very limited permissions?


This breaks the non-interactive mode the post wants to achieve. Claude will not be able to install some things and will require user action, which is not desired here.


Like what? It can already use npm/pip/etc. And if it needs a new APT package or config in /etc/ then you would want to know because you need to document it.


Claude Code on NixOS feels like it has super powers. Being able to spin up a nix-shell with needed dependencies on demand gives it access to all sorts of tools I don't have or want installed on my base system. My "book-recommendation" claude code uses sqlite to manage my reading history and to-read and maybe-read lists but I never installed tools for sqlite and they aren't present on my NixOS desktop. It just launches a nix-shell with sqlite anytime it needs to read/modify the database. As long as the database file is within the directory claude code was launched from, it doesn't need to prompt for permission. With the caching that NixOS does, it's fast enough to not even think about.
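A rough sketch of what "launch a nix-shell with sqlite on demand" amounts to, assuming the standard `nix-shell -p PACKAGE --run COMMAND` form. The helper names, the database file name and the query in the usage note are hypothetical, not taken from the comment.

```python
import shlex
import subprocess


def nix_shell_argv(packages, command):
    """Build the argv that runs `command` inside an ephemeral nix-shell.

    `packages` are nixpkgs attribute names (e.g. "sqlite");
    `command` is the argv to run with those packages on PATH.
    """
    # --run takes a single shell string, so quote the inner argv safely
    return ["nix-shell", "-p", *packages, "--run", shlex.join(command)]


def run_in_nix_shell(packages, command):
    """Run the command and return its stdout (requires Nix to be installed)."""
    result = subprocess.run(
        nix_shell_argv(packages, command),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `run_in_nix_shell(["sqlite"], ["sqlite3", "books.db", "SELECT title FROM to_read;"])` would query a hypothetical `books.db` without sqlite ever being installed on the base system.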


If you make claude work with c/c++, it may need apt for libraries or build tools.

Even with npm/pip, these may not be available on a base linux box.

Even then, some complex projects may need other tools that are not part of a base system (command line tools, redis, ...).


I tried this approach for a while, but I really wanted it to be able to do anything (install system packages, build/run Docker containers, the works).

With these powers there's a lot less back-and-forth with me running commands, copying the output, pasting it to Claude, etc.

I'm sure you've had the case where you had to instruct someone to do something (e.g. playing tech support with family, helping another engineer, etc). While it helps the other person learn, it feels soooo slow vs just doing it yourself :) And since I don't have to teach the agent, I think this approach makes sense.


I run it with sudo enabled - true story

just give it its own machine and let it check out any code

I PXE boot it from a known image when I feel the need


Running it remotely on a VM seems like a very sensible option. Just don't give it permission to nuke the remote repository hah (e.g., don't allow force-push, use protected branches, only allow write access to branches it created)


Same solution here - keep a base diskless image on the server, copy it to the diskless area, pxeboot the machine. Works for Windows too (iscsi).

Could do the same thing on EC2 of course.


Is this developed by these 10x developers I've heard about?


EU automakers fail at making modern cars. They just put a bunch of screens in there with awful software. If you go all screens, just commit like Tesla. If you can't beat Tesla, just stick with minimal screens and use buttons.

Somewhere between 2010 and 2020, most automakers went crazy with their designs and it went all downhill from there.


I have a 2020 Fiat 500 Abarth, and it is absolutely perfect: There is a screen (I think 7") for Android Auto/CarPlay/radio/nav, and every single other function in the car has a physical button. It is also absolutely gorgeous - pinnacle of design, IMO


That's about what I want from interior - any builtin infotainment will get out of date, any more electronics is just stuff to eventually break


Our 2021 Volkswagen e-Up is like this. There is a tiny (like 3" tiny) screen for the radio, bluetooth and reverse camera, everything else is analogue and has physical buttons. It's honestly the best of the best Volkswagen design; what they did with their newer cars in terms of interior usability is a travesty.


They fell for Tesla-fication ... and are only now waking up to the mistake.


i still miss the interior of my 2010 fiat punto


From this year all EU cars will have physical buttons for heater controls, media etc.


Not sure why you would think EU automakers fail at making modern cars. Also, you're lumping 40+ automakers into one basket.


Poor family members though


I recently read a book of interviews with people who escaped from North Korea, and what shocked me was the discovery that the relatives of those who escaped are often executed (publicly) and that even children are executed in North Korea. We live in a terrible world. I mean... you expect a book from North Korea to contain terrible things, but somehow it was even worse than I expected.


Left wing thought doesn't contain any philosophy of limitations on state power, so under a left wing regime there is no limit to what it might do. No matter how terrible something is, if it can be imagined they will consider implementing it. To avoid that outcome there has to be an understanding of the flawed nature of government, and from that an ideological commitment to a state limited in power and role.


So everyone not a libertarian is dangerous?


Only if they get into power? I mean what's your reading of the last few centuries of history?


[flagged]


We're talking about North Korea executing children because their family members escaped. Doesn't get more left wing than North Korea. Not a good look to deploy a stock reply without thinking.


North Korea is an extremely class based society. About as far away from left as it gets.


I'm not sure you understand the situation here. Everything I don't like is left-wing.


[flagged]


> Everything I don't like is left-wing.


Your sarcasm isn’t obvious enough


A great start is to have LLMs use special UNIX users that can’t do anything except what you allowed them to do, including accessing the database with a read-only user.
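A small sketch of the read-only idea, using SQLite as a stand-in. SQLite has no user accounts, so this shows a read-only connection rather than a read-only database user; on a server database the equivalent would be a role granted only SELECT. The helper name `open_read_only` is hypothetical.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path):
    """Open an existing SQLite database so that writes are rejected."""
    # mode=ro in the URI makes SQLite refuse every write on this
    # connection, whatever SQL the agent decides to send
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

Any INSERT, UPDATE or DELETE attempted through this connection raises `sqlite3.OperationalError`, while SELECT statements work as usual.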

