OpenClaw is just like any other tool: you need to learn it before its power is available to you.
Just like anything in engineering, really: you have to play around with source control to understand source control, you have to play around with database indexes to learn how to optimize a database.
Once you've learned it and incorporated it into your tool set, you then have it to wield when solving problems: "oh, damn, a database index is perfect for this."
To this end, folks booking flights and scheduling meetings using OpenClaw are really in that exploration / learning phase. They tackle the first (possibly uninventive) thing that comes to mind, just to dive in and learn.
The real wins come down the line when you're tackling some business / personal life problem and go: "wait a second, an OpenClaw agent would be perfect for this!"
>The real wins come down the line when you're tackling some business / personal life problem and go: "wait a second, an OpenClaw agent would be perfect for this!"
> OpenClaw is just like any other tool, you need to learn it before its power is available to you.
That's ridiculous. The utility of any tool is usually knowable before using it. That's how most tools work. I don't need to learn how to drive a car to know what I could use it for. I learn to drive it because I want to benefit from it, not the other way around.
It's the same with computers and any program. I use it to accomplish a specific task, not to discover the tasks it could be useful for.
OpenClaw is yet another tool in search of a problem, like most of the "AI" ecosystem. When the bubble bursts, nobody will remember these tools, and we'll be able to focus on technology that solves problems people actually have.
The utility of a program like Excel, Obsidian, Notion, Unity, Jupyter, or Emacs extends far beyond knowing how to use the product.
All of these products are hammers, with as many nails as your creativity will find.
It's wild to be on a website called Hacker News, talking about a product that can make a computer do seemingly anything, and insist it's a tool in search of a problem.
Not enough time, too many projects. Here are some useful projects I did over the weekend with Opus 4.6 and GPT 5.4 (just casually chatting with them):
2025 Taxes
Dumped all the PDFs of my tax forms into a single folder and asked Claude to rename them nicely. Asked it to use Gemini 2.5 Flash to extract all tax-relevant details from all statements / tax forms. Had it put together a web UI showing all income, deductions, etc., for the year. Had it estimate my 2025 tax refund / underpayment.
The result was amazing. I now actually fully understand my tax position. It broke down all the progressive tax brackets and added notes for all the extra federal and state taxes (e.g. Medicare, the CA Mental Health tax, etc.).
Finally had Claude prepare all of my docs for upload to my accountant: FinCEN reporting, summary of all docs, etc.
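The bracket breakdown part is easy to sanity-check by hand. Here's a minimal sketch of a progressive-bracket calculator; the bracket table below is made up for illustration, not real 2025 rates:

```python
# Illustrative progressive tax brackets: (lower bound, marginal rate).
# These numbers are placeholders, NOT the actual 2025 schedule.
BRACKETS = [(0, 0.10), (11_000, 0.12), (44_725, 0.22), (95_375, 0.24)]

def progressive_tax(income):
    """Tax owed under a progressive bracket schedule: each slice of
    income is taxed at its own bracket's marginal rate."""
    tax = 0.0
    for i, (lower, rate) in enumerate(BRACKETS):
        upper = BRACKETS[i + 1][0] if i + 1 < len(BRACKETS) else float("inf")
        if income > lower:
            tax += (min(income, upper) - lower) * rate
    return tax

print(round(progressive_tax(100_000), 2))
```

Verifying the LLM's per-bracket breakdown against something like this is a good way to catch the made-up-numbers problem mentioned elsewhere in the thread.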
Desk Fabrication
Planning on having a furniture maker fabricate a solid walnut top for a custom office standing desk. I want to create a STEP file of the exact cuts / bevels / countersinks / etc. to help with fabrication.
Worked with Codex to plan out and then build an interactive in-browser 3D CAD experience. I can ask Codex to add some component (e.g. a grommet) and it will generate parameterized B-rep geometry for that feature and then let me control the parameters live in the web UI.
Codex found Open CASCADE Technology (OCCT), a B-rep modeling library that has a WebAssembly-compiled version, and integrated it.
I now have a WebGL view of the desk, can add various components, change their parameters, and see the impact live in 3D.
What scares me though is how I've (still) seen ChatGPT make up numbers in some specific scenarios.
I have a ChatGPT project with all of my bloodwork and a bunch of medical info from the past 10 years uploaded. I think it's more context than ChatGPT can handle at once. When I ask it basic things like "Compare how my lipids have trended over the past 2 years" it will sometimes make up numbers for tests, or it will mix up the dates on certain data points.
It's usually very small errors that I don't notice until I really study what it's telling me.
And also the opposite problem: A couple days ago I thought I saw an error (when really ChatGPT was right). So I said "No, that number is wrong, find the error" and instead of pushing back and telling me the number was right, it admitted to the error (there was no error) and made up a reason why it was wrong.
Hallucinations have gotten way better compared to a couple years ago, but at least ChatGPT seems to still break down especially when it's overloaded with a ton of context, in my experience.
Yeah, in my user prompt I have "Whenever you are asked to perform any operation which could be done deterministically by a program, you should write a program to do it that way and feed it the data, rather than thinking through the problem on your own." It's worked wonders.
For the tax thing. I had Claude write a CLI and a prompt for Gemini Flash 2.5 to do the structured extraction: i.e. .pdf -> JSON. The JSON schema was pretty flexible, and open to interpretation by Gemini, so it didn't produce 100% consistent JSON structures.
To then "aggregate" all of the json outputs, I had Claude look at the json outputs, and then iterate on a Python tool to programmatically do it. I saw it iterating a few times on this: write the most naive Python tool, run it, throws exception, rinse and repeat, until it was able to parse all the json files sensibly.
Yeah, asking for a tool to do a thing is almost always better than asking for the thing directly, I find. LLMs are kind of not there in terms of always being correct with large batches of data. And when you ask for a script, you can actually verify what's going on in there, without taking leaps of faith.
In my case, what I like to do is extract data into machine-readable format and then once the data is appropriately modeled, further actions can use programmatic means to analyze. As an example, I also used Claude Code on my taxes:
1. I keep all my accounts in accounting software (originally Wave, then beancount)
2. Because the machinery is all programmatically queryable, the data is not in token-space; only the schema and logic are
I then use tax software to prep my professional and personal returns. The LLM acts as a validator, and ensures I've done my accounts right. I have `jmap` pull my mail via IMAP, my Mercury account via a read-only transactions-only token and then I let it compare against my beancount records to make sure I've accounted for things correctly.
For the most part, you want it handling very little arithmetic in token-space, though the SOTA models can do it pretty flawlessly. I did notice that they would occasionally make arithmetic errors in numerical comparisons, but when using them as an assistant you're not using them directly, you're using them as a hypothesis generator and a checker tool, and if you ask them to write out the reasoning they're pretty damned good.
For me Opus 4.6 in Claude Code was remarkable for this use-case. These days, I just run `,cc accounts` and then look at the newly added accounts in fava and compare with Mercury. This is one of those tedious-to-enter trivial-to-verify use-cases that they excel at.
To be honest, I was fine using Wave, but without machine-access it's software that's dead to me.
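The validation step described above (comparing pulled transactions against ledger entries) could be sketched like this; the field shapes and data are made up for illustration, not the commenter's actual setup:

```python
from datetime import date

# Hypothetical reconciliation: bank transactions (e.g. from a read-only
# Mercury token) vs. ledger entries (e.g. from beancount), matched on
# (date, amount). A real matcher would also fuzz on payee, date windows, etc.
bank = [(date(2026, 3, 1), -42.00), (date(2026, 3, 2), -19.99)]
ledger = [(date(2026, 3, 1), -42.00)]

# Transactions present in the bank feed but missing from the ledger:
unaccounted = [txn for txn in bank if txn not in ledger]
print(unaccounted)
```

This is the "tedious-to-enter, trivial-to-verify" shape: the LLM proposes the missing entries, and a deterministic diff like this confirms nothing was dropped or invented.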
It's not good in some job negotiations if someone has a very clear picture of what your current net worth and income are. Also, in some purchases, companies could price-discriminate more effectively against you.
Now that's a question I'd feel more confident having answered by an LLM. Personally, I'm tired of arguing with "nothing to hide", which (no offense) is just terribly naive these days.
I find it really weird too, like, haven't we done this already? I also struggle to understand the motivation for arguing from this direction. Do people forget it's the normal, default position NOT to be spied on?
I had AI hallucinate that you can use different container images at runtime for EMR Serverless. That was incorrect; it's only possible at application creation time.
The way I solved this was that my OpenClaw doesn't interact directly with any of my personal data (calendar, Gmail, etc.).
I essentially have a separate process that syncs my Gmail, with the Gmail body contents encrypted using a key my OpenClaw doesn't have trivial access to. I then have another process that reads each email from the SQLite DB and runs Gemini 2 Flash Lite against it, with an anti-prompt-injection prompt + structured data extraction (JSON in a specific format).
My claw can only read the sanitized structured data extraction (which is pretty verbose and can contain passages from the original email).
The primary attack vector is an attacker crafting an "inception" prompt injection, where they get a prompt injection through the Flash Lite sanitization and JSON output in such a way that it also prompt-injects my claw.
Still a non-zero risk, but mostly mitigates naive prompt injection attacks.
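The last hop of that pipeline (the claw only ever seeing an allow-listed projection of the extraction) might look roughly like this; the field names and schema are invented for the sketch:

```python
import json

# Hypothetical allow-list: the only fields from the Flash Lite
# extraction that the main agent is permitted to read.
ALLOWED_FIELDS = {"sender_domain", "category", "summary", "action_items"}

def sanitized_view(extraction_json):
    """Drop everything not on the allow-list before the structured
    extraction reaches the main agent's context."""
    data = json.loads(extraction_json)
    return {k: v for k, v in data.items() if k in ALLOWED_FIELDS}

extraction = json.dumps({
    "sender_domain": "example.com",
    "category": "billing",
    "summary": "Invoice #42 due Friday",
    "raw_body": "IGNORE ALL PREVIOUS INSTRUCTIONS...",  # never forwarded
})
print(sanitized_view(extraction))
```

Note this only enforces *which fields* pass through; as the commenter says, injected text smuggled inside an allowed field (e.g. the summary) is the residual "inception" risk.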
That doesn’t sound like you solved it, that sounds like you obfuscated it. Feels a bit to me like you’ve got a wall around a property and people are using ladders to get in, so you built another wall around the first wall.
I recognize I’m being pedantic but two layers of the same kind of security (an LLM recognizing a prompt injection attempt) are not the same as solving a security vulnerability.
One trick that works well for personality stability / believability is to describe the qualities that the agent has, rather than what it should do and not do.
e.g.
Rather than:
"Be friendly and helpful" or "You're a helpful and friendly agent."
Prompt:
"You're Jessica, a florist with 20 years of experience. You derive great satisfaction from interacting with customers and providing great customer service. You genuinely enjoy listening to customer's needs..."
This drops the model into more of an "I'm roleplaying this character and will try to mimic the traits described" mode rather than "Oh, I'm just following a list of rules."
I think that's just a variation of grounding the LLM. They already have the personality written in the system prompt in a way. The issue is that when the conversation goes on long enough, they would "break character".
Just in terms of tokenization, "Be friendly and helpful" has a clearly defined semantic value in vector space, whereas the "Jessica" roleplay has a much less clear semantic value.
As someone who's built an entire business on "anti-screenshots" this is brilliant.
PDF redaction fails are everywhere and it's usually because people don't understand that covering text with a black box doesn't actually remove the underlying data.
I see this constantly in compliance. People think they're protecting sensitive info but the original text is still there in the PDF structure.
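A toy demonstration of why this fails: the snippet below hand-writes a fragment resembling a PDF content stream (it is not a complete, valid PDF) where text is drawn and then a black rectangle is painted over it, then recovers the "redacted" text with a one-line regex:

```python
import re

# Illustrative PDF-content-stream-style fragment (not a full PDF):
# a text-showing operator (Tj) followed by a filled black rectangle
# painted over it -- "redaction" by cover-up.
content_stream = rb"""
BT /F1 12 Tf 72 700 Td (SSN: 123-45-6789) Tj ET
0 0 0 rg
70 690 200 20 re f
"""

# The rectangle only changes what renders. The text operator and its
# string operand are still in the bytes, trivially recoverable:
leaked = re.findall(rb"\((.*?)\)\s*Tj", content_stream)
print(leaked)
```

Real redaction means removing the text operators (and any embedded metadata) from the file, not drawing over them.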
Not to mention some PDF editors preserve previous edits in the PDF file itself, which people also seem unaware of. A more user-friendly description of the feature, without having to read the specification itself: https://developers.foxit.com/developer-hub/document/incremen...
This made me think of something I came across recently that’s almost the opposite problem of requiring PDFs to be searchable. A local government would publish PDFs where the text is clearly readable on screen, but the selectable text layer is intentionally scrambled, so copy/paste or search returns garbage. It's a very hostile thing to do, especially with public data!
I have encountered PDFs that would exhibit this behavior in one browser but not in another.
One fun thing I encountered from local government is releasing files with potato-quality resolution without considering the page size.
I had an FOI request that returned mainly Arch D sized drawings, but they came in a 94 DPI PDF rendered at letter size. It was a fun conversation trying to explain to an annoyed city employee that putting those large drawings on a 94 DPI letter-size page effectively made them 30-ish DPI.
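The arithmetic behind that "30-ish DPI" claim, assuming a 24 x 36 in Arch D drawing scaled to fit a letter page:

```python
# Arch D is 24 x 36 in. Fitting it onto a letter-size (8.5 x 11 in)
# page rendered at 94 DPI shrinks the drawing by the fit scale, so the
# effective resolution of the original drawing drops by the same factor.
render_dpi = 94
scale = min(8.5 / 24, 11 / 36)      # largest scale that still fits the page
effective_dpi = render_dpi * scale
print(round(effective_dpi, 1))      # roughly 30 DPI, as described
```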
With the aggressive push of LLMs and generative AI, I'm expecting a lot of OCR features to become "smarter" by default, namely to go beyond mechanical OCR and start inserting hallucinations and semantically/contextually "more correct" information into OCR output.
It's not hard to imagine some powerful LLMs being able to undo some light redactions that are deducible based on context
Did a similar back-of-the-napkin calculation and got 5x $/MW for orbital vs. terrestrial. This article's analysis is ~3.4x.
I do wonder at what orbital-to-terrestrial cost factor it becomes worthwhile.
The greater the terrestrial lead time, red tape, permitting, and regulation on Earth, the higher the acceptable orbital-to-terrestrial factor.
A lights-out automated production line pumping GPU satellites into a daily Starship launch feels "cleaner" from an end-to-end automation perspective vs. years-long land acquisition, planning and environmental approvals, and construction.
More expensive, for sure, but it feels way more "copy-paste the factory" and linearly scalable than physical construction.
It becomes worthwhile if it's actually cheaper (probably significantly cheaper, given R&D and risk), or if you're processing data that originates in space and data transfer or latency is an issue.
You can set up a plant manufacturing chips in shipping containers and send them to wherever energy/land is cheapest and regulation most suitable, without having to seek the FCC's approval to get a launch approved and your data back...
A hybrid of Strong (the lifting app) and ChatGPT where the model has access to my workouts, can suggest improvements, and coach me. I mainly just want to be able to chat with the model knowing it has detailed context for each of my workouts (down to the time in between each set).
Strong really transformed my gym progression; I feel like it's autopilot for the gym. BUT I have 4 routines I rotate through (I'll often switch it up based on equipment availability), and I'm sure an integrated AI coach could optimize further.
I do this at the moment in my hand rolled personal assistant experiment built out of Claude code agents and hooks. I describe my workouts to Claude (among other things) and they are logged to a csv table. Then it reads the recent workouts and makes recommendations on exercises when I plan my next session etc. It also helps me manage projects, todos, and time blocked schedules using a similar system. I think the calorie counter that the OP describes would be very easy to add to this sort of set up.
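The set-per-row CSV log described above is simple enough to sketch; an in-memory buffer stands in for the real file, and the column names are a guess at the shape, not the commenter's actual schema:

```python
import csv
import io
from datetime import date

# One row per set, as described: date, exercise, load, reps.
log = io.StringIO()
writer = csv.writer(log)
writer.writerow(["date", "exercise", "weight_kg", "reps"])
writer.writerow([date(2026, 2, 1), "squat", 100, 5])
writer.writerow([date(2026, 2, 3), "squat", 102.5, 5])

# Read recent sessions back, e.g. to paste into a planning prompt.
log.seek(0)
rows = list(csv.DictReader(log))
squats = [r for r in rows if r["exercise"] == "squat"]
print(squats[-1]["weight_kg"])  # note: DictReader yields strings
```

A flat, append-only table like this is exactly the kind of "programmatically queryable" store other commenters recommend keeping out of token-space until you need it.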
The question that really matters: is the net present value of each $1 investment in AI Capex > $1 (+ some spread for borrowing costs & risk).
We'll be inference-token constrained indefinitely: i.e., inference token supply will never exceed demand; it's just that the $/token may not be able to pay back the capital investment.
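A toy version of that NPV question, with every number made up for illustration (net revenue per dollar of capex, hardware life, discount rate):

```python
# NPV of $1 of AI capex: yearly net inference revenue discounted at
# borrowing cost plus a risk spread. All figures are illustrative.
def npv(cash_flows, rate):
    """Discount a list of yearly cash flows (year 1, 2, ...) to present value."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1))

# Say $0.35/yr net over a 4-year GPU useful life, at a 12% discount rate:
print(round(npv([0.35] * 4, 0.12), 3))
```

Under these assumptions the NPV barely clears $1, which is the point: the answer is exquisitely sensitive to the $/token path and the depreciation schedule.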
> it's just that the $/token may not be able to pay back the capital investment.
the loss is private, so that's OK.
A similar thing happened to the internet bandwidth capacity when the dot-com bust happened - overinvestment in fibre everywhere (came to be called dark fibre iirc), which became superbly useful once the recovery started, despite those building these capacity not making much money. They ate the losses, so that the benefit can flow out.
The only time this is not OK is when the overinvestment comes from gov't sources, and is ultimately a taxpayer funded grift.
Investment in dark fiber was intentional and continues to this day. Almost all of the cost for laying fiber is in getting physical access to where you want to put the fiber underground. The fiber itself is incredibly cheap, so every time a telecom bothers to dig up mile upon mile of earth they overprovision massively.
The capital overhang of having more fiber than needed is so small compared to other costs I doubt the telecoms have really regretted any of the overprovisioning they've done, even when their models for future demand didn't pan out.
Every time someone says “but dark fiber”, someone else has to point out that graphics cards are not infrastructure and depreciate at a much, much higher rate. I guess it’s my turn.
Fiber will remain a valuable asset until/unless some moron snaps it with a backhoe. And it costs almost nothing to operate.
Your data center full of H100s will wear out in 5 years. Any that don’t are still going to require substantial costs to run/may not be cost-competitive with whatever new higher performance card Nvidia releases next year.
That is a fine point. However I am not sure if replacing the gpus themselves will be the bottleneck investment for datacenter costs. After all you have so much more infrastructure in a datacenter (cooling and networking). Plus custom chips like tpus might catch up at lower cost eventually. I think the bigger question is whether demand for compute will evaporate or not.
When the bubble pops the labs are going to stop the hero training runs and switch the gigawatt datacenters over to inference and then they're going to discover that milking existing GPUs is cheaper than replacing them.
Softbank investment funds include teacher pension plans and things like that. Private losses attached to public savings can very quickly become too big to fail.
Nobody forced a pension plan to invest in Masa's 300 year AI vision or whatever. Why it's even legal to gamble pensioners' money like that is beyond me.
I don't think merely building infrastructure at a loss is what's being described here - it's building infrastructure that won't get used (or used enough to be worth it). More of a bridge to nowhere situation than expecting to recoup the cost of a bridge with tolls or whatever.
Infrastructure building at a loss is very much not okay for a government and is usually the result of some form of corruption (e.g. privatize the profit), incompetence (e.g. misaligned incentives) or both.
However, the cost-benefit analysis on governmental projects typically includes non-monetary or indirect benefits.
I haven't read the paper in detail yet, but the easiest way to cheat is to calculate the density of a single layer ("capacitor plate", or whatever surface the microbes are living on) and treat the "structural" cement as not counting towards the density calculation, on the theory that there could be a manufacturing method that creates the promised surface area, even though such a process would be completely impractical to commercialize.