Author is top notch in my book. I'm a sucker for someone taking a complex problem and distilling out a simple solution. I don't know of higher praise to give a developer.
I wondered if there might be a no-brainer "free" option on discarded hardware.
I have a GTX 1080 Ti, which I think is circa 2018. It's unused, has more than paid for itself over the years and owes me nothing at this point, so the hardware is free.
It runs Gemma e4b multimodal, Qwen 3.5 8b or the Qwen 4b embeddings models well enough (40+ t/s for the LLMs).
The machine consumes 350 W at the wall under load (3 W when sleeping, 80 W at idle). Electricity costs me £0.035/kWh, which is cheap for the UK (load shifting via a house battery).
That works out to 144k output tokens for around 1p (and, in theory, it takes an hour to produce them).
It's only JUST cheaper to use than the far more capable DeepSeek v4 flash model, despite the free hardware and electricity that's ~10x cheaper than normal.
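For anyone who wants to rerun the arithmetic with their own numbers, here is a minimal sketch. The three constants are the figures quoted above (using the low end of "40+ t/s"); the function names are made up for illustration, and it assumes the card runs flat out at the full load wattage for the whole hour.

```python
# Figures quoted in the comment above; nothing here is independently measured.
LOAD_WATTS = 350            # wall draw under load
PRICE_GBP_PER_KWH = 0.035   # load-shifted tariff
TOKENS_PER_SECOND = 40      # low end of "40+ t/s"


def tokens_per_hour(tokens_per_second: float) -> float:
    """Output tokens generated in one hour of continuous generation."""
    return tokens_per_second * 3600


def pence_per_hour(watts: float, price_gbp_per_kwh: float) -> float:
    """Electricity cost in pence for one hour at a constant wall draw."""
    kwh_used = watts / 1000          # kWh consumed in one hour
    return kwh_used * price_gbp_per_kwh * 100


def pence_per_million_tokens(watts: float, price_gbp_per_kwh: float,
                             tokens_per_second: float) -> float:
    """Electricity cost in pence per million output tokens."""
    hourly_cost = pence_per_hour(watts, price_gbp_per_kwh)
    return hourly_cost / tokens_per_hour(tokens_per_second) * 1_000_000


if __name__ == "__main__":
    print(f"{tokens_per_hour(TOKENS_PER_SECOND):,.0f} tokens/hour")
    print(f"{pence_per_hour(LOAD_WATTS, PRICE_GBP_PER_KWH):.3f} p/hour")
    print(f"{pence_per_million_tokens(LOAD_WATTS, PRICE_GBP_PER_KWH, TOKENS_PER_SECOND):.2f} p per 1M tokens")
```

Swapping in a standard tariff for `PRICE_GBP_PER_KWH` shows how quickly the margin over a hosted model disappears.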
By quirk of fate I've spent the past 2 days prototyping some stuff on pdf.js, just trying to figure out a game plan for handling bounding boxes in the face of page zooming, different resolutions etc. I can't see it mentioned whether the components virtualise pages (as in reusing DOM elements as document pages scroll by). I guess I just learned what I'll be exploring tomorrow then...
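One possible game plan for the bounding boxes, sketched under assumptions: store every box once in PDF points (origin bottom-left, y up) and only convert to CSS pixels at render time, so a zoom change never touches the stored data. This is plain arithmetic, not the pdf.js API; `Box`, `pdf_box_to_css` and `css_box_to_pdf` are hypothetical names, `scale` here just means CSS pixels per PDF point, and page rotation and crop-box offsets are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x: float
    y: float
    width: float
    height: float


def pdf_box_to_css(box: Box, page_height_pt: float, scale: float) -> Box:
    """Map a box in PDF points (origin bottom-left, y up) to CSS pixels
    (origin top-left, y down) at the given zoom scale."""
    top_pt = page_height_pt - (box.y + box.height)  # flip the y axis
    return Box(box.x * scale, top_pt * scale,
               box.width * scale, box.height * scale)


def css_box_to_pdf(box: Box, page_height_pt: float, scale: float) -> Box:
    """Inverse mapping, e.g. for a box the user drew on screen."""
    width = box.width / scale
    height = box.height / scale
    x = box.x / scale
    y = page_height_pt - box.y / scale - height  # flip back to y-up
    return Box(x, y, width, height)


if __name__ == "__main__":
    stored = Box(x=72, y=700, width=100, height=20)   # PDF points
    on_screen = pdf_box_to_css(stored, page_height_pt=792, scale=1.5)
    print(on_screen)
    print(css_box_to_pdf(on_screen, page_height_pt=792, scale=1.5))
```

Different screen resolutions then reduce to one extra multiplier (the device pixel ratio) applied only when drawing to a canvas.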
I bought the refreshed M5 version with the new headstrap. I read so many complaints about weight and it was just never an issue for me personally. Maybe the new strap is that much better?
That said, the battery cable was super annoying; I'd accidentally catch it multiple times per day. The battery is good for less than 2 hours, so I used it plugged into the wall.
For zoom calls, the persona thing is hilariously bad, unusable in a business context. Interesting for a few minutes as a tech demo though.
The virtual layout is good: a big Citrix app screen (it's the iPad app) for remote desktop, with Zoom, Safari etc. off to the sides, and then things like a calendar widget pinned to a physical wall. But text clarity / quality is just slightly not good enough for software development. Almost, it's close. If you don't mind large fonts it's good enough.
Ultimately I returned it, but it was a close-run thing; I almost kept it.
I do still hanker for something like this. I'm tempted to try Xreal or other glasses, but it seems like the PPD is even lower.
It's a shame, because I think this has the best visual fidelity of all the devices.
I managed several days back to back. It's very like 1440p on a 27", and millions of people use that productively every day, but when I'm spending that kind of money I don't want £200 monitor quality.
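For a rough sense of what "1440p on a 27"" means in PPD terms, here is a small sketch. It assumes "1440p" means 2560x1440 and that the monitor sits about 24 inches from the eye; both are my assumptions rather than figures from the comment, and no headset numbers are computed here.

```python
import math


def pixels_per_inch(width_px: int, height_px: int, diagonal_in: float) -> float:
    """Pixel density of a flat panel from its resolution and diagonal size."""
    return math.hypot(width_px, height_px) / diagonal_in


def pixels_per_degree(ppi: float, viewing_distance_in: float) -> float:
    """Pixels covered by one degree of visual angle at the centre of the screen."""
    inches_per_degree = 2 * viewing_distance_in * math.tan(math.radians(0.5))
    return ppi * inches_per_degree


if __name__ == "__main__":
    ppi = pixels_per_inch(2560, 1440, 27)            # assumed 2560x1440 panel
    ppd = pixels_per_degree(ppi, viewing_distance_in=24)  # assumed distance
    print(f"{ppi:.1f} PPI, {ppd:.1f} PPD at 24 in")
```

The result moves a lot with viewing distance, so the comparison with a headset is only as good as the distance you plug in.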
Careful using body-worn devices when plugged in. Medical power supplies have special requirements to avoid electrocution, because they are often powering equipment in contact with a person's body. Consumer power supplies probably don't, on the assumption that the device will not be charging whilst being worn.
People have died from using headphones plugged into USB chargers.
At least here in the EU they do, and I think it's the case in most countries worldwide?
Looks like that 2014 case might have been a sub-standard charger that didn't conform to required safety standards?
I actually experienced a (cheap) charger exploding under my desk back around 2015.
There's a YouTuber called BigClive who delights in tearing down the bad chargers. Ken Shirriff's blog is the best resource I know of for this topic though.
Anyway, I feel pretty safe with the Vision Pro plugged in with an Apple charger. The battery does get warm though...
In that case it was a cheap USB supply, though there are other reports of similar incidents. "Good" consumer power supplies are designed to IEC standards, but to keep the cost down they are different standards to those used for wearable medical equipment. Medical equipment has to conform to IEC 60601, which governs things like electrical isolation and safe failure modes.
> I feel pretty safe
So be it. It is unlikely, but even a good power supply can fail in the face of a voltage surge on the mains. Absolutely don't wear it plugged in if there is any hint of a thunderstorm!
Zoom itself works absolutely fine; it's just the iPad app you get on Vision Pro. My complaint is what happens when you turn your camera on: meeting participants see an uncanny-valley representation of you, your "Persona", which you scan in when you get the device.
I clearly identify with the problem the author raises, which is: the bottleneck is understanding.
I don't go along with their mitigations though.
In programming we have one tool for this: abstraction. Decomposition, pattern recognition, even data structures and algorithms are all downstream of abstraction. Collectively, we've never truly mastered abstraction, but it's what we have, and we wield it well enough that it's usually somewhat effective.
The "right" abstraction seems like quite an art. Sometimes it's not obvious, or it takes multiple rounds of exploration and testing (I'm thinking here of the mental shift moving from HTML + JS, via jQuery, Backbone and Knockout, up to React/Vue or Angular). At each point, we thought we had reasonable abstractions for a while. Vue and Svelte, or Next.js, are now so far from the mental model of early 00s "DHTML".
And I'm not sure how this relates to TFA's point. Are you saying we collectively need to get better at abstraction so that LLMs get better at abstraction (either by training, or our prompting), so that their code is easier to read?
>> I’m losing control over the code I write when I work with agentic code generation
> Are you saying we collectively need to get better at abstraction so that LLMs get better at abstraction (either by training, or our prompting), so that their code is easier to read?
No - our current abstraction for coding agents is a loop where we express some freeform specification of a goal, then a sub loop kicks off where an llm takes a stab at what good looks like for the next step (make an edit, search for info, run a command to cause some side effect etc etc), it iterates in this loop and when it's finished its sub loop, it declares end of turn and the loop returns to the user for steering input.
That inner agent loop can make it quite hard to stay in control.
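The two loops described above can be sketched roughly like this. It is a toy, not any particular agent's implementation: `Action`, `inner_loop`, `outer_loop` and `scripted_model` are hypothetical names, and the "model" is a fixed script so the sketch runs offline.

```python
from typing import Callable

# Hypothetical action shape: (kind, detail), where kind is one of
# "edit", "search", "run" or "end_turn".
Action = tuple[str, str]
Model = Callable[[str, list[Action]], Action]


def inner_loop(goal: str, next_action: Model, max_steps: int = 20) -> list[Action]:
    """One agent turn: keep asking the model for its next step until it
    declares end of turn (or the step budget runs out)."""
    history: list[Action] = []
    for _ in range(max_steps):
        action = next_action(goal, history)
        history.append(action)
        if action[0] == "end_turn":
            break
    return history


def outer_loop(prompts: list[str], next_action: Model) -> list[list[Action]]:
    """The user's loop: each freeform prompt kicks off one inner loop, and
    control only comes back to the user between turns."""
    return [inner_loop(prompt, next_action) for prompt in prompts]


def scripted_model(goal: str, history: list[Action]) -> Action:
    """Stand-in for an LLM call: replays a fixed script."""
    script = [("search", "find usages"), ("edit", "apply change"),
              ("run", "tests"), ("end_turn", "done")]
    return script[min(len(history), len(script) - 1)]


if __name__ == "__main__":
    for turn in outer_loop(["rename foo to bar"], scripted_model):
        print(turn)
```

Everything inside `inner_loop` happens without user input, which is where the loss of control comes from: the only steering point is between turns.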
What if instead of only these low level free form prompts we additionally had some higher level primitives to work with?
Going from English to program is the same kind of abstraction as going from assembly to program, to C to program, to high-level scripting to program, to "4G languages" and so on, each hiding all kinds of details behind a much terser layer. It's just that this one is a qualitative jump.
Here's the answer ChatGPT gave when asked to "define abstraction in programming, like when going from assembly to C to scripting, etc":
"Abstraction in programming is the process of building layers where each layer hides the details of the layer below".
In math/computers abstraction has a technical meaning that requires deterministic behavior for the abstraction to work. It isn't a proper abstraction if it doesn't always do the same thing underneath.
Abstraction as a layering idea, without regard to how it works, is like the pop-psych version: it is "right" but misses nuance.
I’ve had both - the 380 is much lighter to carry around than the 480.
I wouldn't recommend either though. On both, the keys are not nice to type on if you don't press perfectly downward: at any angle other than vertical, the keys occasionally bind a little. This is amplified on the 480 with its longer key travel. The two use different types of key mechanism but suffer from the same problem.
If you have any kind of case, the 480 stand slot can be harder to use.
I prefer the "the bottleneck is understanding" framing.
The author is ultimately nibbling at the same problem, but I don't think "hey, one strategy is we could just let cognitive debt accumulate so we can go faster!" is a particularly insightful tool in the toolbox. Don't misread me, I'm not denying it can be a valid strategy.
Instead I want to read about insightful strategies for optimising that system-wide bottleneck we have: understanding.
Tell me about how you managed to shift to a higher level of abstraction, tell me about how and when that abstraction leaks. Tell me how you reduced the amount of information that has to flow through the system bottleneck.
It’s making guesses, not decisions; framing them as decisions will lead you astray and waste time and tokens.
It’s vaguely productive to tell them a ton of relevant info upfront, attempting to minimise their need for load-bearing guesses. I say vaguely because obedience is generally only around the level where it's good enough to lull you into a false sense of security, not good enough to actually be obedient.
It’s a bit more productive to use the various loop mechanisms (hooks, /goal etc.) to evaluate each end of turn against guard rails and reject with clear instruction on what's unacceptable. Obviously, if you only do this without front-loading the info, you’re likely to spend more tokens to reach a satisfactory end of iteration.
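As a rough illustration of that end-of-turn check, here is a minimal sketch. It does not use any real tool's hook mechanism: `run_with_guard_rails`, `fake_agent` and `no_todo_markers` are hypothetical names, and the "agent" is a stub that only fixes its output once it has been told what was rejected.

```python
from typing import Callable, Optional

# A guard rail returns a rejection message, or None if the output is acceptable.
Check = Callable[[str], Optional[str]]


def run_with_guard_rails(run_turn: Callable[[str], str], instruction: str,
                         checks: list[Check], max_turns: int = 5) -> tuple[str, int]:
    """Run agent turns until every guard rail passes, feeding the rejection
    reasons back in as explicit instructions each time one fails."""
    prompt = instruction
    for turn in range(1, max_turns + 1):
        output = run_turn(prompt)
        rejections = [msg for check in checks if (msg := check(output)) is not None]
        if not rejections:
            return output, turn
        feedback = "\n".join(f"- {msg}" for msg in rejections)
        prompt = f"{instruction}\n\nRejected because:\n{feedback}"
    raise RuntimeError(f"guard rails still failing after {max_turns} turns")


def no_todo_markers(output: str) -> Optional[str]:
    """Example guard rail: reject output that leaves TODO placeholders behind."""
    return "remove TODO placeholders" if "TODO" in output else None


def fake_agent(prompt: str) -> str:
    """Stub agent: leaves a TODO unless told it was rejected."""
    if "Rejected because" in prompt:
        return "def parse(): return []"
    return "def parse(): pass  # TODO"


if __name__ == "__main__":
    output, turns = run_with_guard_rails(fake_agent, "implement parser", [no_todo_markers])
    print(turns, output)
```

The stub needs a second turn precisely because nothing about TODOs was said upfront, which is the extra token cost of skipping the front-loaded info.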