Hacker News | tabwidth's comments

The intention part is right but the bottleneck is review. AI is really good at turning your clean semantic functions into pragmatic ones without you noticing. You ask for a feature, it slips a side effect into something that was pure, tests still pass. By the time you catch it you've got three more PRs built on top.

In my experience trying to push the onus of filtering out slop onto reviewers is both ineffective and unfair to the reviewer. When you submit code for review you are saying "I believe to the best of my ability that this code is high quality and adequate, but it's best to have another person verify that." If the AI has done things without you noticing, you haven't reviewed its output well enough and shouldn't yet be submitting it to another person.

Code review should be about transmitting ideas and helping spot errors that slip in due to excessive familiarity with the changes (errors which are often glaring to anyone other than the author).

If you're not familiar with the patch enough to answer any question about it, you shouldn't submit it for review.


Yeah the raw parse speed comparison is almost a red herring at this point. The real cost with JSON is when you have a 200MB manifest or build artifact and you need exactly two fields out of it. You're still loading the whole thing into memory, building the full object graph, and GC gets to clean all of it up after. That's the part where something like RX with selective access actually matters. Parse speed benchmarks don't capture that at all.

> The real cost with JSON is when you have a 200MB manifest or build artifact and you need exactly two fields out of it.

There are SAX-like JSON libraries out there, and several of them work with a preallocated buffer or similar streaming interface, so you could stream the file and pick out the two fields as they come along.
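For illustration, a minimal early-exit sketch in Python (not a real SAX-style library: it matches keys textually and ignores nesting and escaped quotes), showing how a streaming reader can stop as soon as the wanted fields are seen instead of building the whole object graph:

```python
import io
import json
import re

def extract_fields(stream, wanted, chunk_size=64 * 1024):
    """Scan a JSON text stream for scalar fields and stop early.

    A crude stand-in for a real streaming parser: it looks for
    '"key": <scalar>' textually, so it ignores nesting and escaped
    quotes. Good enough to show the early-exit idea, not for production.
    """
    wanted = set(wanted)
    found = {}
    buf = ""
    while wanted:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        for key in list(wanted):
            m = re.search(
                r'"%s"\s*:\s*("(?:[^"\\]|\\.)*"|-?\d+(?:\.\d+)?|true|false|null)'
                % re.escape(key), buf)
            if m:
                found[key] = json.loads(m.group(1))
                wanted.discard(key)
        buf = buf[-4096:]   # keep a tail in case a match spans chunks
    return found

# Usage: stops reading once both fields are seen, even though the
# trailing "files" array dominates the document.
doc = io.StringIO('{"name": "demo", "version": "1.2.3", "files": [%s]}'
                  % ",".join('"f%d"' % i for i in range(100000)))
print(extract_fields(doc, ["name", "version"]))
```

A real library would track parser state properly; the point is only that the loop can bail out long before the end of a 200MB file.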


You still have to parse half the entire file on average. Much slower than formats that support skipping to the relevant information directly.

yep, this is exactly the kind of use case that caused me to design this format.

as parser: keep only indexes into the original file (input), don't copy strings or parse numbers at all (unless they fit in the index width, e.g. 32-bit)
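A toy Python sketch of that idea (names made up, no validation or navigation index): the tokenizer records only (kind, start, end) spans into the original buffer, and a value is copied and decoded only when someone actually asks for it.

```python
import json

def tokenize_spans(buf):
    """Yield (kind, start, end) spans over the original JSON text.

    Nothing is copied or decoded here: strings and numbers are only
    located, not materialized. A sketch of the index-only idea; the
    string scan handles simple escapes only.
    """
    i, n = 0, len(buf)
    while i < n:
        c = buf[i]
        if c in ' \t\r\n,:':
            i += 1
        elif c in '{}[]':
            yield ('punct', i, i + 1)
            i += 1
        elif c == '"':
            j = i + 1
            while buf[j] != '"' or buf[j - 1] == '\\':
                j += 1
            yield ('string', i, j + 1)
            i = j + 1
        else:  # number / true / false / null
            j = i
            while j < n and buf[j] not in ' \t\r\n,:{}[]':
                j += 1
            yield ('atom', i, j)
            i = j

def materialize(buf, span):
    # Decode a single token only when it is actually needed.
    _, start, end = span
    return json.loads(buf[start:end])

doc = '{"name": "demo", "count": 42}'
spans = list(tokenize_spans(doc))
strings = [s for s in spans if s[0] == 'string']
print(materialize(doc, strings[1]))   # prints: demo
```

Navigation (finding a key without walking everything) would indeed need an extra index or hash table on top of the spans, as the reply below notes.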

That would make parsing faster, and there would be very little in terms of a tree (JSON can't really contain full-blown graphs), but it's rather complicated, and it would require hashing to allow navigation.


yep. I built custom JSON parsers as a first solution. The problem is you can't get away from scanning at least half the document bytes on average.

With RX and other truly random-access formats you could even optimize to the point of not fetching the whole document at all. You could grab chunks from a remote server using HTTP range requests and cache them locally in fixed-width blocks.

With JSON you must start at the front and read byte-by-byte till you find all the data you're looking for. Smart parsers can help a lot to reduce heap allocations, but you can't skip the state machine scan.
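A rough sketch of the block-cache idea above, with the range request abstracted behind a `fetch(offset, length)` callable so it runs without a network (a real implementation would issue a GET with a `Range: bytes=...` header, per RFC 7233):

```python
class BlockCache:
    """Fixed-width block cache over a remote byte source.

    `fetch(offset, length)` stands in for an HTTP range request
    (e.g. a GET with 'Range: bytes=offset-(offset+length-1)');
    here it is injected so the sketch runs without a network.
    """
    def __init__(self, fetch, block_size=4096):
        self.fetch = fetch
        self.block_size = block_size
        self.blocks = {}    # block index -> bytes

    def read(self, offset, length):
        bs = self.block_size
        first, last = offset // bs, (offset + length - 1) // bs
        parts = []
        for b in range(first, last + 1):
            if b not in self.blocks:        # fetch each block at most once
                self.blocks[b] = self.fetch(b * bs, bs)
            parts.append(self.blocks[b])
        data = b"".join(parts)
        start = offset - first * bs
        return data[start:start + length]

# Usage with an in-memory "server"; counts how many range requests go out.
remote = bytes(range(256)) * 64            # 16 KiB pretend document
calls = []
def fake_fetch(offset, length):
    calls.append((offset, length))
    return remote[offset:offset + length]

cache = BlockCache(fake_fetch, block_size=1024)
cache.read(5000, 10)      # touches block 4, one request
cache.read(5002, 4)       # served from cache, no new request
print(len(calls))         # prints: 1
```

With a byte-stream format like JSON this doesn't help, because you can't know which blocks matter without scanning from the front.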


The part that gets me is when it passes lint, passes tests, and the logic is technically correct, but it quietly changed how something gets called. Rename a parameter. Wrap a return value in a Promise that wasn't there before. Add some intermediate type nobody asked for. None of that shows up as a failure anywhere. You only notice three days later when some other piece of code that depended on the old shape breaks in a way that has nothing to do with the original change.
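A toy Python version of that failure mode (all names hypothetical): the refactor keeps the data intact and the updated test green, but the return shape has quietly changed, and only a distant caller notices.

```python
# Before: callers everywhere rely on getting a plain dict back.
def load_config_v1(path):
    return {"path": path, "debug": False}

# After an AI refactor: same data, but quietly wrapped in a new type.
class Config:
    def __init__(self, path, debug):
        self.path, self.debug = path, debug

def load_config_v2(path):
    return Config(path, False)

# The test was updated alongside the change, so it passes: green build.
assert load_config_v2("app.toml").debug is False

# Days later, in unrelated code that still expects the old shape:
cfg = load_config_v2("app.toml")
try:
    cfg["debug"]               # worked with v1, TypeError with v2
except TypeError as e:
    print("downstream break:", e)
```

Nothing in lint, types, or the patched tests flags this; the breakage lives entirely in code the diff never touched.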

The worst version of this I've seen is when every layer is like four lines long. You step into a function expecting some logic and it's just calling another function with slightly different args. Do that six times and you forgot what the original call was even trying to do. Naming helps in theory but in practice half those intermediate functions end up with names like processInner or handleCore because there's nothing meaningful to call them.


Any pattern executed robotically like this becomes a self-parody.


One heuristic I use to avoid this exact problem is "minimize the number of places that the next poor soul has to look in order to understand how this code works", where place is loosely defined as about the number of lines of code that fit on a screen or two.

This has given really good results when deciding whether to extract a helper function: its name and arguments have to be memorable enough that the calling code can be understood without always diving in, and it has to provide a meaningful compression of the logic above it, so the whole thing can be comprehended without jumping across many hundreds of lines.


It'd be great if IDEs could store a stack of functions currently being explored, similar to what you get when debugging. Not breadcrumbs, but a plain stack. Bonus points if you could store multiple stacks and give them names according to the context.
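A back-of-the-envelope sketch of what such a feature might track (everything here is hypothetical; a "location" is reduced to a (file, function) pair):

```python
class ExplorationStacks:
    """Named stacks of code locations: a manual call stack for reading.

    A sketch of the IDE feature described above, not any real IDE's API.
    """
    def __init__(self):
        self.stacks = {}
        self.current = None

    def new_stack(self, name):
        self.stacks[name] = []
        self.current = name

    def push(self, location):
        self.stacks[self.current].append(location)

    def pop(self):
        return self.stacks[self.current].pop()

    def switch(self, name):
        self.current = name

# Usage: two independent reading contexts, switchable by name.
nav = ExplorationStacks()
nav.new_stack("auth bug")
nav.push(("auth.py", "login"))
nav.push(("session.py", "create_session"))
nav.new_stack("perf hunt")
nav.push(("db.py", "run_query"))
nav.switch("auth bug")
print(nav.pop())   # prints: ('session.py', 'create_session')
```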


Ah. I once worked in a team with a hard cyclomatic complexity cap of 4 per function. Logic exceeding the cap needed to be broken into helper functions. Many, many functions were created to hold exactly one if statement each. Well, the code was relatively high quality for other reasons, but I can't say this policy contributed much.
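Roughly what such a policy produces (a made-up example): logic that reads fine inline gets split so each helper holds one branch, which keeps every function under the cap without making anything clearer.

```python
# Under no cap, straight-line logic like this is one readable function:
def classify(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    if n % 2 == 0:
        return "even"
    return "odd"

# Under a hard per-function complexity cap, it gets shredded into
# helpers that each hold exactly one if statement:
def _check_negative(n):
    if n < 0:
        return "negative"
    return None

def _check_zero(n):
    if n == 0:
        return "zero"
    return None

def _check_even(n):
    if n % 2 == 0:
        return "even"
    return None

def classify_capped(n):
    return _check_negative(n) or _check_zero(n) or _check_even(n) or "odd"

print(classify_capped(-3), classify_capped(4))   # prints: negative even
```

(Depending on how the tool counts boolean operators, even the `or` chain in the dispatcher can trip the cap, pushing the shredding one level further.)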

