
stult · 2026-06-12T22:10:21 1781302221

I think warfighter crept into the lexicon for somewhat understandable reasons, likely because of the increasing frequency of joint operations (i.e., operations involving more than one branch of the military working together) after Vietnam, combined with the long-standing military tradition whereby members of any given branch take great offense if you refer to them using the wrong professional label (i.e., soldier, sailor, ~crayon-eater~ marine, airman, space cadet). That is, we can't just call all of them soldiers because only members of the Army are soldiers, so if for example you call a mixed group of marines and soldiers "soldiers", the marines will make their displeasure known to you, aggressively and in no uncertain terms.

When you're talking about DoD stuff all day long and frequently need to refer generically to the mixed personnel involved in a joint operation, warfighters beats saying Soldier-Sailor-Marine-Airman-Spacecase. All the other alternative phrases for the concept of "person employed by the military in one of the five combat arms branches" are variations on "member" and tend to sound clunky or be overly verbose, like "service member" or "member of the military." Try saying "service members" 50 times per day. Trust me, it gets old fast.

And frankly I don't see the problem with warfighter. Fighting wars is quite literally what they do, and pretending otherwise does a disservice to the truth and risks papering over the deadly seriousness of their work. Warfighter is also quite distinct from "warrior," which carries connotations of a specifically aggressive and barbaric flavor of professional violence purveyor. Like you say, it sounds like some atavistic hereditary soldier caste for whom violence is a sacred vocation joyfully undertaken rather than a solemn duty carried out only with great reluctance and forbearance.

FireBeyond · 2026-06-13T00:21:03 1781310063

> When you're talking about DoD stuff all day long and frequently need to refer generically to the mixed personnel involved in a joint operation, warfighters beats saying Soldier-Sailor-Marine-Airman-Spacecase. All the other alternative phrases for the concept of "person employed by the military in one of the five combat arms branches" are variations on "member" and tend to sound clunky or be overly verbose, like "service member" or "member of the military." Try saying "service members" 50 times per day. Trust me, it gets old fast.

If only you hadn't found the perfect word in your description of the "problem": they are "personnel".

Eddy_Viscosity2 · 2026-06-13T13:14:59 1781356499

"personnel" is too broad though as it could include all of the civilian support staff, admin staff, contractors, etc. Sometime you do want a term to collectively include all of these people, but sometimes you want to just refer to the ones actively doing the military bits and not the support bits.

shawn_w · 2026-06-13T23:01:58 1781391718

Troops?

stult · 2026-06-14T23:36:55 1781480215

Troops is almost exclusively used to refer to Army personnel in the US.

Eddy_Viscosity2 · 2026-06-14T13:59:42 1781445582

Actually, yeah that works

xethos · 2026-06-13T13:32:26 1781357546

> And frankly I don't see the problem with warfighter

Because it implies, quite pointedly, that the United States will never send a peacekeeping force.

warumdarum · 2026-06-13T18:10:55 1781374255

But even that peacekeeping will involve active combat, unless the mission fails or the force involved is so capable that it deters opponents. You don't want to end up like the blue helmets in Lebanon; you want to be more like Nordbat in Kosovo.

xethos · 2026-06-15T01:59:33 1781488773

That still means the main point, the reason they are there, will never be peace. They are there to fight, and to fuck your shit up - which is both an image the current administration embraces, and very much not a good look outside the country.

stult · 2026-06-07T04:57:23 1780808243

> - Do we have reasons to care about LOC in a world where we don't write code manually? What happens to token usage numbers when the codebase is significantly larger?

Yes, at least to the extent that we care about context windows and tokens consumed by coding agents processing code that is ultimately irrelevant to their assigned task.

Anecdotally, I've found that keeping file sizes small matters for agentic coding not just to maintain human readability but also to optimize agent performance. Agents generally load entire files rather than reading only the part relevant to their current assignment, as a human might, so smaller files limit the amount of incidental context they load while working a problem. That reduces input noise, so the LLM generates a tighter solution, which in turn reduces input noise for future solutions. Or at least this strategy avoids a death spiral into exploding context length.

I expect (but cannot currently prove) that keeping overall LOC down yields similar benefits even when file sizes are kept small because it spares the LLM from parsing potentially relevant files that prove irrelevant to its current task.
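For anyone who wants to check their own repo, here is a minimal sketch of how you might flag files that are likely to bloat an agent's context. The 300-line threshold, the `.py` filter, and the 4-characters-per-token ratio are all assumptions for illustration; real tokenizers differ by model and by language.

```python
from pathlib import Path

# Assumption: ~4 characters per token is only a rough rule of thumb;
# real tokenizers vary by model and by language.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate derived from character count."""
    return len(text) // CHARS_PER_TOKEN


def oversized_files(root: str, max_lines: int = 300, suffixes=(".py",)):
    """Return (path, line_count, est_tokens) for files longer than max_lines."""
    results = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        line_count = len(text.splitlines())
        if line_count > max_lines:
            results.append((str(path), line_count, estimate_tokens(text)))
    # Largest files first, since those cost the most context per load.
    return sorted(results, key=lambda row: row[1], reverse=True)
```

Calling `oversized_files(".")` from the repo root gives a list of candidates for splitting. It says nothing about whether a given file is actually relevant to a task, only how expensive it is to load whole.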

everforward · 2026-06-07T14:01:17 1780840877

Seconded on smaller files. I feel like I tend to get better responses faster.

A notable flaw here is that I’ve not tried large vs small files in a large codebase. Most of my experimentation there has been on personal projects where even a small file contains a significant part of the project. I could see degradation when it has to load 5 files to figure out how something works.

Total LOC (tokens, really, literal lines probably don’t matter) is interesting as a factor. That might go some way towards explaining why LLMs are weirdly good at Clojure.

E.g., last I checked, Anthropic's one-shot performance on Clojure was about the same as on Python or Go, despite Clojure almost certainly being less represented in training data. The combination of density and simple primitives might be easier for an LLM to wrangle, ameliorating the impact of a less popular language.

torben-friis · 2026-06-07T14:56:51 1780844211

> E.g., last I checked, Anthropic's one-shot performance on Clojure was about the same as on Python or Go, despite Clojure almost certainly being less represented in training data. The combination of density and simple primitives might be easier for an LLM to wrangle, ameliorating the impact of a less popular language.

There might be tons of confounding factors there. One that comes to mind is the quality of data: it might well be that the average Clojure snippet is higher quality, due to user demographics. Very few people start writing code with Clojure, whether in college or during bootcamps.

everforward · 2026-06-08T14:56:43 1780930603

Oh there absolutely are, I don’t mean to imply any certainty in that attribution.

Quality of data is totally one. Immutability may be another (it’s easier to reason about code if you don’t have to track mutations to a variable). Another interesting one is Clojure’s emphasis on composability using basic primitives that are sort of hard to grok initially but unlock really cool stuff.

You can do some incredible stuff with recursive map and arrow functions in a few dozen characters.

stult · 2026-06-03T02:26:56 1780453616

I would agree with this point and as I explained in a comment replying to the GP comment above, that atrophy is far more dangerous in the legal field than it is with code because legal documents do not benefit from the structural safeguards available for code, like automated testing, static typing, static analysis tools, etc. IME with legal LLMs so far, they are easily in that most dangerous valley where they can lull you into a false sense of security while still introducing extremely dangerous mistakes that are frequently difficult to detect without very careful reading.

The danger of those mistakes creeping in also grows exponentially the farther a lawyer strays from their core legal expertise. There are a few statutes I know inside and out, and I can spot LLM analytical errors related to them in a split second, but once I venture out into domains where I am not an expert (but where I am nevertheless reasonably qualified to practice), it becomes much harder to spot drafting mistakes because I have not refreshed my own understanding of the law by reviewing the relevant cases or statutes as I would when drafting the analysis myself from scratch.

stult · 2026-06-03T02:18:11 1780453091

IME so far (as both a lawyer and a software engineer), LLM error rates when drafting code and legal documents are reasonably comparable, but it's more problematic in the legal context because legal documents do not benefit from many of the structural safeguards available for code. For legal documents, there are no automated tests, no static typing, no test environments, no logging/observability instrumentation, no sandboxing.

The time lag between drafting and "deployment" also makes for much less effective, much more expensive debugging loops. You can deploy your code to prod in seconds, see an error pop up in the logs, and immediately start debugging. But it will take at a minimum days and frequently as long as several years before an error in a contract or a court filing will be detected, and often the error is beyond correction at that point. Thus, the errors are both more difficult to detect and to resolve.

And the consequences of error are often much greater, both because they are not correctable and because a legal error may risk someone's life, liberty, or substantial property. That's not categorically the case, though: obviously, bugs in certain safety-critical systems can be as bad as or even worse than legal mistakes. But in general, most software is lower stakes than most legal writing.

On the flip side, LLMs do seem to do a better job with basic style and structure for legal documents compared to code. Things like following IRAC format, citing assertions of law (although hallucination remains an issue), and writing comprehensible sentences. These would be the equivalents in code to best practices like good comments, cohesion, consistent use of design patterns, test coverage, clear variable names, DRY, etc. Although the better performance on those more qualitative metrics may just be because even the longest legal documents are typically simpler in structure and have fewer lines of text than a large, complex codebase. Or maybe it's because LLMs are trained on natural language text more than on code. Or because natural language is more forgiving than code, in that minor variation in diction or grammar is unlikely to have any significant effect on how the document is interpreted, whereas even single character errors in code can have enormous effects.

Otterly99 · 2026-06-03T12:50:15 1780491015

There is also one thing I would like to add, and you can correct me if you disagree: coding benefits much more from thorough planning. Now, I exclusively work by first writing a plan that has well-defined steps and goals, which can of course change over time.

It seems to me like it would be more difficult to achieve with legal documents and, in my experience at least, writing a concrete plan has been the decisive factor that makes my AI coding robust (plus all that you mentioned).

stult · 2026-06-03T20:56:53 1780520213

I'm not sure about that, I actually think planning may be just as important in both domains. Outlining before drafting is an almost universal best practice in legal writing that is drilled into law students to the point that outlining as exam prep is something students spend several weeks on each semester. So personally I always have a fairly detailed implementation plan in the form of an outline before I ask an LLM to draft a more detailed legal document.

I've also adopted an AI coding workflow that involves a lot of planning, although I actually write very little of the plan myself anymore. I have a chain of slash commands like this: create-issue -> plan-issue -> build-plan -> pr-into-dev. I write a relatively brief description of what I want accomplished to create the issue, and then the agent fleshes out my description with more detailed requirements and acceptance criteria. I review the issue description, and the LLM often identifies open questions I failed to consider, so I revise as necessary and then the agent posts the description to the GH issue. I have planning separated because I often create issues quickly when something occurs to me and then circle back at a later date to implement, and want the agent to create the concrete implementation plan with an up-to-date snapshot of the code in context. Then I review that again, adjusting as necessary, and then the agent posts the result as a comment on the original issue.

Like you, I've found this detailed planning makes for a very robust coding agent (again, also in combination with the aforementioned best practices, especially requiring 100% test coverage because forcing it to exercise every line of code avoids hallucinated dummy tests that assert on nothing). Interestingly in comparison to legal writing, I also rely on the agent to decompose complex tasks into separate issues or subissues as appropriate, which is something that is never necessary for legal analysis because pretty much every legal analysis can be one-shotted.

For legal writing, my workflow is nowhere near as structured as that. For context, I have only ever used LLMs for drafting what are effectively emails to clients or memoranda of law for clients that are a step up in complexity and formality from an email. So not something that will be filed with a court necessarily but very much in the same format and style as a formal motion that would be submitted to a court on behalf of a client. And never a contract, will, or judicial opinion, nor a communication with a counterparty like a demand letter or C&D. So YMMV for other types of legal writing.

That said, I typically start drafting a memo by conversing casually with an agent to explore the general boundaries of an issue I am evaluating, by identifying relevant sources of law, potentially related issues, and the analytical process I need to follow (i.e., what issues to evaluate and what order to evaluate them in, more or less the analytical "algorithm"). Once I have a good sense of that algorithm, I put together a high level outline and then ask the agent to draft a detailed memo around that outline. Or at least that's what I used to do until the last few months, during which the models have matured to the point where I increasingly just ask the agent to write the outline based on the conversation we had, then review that, then ask it to write the memo based on the outline.

As I have been writing this, it occurs to me that actually I am following almost the exact same process for writing code and for writing legal memos, and should probably distill the legal writing process into a similarly well-structured set of chained skills/slash commands. In both domains, I describe an issue at a high level, get the LLM to fill in some of the broad outline level details, review that, then get the LLM to implement the complete final product. (Also perhaps worth noting while I do occasionally conduct general high level research by talking to a frontier lab LLM, I have always used locally hosted OS/OW models for drafting memos where I need to provide concrete, specific factual information about clients to the LLM, to avoid attorney-client privilege issues, so the quality has lagged behind the frontier models, which is part of why I haven't developed this workflow into as structured of an approach as I have for coding).

In both coding and legal contexts, I think that this planning or outlining step is critical not (or not just) because it forces the agent to create a higher quality product, but because it forces me to review what I am asking the agent to do at a sufficiently detailed level that I can catch errors before they crop up in the implementation. A lot of the time, the errors that occur if I skip this step aren't because the LLM has made any clear mistake, but because I failed to specify some aspect of the task and the LLM is forced to guess at what I really intended, which is where agents often struggle.

So I guess I would tentatively suggest that legal writing does in fact benefit from thorough planning, though it is hard for me to quantify whether those benefits are greater or less than the comparable benefits for code.

Hfuffzehn · 2026-06-03T12:28:58 1780489738

This is a very good comment. But notice how even in software engineering there is still disagreement about these structural safeguards.

So yes, we can say the LLM created bad code when it does not compile or fails prewritten tests.

But experts might disagree what good comments, good cohesion, appropriate use of design patterns, appropriate test coverage or clear variable names are.

So what are we supposed to train the LLMs towards? Somebody still has to decide what "good" is.

causal · 2026-06-03T11:41:10 1780486870

Hidden gem of a comment, thanks for writing

calvinmorrison · 2026-06-03T02:20:49 1780453249

Well, this is largely the fault of law itself, especially English-style law. Without a legal, parseable code, and with every single tiny municipality (some less than 1 square mile) having its own set of rules and laws, not all published or available, but which citizens are of course expected to abide by, how could we expect AI to do well, rather than some typical TV Southern lawyer who knows the judge?

stult · 2026-06-03T02:43:53 1780454633

I could not agree more. A simple example: it boggles my mind how every state organizes its statutes in entirely dissimilar ways. I'm not sure there's a need for every state to have slightly different wording for a murder statute in the first place, but even assuming there is, why do they all have to be scattered around in different code sections instead of every state just following some consistent convention like always putting the murder statute at Title V, Section 1.4 (or whatever the case may be, that's just a random invented example)?

For murder that's not such a huge deal because the statutes are typically easy to track down and don't really differ all that much substantively, but once you get really into the weeds on something like commercial contracts it can be a huge pain to do cross-jurisdictional research.

And that's just a tiny, super obvious example of how impenetrable statutory law is, which isn't even the really pernicious problem. Case law is infinitely worse. It makes me absolutely furious how difficult legal research still is. The Westlaw/LexisNexis duopoly is a moral crime and wildly destructive to the quality of government in this country. Every single written court opinion should be publicly available for free on the internet in an easily searched format. It would cost practically nothing to achieve. We're talking about less text than Wikipedia hosts. Yet still many states make it almost impossible to access case law. Even though these cases are law. Binding law that we are supposed to follow, yet we cannot even easily access. It's insane, and largely perpetuated by the complacency of lawyers who can charge others for what should be free, the lobbying of the duopoly, and the incompetence of politicians.

If all of the laws were consistently available and stored in reasonable, consistent citation formats (I would settle for hyperlinking as a replacement for the rat's nest of wildly varying jurisdiction-specific citation systems), it would even be possible to introduce a form of unit testing for legal drafting that would allow us to automatically verify if the LLM hallucinated a citation.
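To make the "unit testing for legal drafting" idea concrete, here is a minimal sketch under heavy assumptions. It assumes a trusted set of known citations already exists (which is exactly the thing that is missing today), and it uses a deliberately simplified regex for "volume reporter page" citations. The names `CITATION_RE`, `extract_citations`, and `unverified_citations` are hypothetical, and real citation formats vary far more than this pattern covers.

```python
import re

# Assumption: a simplified pattern for "volume reporter page" citations such as
# "410 U.S. 113". Only a handful of reporters are listed, for illustration.
CITATION_RE = re.compile(
    r"\b(\d{1,4})\s+(U\.S\.|F\.[23]d|S\. Ct\.)\s+(\d{1,5})\b"
)


def extract_citations(text: str) -> list[str]:
    """Pull every citation matching the simplified pattern, in document order."""
    return [" ".join(m.groups()) for m in CITATION_RE.finditer(text)]


def unverified_citations(text: str, known: set[str]) -> list[str]:
    """Return citations in the draft that are absent from the trusted set."""
    return [c for c in extract_citations(text) if c not in known]


# Usage: the trusted set would come from a public case law database.
# "9999 F.3d 99999" is a made-up placeholder standing in for a hallucination.
known = {"410 U.S. 113", "347 U.S. 483"}
draft = "See 410 U.S. 113; compare 9999 F.3d 99999 with 347 U.S. 483."
suspect = unverified_citations(draft, known)
```

A check like this only confirms that a cited authority exists. Whether the case actually supports the proposition it is cited for still requires a careful reader.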

It also doesn't help that we (for what were at the time very good reasons) moved away from the system of legal writs that used to provide fairly standardized, almost "cut and paste" templates for legal filings. So now every legal document (filings, memos, contracts, court opinions, statutes) is drafted like a bespoke, artisanal creation with few strict structural or stylistic conventions. That makes automated interpretation much harder than it needs to be.

stult · 2026-06-03T01:40:58 1780450858

That's not true at all. Modern legal education has focused on plain English drafting and avoidance of arcane jargon precisely to make legal documents comprehensible to non-specialists. There are almost no situations where legal drafting requires use of jargon. Jargon is pretty much only necessary where the domain requires use of jargon. Contracts are meant to be followed by the parties, and if the parties can't understand the terms of the contract because of obscure drafting, they can't abide by the terms.

Also legal language is in no way a programming language. And I would know, I'm a lawyer and a software engineer. It would actually be a dramatic improvement if lawyers were more consistent in their use of terms of art, but in practice there are very few terms of art that aren't either in general use or easily understood with a brief definition, and none are defined with anything like the precision or consistency of a programming language.

jhbadger · 2026-06-03T11:37:16 1780486636

I think you overestimate how much the average person can understand opaque jargon like "party of the first part". I'm sure good legal writing can avoid these things, but often (such as in the licenses people are theoretically supposed to click on that they have read and agree to for software), the opaqueness is the point -- they don't really want the user to understand what they are agreeing to.

ludicrousdispla · 2026-06-03T17:15:10 1780506910

"comprising" and "consisting of" have very different meanings in patent law, but I expect most people would consider them synonymous.

stult · 2026-06-01T04:08:57 1780286937

The R&D tax credit change actually took effect in 2022, and one of the few good things Trump's BBB did was reverse it

tczMUFlmoNk · 2026-06-01T06:21:43 1780294903

This is true as stated. However, it is important context that the time bomb was originally introduced in Trump's signature Tax Cuts and Jobs Act in his first term. So, yes, Trump's OBBBA fixed it, but Trump's TCJA caused it in the first place, too.

irishcoffee · 2026-06-01T14:31:59 1780324319

This is a fair criticism and I am not defending the practice. My understanding is that time bombs like this are very regularly introduced into all sorts of bills, party-agnostic. It's how they can say things like "We saved $X over Y years!" where a lot of the time bombs go off halfway through the "Y years" bit unless renewed.

Please correct me if I'm wrong about this. I only know what I read, which is hard to trust anymore.

vineyardmike · 2026-06-01T16:27:54 1780331274

They’ve been regularly introduced in a party-agnostic way, exclusively by Republicans. But yeah, “party agnostic”.

irishcoffee · 2026-06-01T20:20:02 1780345202

You should look up the American Rescue Plan and reflect on why the government shut down recently.

green7ea · 2026-06-01T04:30:08 1780288208

It only reversed it for R&D within the US. I learned that when the company I worked for (the owner was a US company) closed.

trollbridge · 2026-06-01T18:23:45 1780338225

Right, you can only deduct R&D expenses that happen inside the U.S.

If you want to do R&D overseas, best to set up an overseas company.

gsky · 2026-06-01T11:09:23 1780312163

Trump takes credit for fixing problems that he created in the first place.

stult · 2026-05-28T10:12:20 1779963140

I think that study is the wrong framing of the problem for identifying economic returns on AI. We don't need AI to complete tasks perfectly, just to be able to generate a good enough approximation that is easy to review and correct, such that an employee has to spend less time correcting AI's errors than they would spend producing the entire output from scratch. So it won't be a drop-in replacement for an employee for another 4-10 years, but in the interim, it will shift an employee's role from generating a complete solution to primarily reviewing and correcting an LLM-generated solution to get it from that 80-95% level (or whatever the starting point might be prior to 2029) to 100%.
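The break-even condition can be put as a back-of-the-envelope calculation. This is a toy model of my own framing, not something from the study: it assumes the remaining work scales linearly with how much of the draft is wrong, and `rework_factor` is a made-up knob for how much costlier fixing someone else's draft is than writing that part fresh.

```python
def assisted_minutes(
    scratch_minutes: float,
    draft_quality: float,
    review_minutes: float,
    rework_factor: float = 1.0,
) -> float:
    """Estimated minutes to finish a task when starting from an AI draft.

    draft_quality: fraction of the work the draft gets right (0.0 to 1.0).
    rework_factor: assumed cost multiplier for fixing versus writing fresh.
    """
    remaining = (1.0 - draft_quality) * scratch_minutes * rework_factor
    return review_minutes + remaining


def is_worth_it(
    scratch_minutes: float,
    draft_quality: float,
    review_minutes: float,
    rework_factor: float = 1.0,
) -> bool:
    """True when review plus correction beats writing everything from scratch."""
    total = assisted_minutes(scratch_minutes, draft_quality, review_minutes, rework_factor)
    return total < scratch_minutes
```

For a two-hour task, a 90% draft, and 20 minutes of review, the model gives roughly 32 minutes, so the assisted path wins comfortably. With a 50% draft and 40 minutes of review on a one-hour task, it comes out to 70 minutes, and the employee is better off starting from scratch.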

At this point, the vast majority of the work required to make GenAI capable of producing that sufficiently reviewable/correctable content isn't improving model quality, but creating the harnesses, infrastructure, and workflows around the models. Companies aren't seeing returns yet because too many early-adopting companies have conceived of AI as a drop-in replacement for employees, or at least as a reason to cut staff immediately, without first building out the supporting systems needed to compensate for the inadequacies of the models.

stult · 2026-05-28T09:07:20 1779959240

That's an asinine comparison that completely ignores the underlying economic substance of the transaction. You can't pay yourself fees from a mortgage.

stult · 2026-05-28T05:32:38 1779946358

Weirdly the CIA actually does require case officers to get signed receipts from their assets for payments. Whether they verify the signatures is another question...

rdtsc · 2026-05-28T18:11:52 1779991912

Ha! That's interesting. I think it would be funny to read some of those. "I, Mr. Ivanov, the Russian spy, got: 2 bars of gold, 5 buckets of caviar, 10 cases of vodka and 3 kilos of cocaine". Then they call the CIA headquarters complaining the cocaine is stale and they would rather have another 5 buckets of caviar instead.

stult · 2026-05-25T17:33:01 1779730381

You're assuming the price won't come down as the tech matures. That seems like a big assumption, considering how quickly open weights models are catching up to frontier models, and how little effort has been invested so far in optimizing inference costs.

It's especially a crazy assumption to make relative to the costs of employing a human. The costs of paying an entry level employee are unlikely to go down at all, and even if those costs do decline, there's a floor they can't drop below (minimum wage at the extreme end), whereas companies are free to optimize agentic costs as close to zero as possible.

So you are assuming that a cost which is extremely susceptible to optimization but which no one has yet seriously attempted to minimize will remain perpetually above a cost which is much less susceptible to optimization, is already subject to enormous efforts to minimize, and has a legally mandated floor. That seems like a bad bet.