The only lesson I'm taking away is that we are still very early in the AI era. AI workflows look entirely different today than they did 18 months ago, and I wouldn't bet on them looking the same 18 months from now.
I had Claude walk me through getting local LLMs running on my Mac a month or two ago, and so far as I can tell it was intentionally helpful. I even stated that the reason was to have an uncensored model for myself, and it had no objection. Long story short, LM Studio running a Heretic Gemma 4 is doing just fine on my system now.
I've had the same bad luck with tool calling on Gemma 4. Looking around the web, we are not alone. For other tasks, it seems quite quick and decent.
But it seems to get stuck in tool-call loops.
Oh, to be clear, I don't think Gemma 4 is suitable for real work. It runs at 10 tps and is somewhere between 4o and o1 in quality, in my subjective judgement. But Claude was happy to correctly tell me how to get it running and how to solve the pitfalls I encountered in that process.
The idea is that people who have politics like yours can be “visited” by the police and asked to “voluntarily” come down to the station for an interview about “hateful rhetoric” on social media. Doesn’t matter how you vote if actual political opposition is outlawed, which is where the UK is heading rapidly aided by digital surveillance.
It wouldn't surprise me if reverse engineering is put on the "highly unsafe" list in the near future in the same category as bio because of these interests. Can't have the cattle classes be able to control their own property now can we?
This is pretty much a given anyway. Making reverse engineering tools is already likely to get you sued by someone so model makers are apt to slow down the ability of their tools to reverse engineer to avoid the lawsuits themselves.
How is this any different than Microsoft? I suspect all of the big four use AD and Windows in their enterprise yet that isn’t a dealbreaker for auditing MS’ financials.
Neither Active Directory nor the Windows desktop operating system is a primary factor in accounting with respect to a bigcorp. They can have some secondary compliance-type effects on e.g. network backups and policy enforcement, but are not a primary threat to GAAP eligibility for the S&P 500 like generative AI is.
I was trying to use Opus 4.6 in Claude Code to add some functionality to Python code intended to run on a cluster, and it didn't have any Python environment in its remote environment. It needed to look at the schema of a Parquet file to make sure it did things right, and it couldn't figure out how to do so with code because, for god knows what reason, there is no Python environment in the dev environment for code intended to be run on a compute cluster in Python. Eventually it decided to just examine the raw binary bytes of the header, and then wrote perfectly functional code based on that.
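For anyone curious what poking at the raw bytes involves, here is a minimal sketch in Python, assuming a plain (unencrypted) Parquet file. The format starts and ends with the 4-byte magic `PAR1`, and the 4 bytes before the trailing magic hold the little-endian length of the Thrift-encoded file metadata, which is where the schema is stored. The function names are my own for illustration, and this only locates the metadata bytes; it does not decode the Thrift structure or claim to be what the model actually did.

```python
import struct

PARQUET_MAGIC = b"PAR1"


def parquet_footer_length(data: bytes) -> int:
    """Return the byte length of the Thrift-encoded file metadata."""
    # Smallest possible layout: leading magic + footer length + trailing magic.
    if len(data) < 12:
        raise ValueError("too short to be a Parquet file")
    if data[:4] != PARQUET_MAGIC or data[-4:] != PARQUET_MAGIC:
        raise ValueError("missing PAR1 magic bytes")
    # The 4 bytes before the trailing magic are a little-endian uint32.
    (footer_len,) = struct.unpack("<I", data[-8:-4])
    if footer_len > len(data) - 12:
        raise ValueError("footer length exceeds file size")
    return footer_len


def parquet_footer_bytes(data: bytes) -> bytes:
    """Return the raw (still Thrift-encoded) metadata bytes holding the schema."""
    n = parquet_footer_length(data)
    end = len(data) - 8
    return data[end - n:end]
```

Column names are stored as plain strings inside those metadata bytes, so even without a Thrift decoder they tend to be readable in a hex dump.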
On a different note I recently uploaded several thousand scraped IPO prospectuses to the gpt 5.4 mini API to parse and extract certain data. I ordered it in the system prompt to respond exactly with a specified JSON schema. When I got the results back and processed them there was not a single JSON parse error whatsoever. The model didn't have a single hallucination that created malformed JSON or JSON not matching the given schema across several hundred million input tokens and several million output tokens. And this was 5.4 Mini!
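A rough sketch of the kind of check that would surface those errors if they existed, using only the Python standard library. The field names and types below are hypothetical placeholders, not the schema actually used, and a real pipeline might use a full JSON Schema validator instead.

```python
import json

# Hypothetical schema: field name -> accepted Python type(s).
EXPECTED_FIELDS = {
    "company": str,
    "offer_price_usd": (int, float),
    "shares_offered": int,
}


def check_response(raw: str, expected=EXPECTED_FIELDS) -> list[str]:
    """Return a list of problems; an empty list means the response passed."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"parse error: {exc.msg}"]
    if not isinstance(obj, dict):
        return ["top-level value is not an object"]
    problems = []
    for name, typ in expected.items():
        if name not in obj:
            problems.append(f"missing field: {name}")
        # bool is a subclass of int in Python, so reject it explicitly;
        # this sketch has no boolean fields.
        elif isinstance(obj[name], bool) or not isinstance(obj[name], typ):
            problems.append(f"wrong type for {name}")
    for name in sorted(obj.keys() - expected.keys()):
        problems.append(f"unexpected field: {name}")
    return problems
```

Running every response through something like `check_response` and counting the non-empty results gives the error rate directly.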