> I honestly still don't see the point of compaction.
Currently my mental model is that every token Claude generates gets added to the context window. When it fills up, there is no way forward. If you're going to get a meaningful amount of work done before the next compaction, most of the tokens in the context window have to be deleted. I agree that after compaction it's like dealing with something that's developed a bad case of dementia, but once you've run out of context, what is the alternative?
> why would you even want compaction and not just start a blank sessions by reading that md?
If you look at "how to use Claude" instructions (even those from Anthropic), that's pretty much what they do. Subagents, for example, are Claude instances that start with a set of instructions and a clean context window to play with. The "art of using Claude" seems to be the "art of dividing a project into tasks, so every task gets done without overflowing the context window".
This gives me an almost overwhelming sense of déjà vu. I've spent my entire life writing code with some restriction in mind - registers, RAM, lines of code in a function, size of PRs, functions in an API. Now the restriction is the size of the bloody context window.
> I'm working on something which tries to achieve lossless compaction but that is incredibly expensive and the process needs around 5 to 10 times as many tokens to compact as the conversation it is compacting.
I took a slightly different approach. I wanted a feel for what the limit was.
I was using Claude to do a clean-room implementation of existing code. This entails asking Claude to read an existing code base and produce a detailed specification of all of its externally observable behaviours. Then, using that specification alone (i.e., without reference to the existing program, a global CLAUDE.md, or any other prompts), it had to reliably produce a working version of the original in another language. Thus the specification had to include all the steps needed to do that - unit tests, integration tests, coding standards, instructions on running the compiler, and so on - that might normally come from elsewhere.
Before proceeding, I wanted to ensure Claude could actually do the task without overflowing its context window - so I asked Claude for some conservative limits. The answer was: a 10,000 word specification that generated 10,000 lines of code would be a comfortable fit. My task happened to fit, but it's tiny really.
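That "comfortable fit" claim can be sanity-checked with some rough arithmetic. A minimal sketch, where the tokens-per-word and tokens-per-line ratios and the 200k window size are rule-of-thumb assumptions on my part, not measured values:

```python
# Back-of-envelope check of the "10,000 words + 10,000 lines" budget.
# All three constants are rough assumptions, not measurements.
TOKENS_PER_WORD = 1.33    # common rule of thumb for English prose
TOKENS_PER_LOC = 10       # common rule of thumb for a line of source code
CONTEXT_WINDOW = 200_000  # Claude's advertised window, for illustration

def context_usage(spec_words: int, code_lines: int) -> float:
    """Fraction of the context window consumed by a spec plus the code it generates."""
    tokens = spec_words * TOKENS_PER_WORD + code_lines * TOKENS_PER_LOC
    return tokens / CONTEXT_WINDOW

print(f"{context_usage(10_000, 10_000):.0%}")  # roughly 57% of the window
```

Under those assumptions the task lands well inside the window, but with little headroom left for conversation history, tool output, and re-reading files - which is consistent with "it fits, but it's tiny really".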
When working with even a moderate code base - where you have a CLAUDE.md, a global CLAUDE.md for coding standards and whatnot, and you're using multiple modules so it has to read many lines of code - you hit that 10,000 words of prompt and 10,000 lines of code it has to read or write very quickly - within a couple of hours for me. And then the battle starts: split up the tasks, create sub-agents, yada yada. In the end, they are all hacks for working around the limited size of the context window - because, as you say, compaction is about as successful at managing the context window as the OOM killer is at managing RAM.
There are Non-Resident Importers, which are foreign companies that import goods into the USA, but do not have a presence in the United States. About 15% of USA imports come through NRIs.
For them this reversal sets up a true irony. Trump effectively forced US citizens to pay more for imported goods. He thought that money would go to the US treasury. Now the US treasury has to pay it back, so it is a free gift to the exporting countries. Like China.
There are a lot of embedded SQL libraries out there. I'm not particularly enamoured with some of the design choices SQLite made - for example, the "flexible" approach it takes to naming column types - so that isn't why I use it.
I use it for one reason: it is the most reliable SQL implementation I know of. I can safely assume that if there is file corruption, or an invariant I tried to keep doesn't hold, SQLite isn't the cause. By completely eliminating one branch of the failure tree, it saves me time.
That one reason is the one thing this implementation lacks - while keeping what I consider SQLite's warts.
> Lets all arbitrarily agree AGI is here. I can't even be bothered discussing what the definition of AGI is.
There is a definition of AGI the AI companies are using to justify their valuation. It's not what most people would call AGI but it does that job well enough, and you will care when it arrives.
They define it as an AI that can develop other AIs faster than the best team of human engineers. Once they build one of those in house, they outpace the competition and become the winner that takes all. Personally, I think it's more likely they will all achieve it at around the same time. That would mean the race continues, accelerating as fast as they can build data centres and power plants to feed them.
It will impact everyone, because the already dizzying pace of the current advances will accelerate. I don't know about you, but I'm having trouble figuring out what my job will be next year as it is.
An AI that just develops other AIs could hardly be called "general" in my book, but my opinion doesn't count for much.
May I ask, what experiences are you personally having with LLMs right now that are leading you to the conclusion that they will become "intelligent" enough to identify, organise, and build advancing improvements to themselves, without any human interaction, in the near future (1-2 years, let's say)?
> May I ask, what experiences are you personally having with LLMs right now that are leading you to the conclusion that they will become "intelligent" enough to identify, organise, and build advancing improvements to themselves, without any human interaction, in the near future (1-2 years, let's say)?
None, as I don't develop LLM's.
I wasn't saying I think they will succeed, but I think it is worth noting their AGI ambitions are not as grand as the term implies. Nonetheless, if they achieve them, the world will change.
> Just because you were late to the party doesn't mean all of us were.
It wasn't a party I liked back in 2023. I'm just repeating the same stuff I see said over and over again here, but there has been a step change with Opus 4.5.
You can still see it in action now, because the other models are still where Opus was a while ago. I recently needed to make a small change to a script I was using. It is a tiny (50-line) script written with the help of AIs ages ago, but it was subtly wrong in so many ways. It's now become clear that neither the AIs (I used several and cross-checked) nor I had a clue about what we were dealing with. The current "seems to work" version was created only after much blood was spilt over misunderstandings, exposing bugs that had to be fixed.
I asked Claude 4.6 to fix yet another misunderstanding, and the result was a patch that changed the minimum number of lines needed to get the job done. Just reviewing such a surgical modification was far easier than doing the work myself.
I gave exactly the same prompt to Gemini. The result was a wholesale rearrangement of the code. Maybe it was good, but the effort to verify that was far larger than just doing it myself. It was a very 2023 experience.
The usual 2023 experience for me was to ask an AI to write some greenfield code and get a result that looked like someone had changed the variable names in something they'd found on the web after a brief search for code that might do a similar job. If you got lucky, it might have found something that was indeed very similar, but in my case that was rare. Asking it to modify code unlike anything it had seen before was like asking someone to poke your eyes with a stick.
As I said, some of the organisers of this style of party seem to have gotten their act together, so now it is well worth joining their parties. But this is a newish development.