We do need "hard effortful careful work" to keep planes flying, electrical grids running and medical devices safe. It's very relevant but very undervalued by our current economy.
I'm just suggesting we eliminate (or weaken) the distinction between layers and experts and have just the one, then iterate that one until its 'good enough' score plus (iteration count * spontaneity) is greater than some threshold.
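A minimal sketch of that stopping rule, in Python. Everything named here is hypothetical and only for illustration: `step` stands in for one pass through the single shared block, `score` for the 'good enough' estimate, and `max_iters` is a safety cap that the comment above does not mention.

```python
from typing import Callable


def iterate_until_good(
    step: Callable[[float], float],
    score: Callable[[float], float],
    state: float,
    spontaneity: float = 0.05,
    threshold: float = 1.0,
    max_iters: int = 100,  # assumption: a cap so the loop always terminates
) -> tuple[float, int]:
    """Apply `step` until score(state) + iterations * spontaneity > threshold."""
    for iteration in range(1, max_iters + 1):
        state = step(state)
        # The spontaneity term grows with every pass, so the bar for
        # "good enough" effectively drops the longer we iterate.
        if score(state) + iteration * spontaneity > threshold:
            return state, iteration
    return state, max_iters


# Toy stand-ins: each pass halves the distance to 1.0, and the score is the state itself.
final_state, passes = iterate_until_good(
    step=lambda x: x + 0.5 * (1.0 - x),
    score=lambda x: x,
    state=0.0,
)
print(final_state, passes)
```

With `spontaneity` set to zero the toy score never exceeds the threshold and the loop only stops at the cap, which is the point of the extra term: it trades answer quality for a bounded amount of "thinking".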
> I replicated David Ng's RYS method [...] found something I didn't expect.
> Transformers appear to have discrete "reasoning circuits" — contiguous blocks of 3-4 layers that act as indivisible cognitive units. Duplicate the right block and the model runs its reasoning pipeline twice. No weights change. No training. The model just thinks longer.
How did you not expect that if you read his post? That's literally what he discovered, two years ago.
> The weird part: different duplication patterns create different cognitive "modes" from the same weights. Double-pass boosts math. Triple-pass boosts emotional reasoning. Interleaved doubling (13,13,14,14,15,15,16) creates a pure math specialist. Same model, same VRAM, different routing.
As far as I can see that's not implied by the original post.
But that's beside the point: quoting the bit where the poster says "here's what I'm building on top of" and using that to imply they haven't done anything new is a bit pointless, no?
You're right that my quote was misleading; I overlooked "the weird part" in the post because it didn't seem new to me either.
Here's the section in the original post that covers it: https://dnhkng.github.io/posts/rys/#the-brain-scanner All heatmaps are split by task and show an optimal point for each. The routing he ended up choosing is a trade-off between both tasks; there isn't much else to do unless you intend to train a router anyway.
> So the ‘math organ’ has boundaries on both sides. Too few layers and you get nothing — you’ve cut into the circuit and it can’t complete its operation. Too many layers and you also get nothing — you’ve included tissue from a neighbouring circuit that doesn’t belong. Pre-training carved these structures out of the layer stack, and they only work whole. It also doesn’t translate to other tasks, as the heatmap for EQ scores doesn’t have this patch.
This is stated in the original post as well, under the "The Beginning of LLM Neuroanatomy?" section:
> From end-position 43 to 46, we then see solid boosts in math scores (red = good, yay). But include layer 46 or beyond, and the benefits collapse again. The hypothesis: position 47 is where a different circuit begins. Including even one step of the next recipe messes up the current recipe.
> So the ‘math organ’ has boundaries on both sides. Too few layers and you get nothing — you’ve cut into the circuit and it can’t complete its operation. Too many layers and you also get nothing — you’ve included tissue from a neighbouring circuit that doesn’t belong. Pre-training carved these structures out of the layer stack, and they only work whole. It also doesn’t translate to other tasks, as the heatmap for EQ scores doesn’t have this patch.
> This is a much more specific claim than “middle layers do reasoning.” It’s saying the reasoning cortex is organised into functional circuits: coherent multi-layer units that perform complete cognitive operations. Each circuit is an indivisible processing unit, and the sweeps seen in the heatmap is essentially discovering the boundaries of these circuits.
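To make the routing patterns discussed in this thread concrete, here is a toy Python sketch of building a layer execution order in which a contiguous block is repeated without copying any weights. It is not the code from the RYS post: the function names are made up for illustration, the "layers" are plain callables rather than transformer blocks, and a real model would also need its attention cache and layer indexing handled.

```python
from typing import Callable, Sequence

Layer = Callable[[float], float]


def duplicate_block(num_layers: int, start: int, end: int, passes: int = 2) -> list[int]:
    """Layer order in which the block [start, end) runs `passes` times in a row."""
    block = list(range(start, end))
    return list(range(start)) + block * passes + list(range(end, num_layers))


def interleave_double(num_layers: int, start: int, end: int) -> list[int]:
    """Layer order in which each layer in [start, end) runs twice back to back."""
    order: list[int] = []
    for i in range(num_layers):
        order.extend([i, i] if start <= i < end else [i])
    return order


def run(layers: Sequence[Layer], order: Sequence[int], x: float) -> float:
    """Apply layers in the given order; a repeated index reuses the same layer object."""
    for i in order:
        x = layers[i](x)
    return x


# Whole-block repeat: layers 2 and 3 run as a unit, twice.
print(duplicate_block(6, 2, 4))
# Interleaved doubling of layers 13-15, as in the quoted (13,13,14,14,15,15,16) pattern.
print(interleave_double(20, 13, 16)[13:20])
```

The two helpers differ only in ordering: the first repeats the block as a whole, which matches the "indivisible circuit" reading in the quotes, while the second repeats each layer individually.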
You may be implying it, but you also need to make sure this new housing goes into the long-term rental market instead of becoming secondary residences or Airbnbs. I've seen it happen firsthand in my home town. That may not be a problem for Austin though.
I can't find the French competition, ironic given that the front page of ukroc is full of French gov officials... Could you share the name if you have it?
The French and Japanese competitions seem to be rather low-key compared to the UK and US competitions. The winner of the French schools is mentioned here:
People are emigrating to the U.S. because of decades of soft power and propaganda, and mostly to make money to send to their families or head back after a couple years.
On all the metrics that actually matter to quality of life (i.e. not sqm of mowed grass per person or avg height of SUV bonnet), the EU rates higher than the US.
If people cared about mere subsistence, they wouldn't move to the U.S. They like everything that comes with greater income. You can't divorce that from metrics tracking quality of life.
Europe has a better safety net, but basically anywhere in the West is an improvement over their origin countries for the most part. And consider: the first choice for those interested in North America is not Canada, it's the U.S. The earning potential is higher, and immigrants work hard. They mostly don't care that there's a lesser social safety net.
It says "workforce": workforce (noun): the _people_ engaged in or available for work, either in a country or area or in a particular company or industry.