wow - this is really well made! i've been doing research w/ Transformer-based au...

roadside_picnic · 2025-08-26T01:32:46 1756171966

> Attention as a concept itself is already quite unintuitive

Once you realize that Attention is really just a re-framing of Kernel Smoothing it becomes wildly more intuitive [0]. It also allows you to view Transformers as basically learning a bunch of stacked Kernels which leaves them in a surprisingly close neighborhood to Gaussian Processes.

0. http://bactra.org/notebooks/nn-attention-and-transformers.ht...

tough · 2025-08-26T03:54:27 1756180467

Nice read

> I'd be grateful for any pointers to an example where system developers (or someone else in a position to know) have verified the success of a prompt extraction.

You can try this yourself with any open source llm setup that lets you provide a system prompt no? Just give it a prompt, ask the model the prompt ,and see if it matches.

gpt-oss is trained to refuse so it wont share (you can provide system prompt on lmstudio)

adityamwagh · 2025-08-26T00:55:04 1756169704

It’s a very popular article that has been around for a long time!

gdiamos · 2025-08-26T01:23:23 1756171403

It's so good it is worth revisiting often