Moosdijk · 2025-12-22T22:08:17 1766441297

Interesting. Instead of running the model once (flash) or multiple times (thinking/pro) in its entirety, this approach seems to apply the same principle within one run, looping back internally.

Instead of big models that “brute force” the right answer by knowing a lot of possible outcomes, this model seems to come to results with less knowledge but more wisdom.

Kind of like having a database of most possible frames in a video game and blending between them instead of rendering the scene.

omneity · 2025-12-22T23:50:31 1766447431

Isn’t this, in a sense, an RNN built out of a slice of an LLM? If so, it might share the same drawbacks, namely slowness to train, but also the same benefits, such as an endless context window (in theory).

ctoa · 2025-12-23T01:30:44 1766453444

It's sort of an RNN, but it's also basically a transformer with shared layer weights. Each step is equivalent to one transformer layer, so the computation for n steps is the same as the computation for a transformer with n layers.

The notion of a context window applies to the sequence, and the looping doesn't really affect it: each iteration sees and attends over the whole sequence.
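
A minimal sketch of the weight-sharing point, in plain Python. Everything here is an illustrative assumption rather than the architecture under discussion: the function names are hypothetical, and the "layer" is reduced to single-head self-attention with a residual connection (no feed-forward block, no normalization, no causal mask). It only aims to show that looping one set of weights n times is the same computation as an n-layer stack whose layers happen to share those weights, and that every step attends over the whole sequence.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x d) matrix by a (d x m) matrix, both as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def layer(x, w):
    """One simplified layer: single-head self-attention plus a residual.

    x is a (seq_len x d) list of rows; w holds the q/k/v projection matrices.
    There is no mask, so every position attends over the whole sequence.
    """
    q, k, v = matmul(x, w["q"]), matmul(x, w["k"]), matmul(x, w["v"])
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    attn = [softmax(r) for r in scores]
    mixed = matmul(attn, v)
    return [[xi + mi for xi, mi in zip(xr, mr)] for xr, mr in zip(x, mixed)]


def random_weights(d, rng):
    """Hypothetical helper: small random (d x d) projections for q, k and v."""
    return {
        name: [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(d)]
        for name in ("q", "k", "v")
    }


def looped_forward(x, shared, n_steps):
    """Recurrent view: apply the same weights n_steps times."""
    for _ in range(n_steps):
        x = layer(x, shared)
    return x


def stacked_forward(x, layers):
    """Standard view: apply a list of per-layer weights in order."""
    for w in layers:
        x = layer(x, w)
    return x
```

Calling `looped_forward(x, shared, n)` and `stacked_forward(x, [shared] * n)` runs the same operations in the same order. The only difference between the two views is parameter count: the looped model stores one set of weights regardless of n, while a regular stack stores n independent sets.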

omneity · 2025-12-23T12:51:46 1766494306

Thanks, this was helpful! Reading the seminal paper[0] on Universal Transformers also gave some insights:

> UTs combine the parallelizability and global receptive field of feed-forward sequence models like the Transformer with the recurrent inductive bias of RNNs.

Very interesting, it seems to be an “old” architecture that is only now being leveraged to a promising extent. Curious what made it an active area (with the works of Samsung and Sapient and now this one), perhaps diminishing returns on regular transformers?

0: https://arxiv.org/abs/1807.03819

nl · 2025-12-23T02:03:38 1766455418

> Instead of running the model once (flash) or multiple times (thinking/pro) in its entirety

I'm not sure what you mean here, but there isn't a difference in the number of times a model runs during inference.

Moosdijk · 2025-12-23T20:32:08 1766521928

I meant going to the likeliest output (flash) or (iteratively) generating multiple outputs and (iteratively) choosing the best one (thinking/pro)

nl · 2025-12-23T23:57:14 1766534234

That's not how these models work.

Thinking models produce thinking tokens to reason out the answer.

Moosdijk · 2025-12-20T10:58:37 1766228317

RMS = Richard Stallman, founder of the GNU Project and the Free Software Foundation.

He had a page dedicated to his housing situation:

https://stallman.org/seeking-housing.html

LoganDark · 2025-12-20T13:10:39 1766236239

https://web.archive.org/web/20190928065654/https://stallman....

Moosdijk · 2025-12-14T21:09:31 1765746571

Do you have a log available somewhere?

robviren · 2025-12-18T18:16:42 1766081802

I keep everything in my self-hosted Gitea. I just made it public.

https://gitter.swolereport.com/robviren/cspace

Moosdijk · 2025-12-20T14:42:42 1766241762

Thanks, I’ll check it out

Edit: timed out

iFire · 2025-12-15T05:20:21 1765776021

Reminds me of https://github.com/RobViren/kvoicewalk where people take voice clips and train a text-to-speech system using random walks.

Not related, misguided methods :D

Moosdijk · 2025-12-16T18:55:59 1765911359

Well, it’s the same author so it is kind of related.

Moosdijk · 2025-12-14T21:08:10 1765746490

I’m in this one because it was at the top of the front page.

Moosdijk · 2025-12-01T09:29:12 1764581352

https://youtu.be/IAuapNwJ2vQ?si=E332G7AhFfxDIcSx

Moosdijk · 2025-11-20T18:19:05 1763662745

I never really looked at it that way, but I think you're right. Although, non-European-owned companies aren't necessarily incentivized to look towards European companies. Looking towards your European neighbors mostly comes down to logistical situations. In those sectors, multilingual services are more common.

Moosdijk · 2025-10-23T20:08:08 1761250088

I'm hoping you'll open the API some time in the future. This would be great for DIY installations with an ESP32 hub.

Moosdijk · 2025-10-12T20:19:37 1760300377

No issues here on iPhone 12 running iOS 18.6.2 and Firefox 143.2 (62218)

Moosdijk · 2025-10-12T20:21:13 1760300473

The orbiting sensitivity is a bit high when zoomed in a lot, which can lead to the model spinning out of control, as the other user mentioned.

Still manageable though, just very sensitive.

Moosdijk · 2025-10-08T13:36:41 1759930601

>Here's a hot take: Name and Shame.

That's easier said than done, which is probably why Stefano didn't.

Moosdijk · 2025-09-26T08:45:12 1758876312

My experience with Gemini is the sole reason I am convinced that there's an AI hype going on. It consistently hallucinates key information, which has led me to spend countless hours tracking down which information the output was based on, only to find that it dreamt up the facts that it gave to me.

The way I have come to perceive AI is that it's better at reassuring/reaffirming people's beliefs and ideas than at being an actual source of truth.

That would not be an issue if it was actually marketed as such, but seeing the "guided learning" function fail time and again makes me think we should be a lot more critical of what we're being told by tech enthusiasts/companies about AI.