I was an architect on the Anton 2 and 3 machines. The systolic arrays that computed pairwise interactions were a significant component of the chips, but there was also an enormous number of fairly normal-looking general-purpose (32-bit / 4-way SIMD) processor cores that we just programmed in C++.
I love stuff like this! I have a Kaypro 2/84, which supports a display resolution of 160x100 with colors ranging from green to 'less green' to black, and I went down a rabbit hole of trying to push the graphics: https://www.chrisfenton.com/exploring-kaypro-video-performan... - I was eventually able to get it to display (short!) 50 fps video clips.
I built a more whimsical version of this - my daughter and I basically built a 'junk robot' from a 1980s movie, told it 'you're an independent and free junk robot living in a yard', and let it go: https://www.chrisfenton.com/meet-grasso-the-yard-robot/
I did this like 18 months ago, so it uses a webcam + multimodal LLM to figure out what it's looking at, has a motor in its base to let it look back and forth, and uses a Python wrapper around another LLM as its 'brain'. It worked pretty well!
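Conceptually the control loop is only a few lines of Python. This is a rough sketch rather than the actual robot code, and describe_frame / ask_brain / turn_head are placeholders for whatever vision model, LLM wrapper, and motor driver you have:

    import time
    import cv2  # webcam capture

    PERSONA = "You're an independent and free junk robot living in a yard."

    def describe_frame(frame):
        # Placeholder: send the frame to a multimodal LLM, return its description.
        return "a pile of garden junk"

    def ask_brain(prompt):
        # Placeholder: call the 'brain' LLM (wrapped in Python), return its reply.
        return "look left"

    def turn_head(command):
        # Placeholder: drive the motor in the base to pan back and forth.
        print("turning:", command)

    def run_robot(camera_index=0, period_s=5.0):
        cam = cv2.VideoCapture(camera_index)
        while True:
            ok, frame = cam.read()
            if not ok:
                continue
            scene = describe_frame(frame)
            reply = ask_brain(f"{PERSONA}\nYou see: {scene}\nWhat do you do?")
            if "look" in reply.lower():
                turn_head(reply)
            time.sleep(period_s)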
Your article mentioned taking 4 minutes to process a frame. Considering how much image recognition software runs in real time, I find this surprising. I haven't used them myself, so maybe I'm not understanding, but wouldn't something like YOLO be better suited to this?
It uses an Intel N100, which is an extremely slow CPU. The model sizes that he's using would be pretty slow on a CPU like that. Moving up to something like the AMD AI Max 365 would make a huge difference, but would also cost hundreds of dollars more than his current setup.
Running something much simpler that only did bounding box detection or segmentation would be much cheaper, but he's running fairly full featured LLMs.
Yeah, I guess I was more thinking of moving to a bounding-box-only model. If it's OCRing, it's doing too much IMO (though OCR could also be interesting to run). Not my circus, not my monkeys, but it feels like the wrong way to determine roughly what the camera sees.
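A stock pretrained checkpoint would probably be enough for 'roughly what the camera sees'. Untested sketch, assuming the ultralytics package and one of its pretrained YOLO models:

    from ultralytics import YOLO

    model = YOLO("yolov8n.pt")    # nano model, small enough for a weak CPU
    results = model("frame.jpg")  # or pass a numpy frame straight from the webcam

    for box in results[0].boxes:
        label = model.names[int(box.cls)]
        confidence = float(box.conf)
        print(label, round(confidence, 2), box.xyxy.tolist())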
Mine was entirely mechanical (driven by punch cards and a hand-crank), and changed all of the pixels in parallel, but a lot of the mechanism development looked extremely familiar to me.
This is incredible! I can appreciate how much work it took to make this happen. Well done!
I was recently in the presence of some linotype machines from the 1800s and it's so good to be humbled by the achievements of people who came before us. That machine was so complex, I could barely begin to figure out how to manufacture one. Your discussion of looms reminds me of that!
If you enjoy Linotype machines, I'll suggest you watch 'Farewell, Etaoin Shrdlu', a documentary on the last night the New York Times ran its hot metal typesetting system.
I think a quad-CPU X-MP is probably the first computer that could have run (not trained!) a reasonably impressive LLM if you could magically transport one back in time. It supported a 4 GB (512 MWord) SRAM-based "Solid State Drive" with a transfer bandwidth of 2 GB/s, and delivered about 800 MFLOPS of CPU performance on something like a big matmul. You could probably run a 7B parameter model with 4-bit quantization on it with careful programming, and get a token every couple of seconds.
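Rough numbers behind the 'token every couple of seconds' guess, looking only at the SSD-bandwidth side (i.e. a lower bound from streaming the weights in for each token):

    params = 7e9
    bits_per_weight = 4
    weight_bytes = params * bits_per_weight / 8   # ~3.5 GB, fits in the 4 GB SSD
    ssd_bandwidth = 2e9                           # ~2 GB/s transfer rate

    print(weight_bytes / 1e9)                     # ~3.5 GB of weights
    print(weight_bytes / ssd_bandwidth)           # ~1.75 s/token just to stream them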
This sounds plausible and fascinating.
Let’s see what it would have taken to train a model as well.
Given an estimate of 6 FLOPs per parameter per training token, and a training set of roughly 300 billion tokens, training a 7B parameter model would require about 1.26×10^22 FLOPs. That translates to roughly 500,000 years on an 800 MFLOPS X-MP, far too long to be feasible.
Training a 100M parameter model would still take nearly 70 years.
However, a 7M-parameter model would only have required about six months of training, and a 14M one about a year, so let’s settle on 10 million. That’s already far more reasonable than the 300K model I mentioned earlier.
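Back-of-envelope sketch of that arithmetic, assuming (my assumption) the training tokens scale down proportionally with the parameter count:

    # 6 FLOPs per parameter per training token, ~800 MFLOPS sustained
    SECONDS_PER_YEAR = 3.156e7
    XMP_FLOPS = 800e6

    def train_years(params, tokens):
        return 6 * params * tokens / XMP_FLOPS / SECONDS_PER_YEAR

    print(train_years(7e9, 3e11))  # ~500,000 years for the 7B model
    print(train_years(7e6, 3e8))   # ~0.5 years for a 7M model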
Moreover, a 10M parameter model would have been far from useless. It could have performed decent summarization, categorization, basic code autocompletion, and even powered a simple chatbot with a short context. All of that in 1984 would have been pure sci-fi back in those days. And pretty snappy too, maybe around 10 tokens per second, if not a little more.
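The speed figure also checks out if you assume the usual ~2 FLOPs per parameter per generated token:

    params = 10e6
    flops_per_token = 2 * params    # ~2e7 FLOPs for one forward pass
    print(800e6 / flops_per_token)  # ~40 tokens/s at peak, so ~10/s at realistic efficiency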
Too bad we lacked the datasets and the concepts...
The Cray PVP line was also doing double-precision floating point, and could overlap vector memory operations with math operations. My guess is that you would need a microcontroller operating at several hundred MHz to beat a Cray-1 in practice. The later Cray-1/S and /M variants also supported a 10 Gbps link to an SSD of several hundred megabytes, which is hard to beat in a microcontroller.
I took a different approach by just making an FPGA-based multi-core Z80 setup. One core is dedicated to running a 'supervisor' CP/NET server, and all of the applications run on CP/NET clients and can run normal CP/M software. I built a 16-core version of this, and each CPU gets its own dedicated 'terminal' window, with all of the windowing handled by the display hardware (and ultimately controlled by the supervisor CPU). It's a fun 'what-if' architecture that works way better in practice than one might expect. It would have made an amazing mid-to-late 1980s machine.