

phkahler on Aug 26, 2023 | on: Deep Neural Nets: 33 years ago and 33 years from n...

>> The original network trained for 3 days on a SUN-4/260 workstation.

This is exactly why I didn't start experimenting with this stuff back then. I read some articles and had the interest, but having no access to existing training data or "fast" computers was a real showstopper. This article really convinced me that the amazing results today are mostly due to hardware advances. I will add my own view that 1) hardware will not advance anywhere near as much in the future, and 2) training and inference have to be done together, as in real brains. Then the AI will learn from experience while deployed, and you can clone the best ones later.

0xDEF on Aug 26, 2023

>This article really convinced me that the amazing results today are mostly due to hardware advances.

For LLMs that is true. But many other things, like Whisper and Stable Diffusion, could in theory have been made a decade earlier.
