Makes me wonder how much better these systems would be at online learning than systems that require separate training and inference phases. That seems especially relevant in dynamic environments, for applications such as robotics.
Yeah, I don't think it's mentioned in the paper, but on the New Scientist podcast the lead author alluded to work currently being done to compare them. I'd reason that they're looking at wall-clock time, since ML systems can train in no time at all when run as a batch-accelerated process.
That's fascinating. It reminds me of this clip from Adam Curtis's All Watched Over By Machines of Loving Grace where they got lots of random people to play Pong in a theatre in the 90s:
So it looks like in the paper they're using a theoretical framework called the Free Energy Principle, developed by Prof Karl Friston at UCL, which postulates that brains have evolved to minimise the information surprise between their own internal generative model of the world and the world they actually observe.
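For reference, the quantity being minimised in the standard Free Energy Principle formulation (this is the textbook version, not copied from the paper) is the variational free energy:

```latex
F = \mathbb{E}_{q(z)}\left[\ln q(z) - \ln p(x, z)\right]
  = D_{\mathrm{KL}}\!\left(q(z)\,\|\,p(z \mid x)\right) - \ln p(x)
```

Here $q(z)$ is the system's internal (approximate) model of hidden causes $z$, and $x$ is the observed input. Since the KL term is non-negative, $F$ upper-bounds the "surprise" $-\ln p(x)$, so a system that minimises $F$ is implicitly making its observations less surprising under its own model.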
The paper uses this as the basis of the learning: it looks like the neurons are given an unpredictable (random electrical) stimulus when they miss the ball and a predictable stimulus when they hit it.