
Yep, the Attention mechanism in the Transformer arch is pretty good.
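For anyone who wants the gist: here's a minimal sketch of scaled dot-product attention, the core of the Transformer. Single head, no masking, NumPy only; the names and shapes are just illustrative, not any particular library's API.

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # Q, K, V: (seq_len, d_k) arrays; one head, no masking.
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)  # how strongly each query matches each key
        # numerically stable softmax over the key dimension
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V               # each output is a weighted mix of the values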

We probably need another breakthrough of similar magnitude in model engineering before these increasingly complex networks get step-function better.

Moar data ain’t gonna help. The human brain is the proof: it doesn't need the internet’s worth of data to become good (nor all that much energy).


