Hacker News | andy12_'s submissions
1. From Memorization to Reasoning in the Spectrum of Loss Curvature (arxiv.org)
65 points by andy12_ 34 days ago | 14 comments
2. Concrete "battery" developed at MIT now packs 10 times the power (news.mit.edu)
3 points by andy12_ 69 days ago | 2 comments
3. Gauss, an Agent for Autoformalization (math.inc)
6 points by andy12_ 3 months ago
4. Spurious Rewards: Rethinking Training Signals in RLVR (rethink-rlvr.notion.site)
1 point by andy12_ 6 months ago
5. VR-CLI: Learning to Reason for Long-Form Story Generation (arxiv.org)
2 points by andy12_ 7 months ago
6. Tokenformer: Rethinking transformer scaling with tokenized model parameters (arxiv.org)
3 points by andy12_ on Oct 31, 2024 | 1 comment
7. Selective Attention Improves Transformer (arxiv.org)
1 point by andy12_ on Oct 7, 2024 | 1 comment
8. The AdEMAMix Optimizer: Better, Faster, Older (arxiv.org)
2 points by andy12_ on Sept 10, 2024
