
Even in tensor-parallel mode? I thought it could only work if you're fine with stalling all but n GPUs for n users at any given moment.



