
With the current 24B LLM it's 24 GB. I have no idea how far down you can go in GPU memory by using smaller models; you can set the model in server.py. I'm fairly sure 16 GB will work, but at some point it will probably fail.
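For what it's worth, a minimal sketch of what swapping the model in server.py might look like, assuming the model is chosen by a single name constant near the top of the file (the variable name and the model IDs below are illustrative, not taken from the project):

    # server.py (illustrative excerpt; actual variable name may differ)
    # Default: a 24B model, which needs roughly 24 GB of VRAM to load.
    # MODEL_NAME = "mistralai/Mistral-Small-24B-Instruct-2501"

    # Assumption: swapping in a smaller model (e.g. a 7B one) should
    # fit in 16 GB, but very small cards may still run out of memory.
    MODEL_NAME = "mistralai/Mistral-7B-Instruct-v0.3"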

