
Also https://github.com/LostRuins/koboldcpp

The UI is relatively mature, as it predates llama. It includes upstream llama.cpp PRs, integrated AI Horde support, lots of sampling tuning knobs, easy GPU/CPU offloading, and it's basically dependency-free.



Yes. At first glance it looks like a Windows app, but it's actually very portable. It has some parameters for GPU offloading and extended context size that just work. It exposes an API endpoint. I use it on a workstation to serve larger LLMs locally and like the performance and ease of use.
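For anyone wondering what talking to that API endpoint looks like, here is a rough sketch using only the Python standard library. The port (5001), the `/api/v1/generate` path, and the request/response field names are my assumptions based on the KoboldAI-style API that koboldcpp serves, so check them against your own instance. `build_payload`, `extract_text`, and `generate` are hypothetical helper names, not part of koboldcpp.

```python
import json
import urllib.request

# Assumption: koboldcpp is running locally on port 5001; adjust for your setup.
BASE_URL = "http://localhost:5001"


def build_payload(prompt, max_length=80, temperature=0.7):
    """Build the JSON body for a generate request (field names are assumed)."""
    return {
        "prompt": prompt,
        "max_length": max_length,    # number of tokens to generate
        "temperature": temperature,  # sampling temperature
    }


def extract_text(response):
    """Pull the generated text out of a decoded response.

    Assumed response shape: {"results": [{"text": "..."}]}
    """
    return response["results"][0]["text"]


def generate(prompt, base_url=BASE_URL, **kwargs):
    """POST a prompt to the (assumed) generate endpoint and return the text."""
    body = json.dumps(build_payload(prompt, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        base_url + "/api/v1/generate",
        data=body,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as resp:
        return extract_text(json.load(resp))
```

With a model loaded and the server running, `print(generate("Once upon a time", max_length=40))` should print the continuation. Keeping the payload builder and response parser separate from the network call makes it easy to adjust either one if your version uses different field names.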



