The UI is relatively mature, as it predates LLaMA itself. It includes upstream llama.cpp PRs, integrated AI Horde support, lots of sampling tuning knobs, easy GPU/CPU offloading, and it's basically dependency-free.
Yes. At first glance it looks like a Windows app, but it's actually very portable. It has parameters for GPU offloading and extended context size that just work, and it exposes an API endpoint. I use it on a workstation to serve larger LLMs locally, and I like the performance and ease of use.
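For anyone curious what using that endpoint looks like, here's a minimal sketch of calling the local server from a script. It assumes KoboldCpp-style defaults (port 5001 and the KoboldAI-compatible `/api/v1/generate` route with a `results[0].text` response shape); adjust the host, port, and payload fields to match your actual launch configuration.

```python
# Minimal sketch: query a locally served LLM over its HTTP API.
# Assumes KoboldCpp-style defaults (port 5001, KoboldAI-compatible
# /api/v1/generate route); adjust for your server's settings.
import json
import urllib.request

def generate(prompt: str, max_length: int = 128) -> str:
    payload = json.dumps({"prompt": prompt, "max_length": max_length}).encode()
    req = urllib.request.Request(
        "http://localhost:5001/api/v1/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Assumed response shape: generated text nested under results[0].text.
    return body["results"][0]["text"]

if __name__ == "__main__":
    print(generate("Explain GPU layer offloading in one sentence:"))
```

Because it's just plain HTTP with a JSON body, any client (curl, a chat frontend, another service) can talk to the same endpoint, which is what makes the workstation-as-local-server setup convenient.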