Tangent: is anyone using a 7900 XTX for local inference/diffusion? I finally installed Linux on my gaming PC, and about 95% of the time it's just sitting powered off, collecting dust. I'd love to put this card to work in some capacity.
I've been using it for a few years on Gentoo. There were challenges with the Python stack two years ago, but over the past year it's stabilized, and I can even do img2video, which is the most demanding local inference task I've tried so far.
Performance-wise, the 7900 XTX is still the most cost-effective way of getting 24 GB of VRAM that isn't a sketchy VRAM mod. And VRAM is the main performance barrier, since any LLM worth running will barely fit in memory.
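For a rough sense of why the 24 GB matters, some back-of-the-envelope math (the 1.2x overhead factor for KV cache and runtime is my own guess, not a measured number):

    # Rough VRAM estimate for a quantized LLM. Ballpark only:
    # real usage adds KV cache and runtime overhead, folded here
    # into a guessed fudge factor.

    def approx_vram_gb(params_billions, bits_per_weight, overhead=1.2):
        """Weight memory times a fudge factor for KV cache/overhead."""
        weight_bytes = params_billions * 1e9 * bits_per_weight / 8
        return weight_bytes * overhead / 1e9

    # A 24B model at 4-bit quantization fits comfortably in 24 GB;
    # at 8-bit it's already over budget, and fp16 is out of reach.
    for bits in (4, 8, 16):
        print(f"24B @ {bits}-bit: ~{approx_vram_gb(24, bits):.1f} GB")

That prints roughly 14 GB, 29 GB, and 58 GB, which is why 4-bit quants are the sweet spot for a 24 GB card.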
Highly suggest checking out TheRock. There's been a big rearchitecting of ROCm to improve the UX/quality.
I've only played with using a 7900 XT for locally hosting LLMs via ollama (this is on Windows, mind you) and things worked fine - e.g. devstral:24b was decently fast. I haven't had time to use it for anything even semi-serious though, so I can't comment on how useful it actually is.
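If anyone wants to poke at it programmatically, ollama exposes a local HTTP API on port 11434. A minimal Python sketch, assuming the daemon is running and the model has already been pulled:

    # Minimal query against a local ollama daemon's REST API.
    # Assumes ollama is running and the model was pulled
    # (e.g. `ollama pull devstral:24b`).
    import json
    import urllib.request

    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({
            "model": "devstral:24b",
            "prompt": "Write a Python one-liner to reverse a string.",
            "stream": False,  # one JSON blob instead of streamed chunks
        }).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])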
I bought one when they were pretty new and had issues with ROCm (IIRC I was getting kernel oopses due to GPU OOMs) when running LLMs. It worked mostly fine with ComfyUI unless I tried to do especially esoteric stuff. From what I've heard lately, though, it should work just fine.
I've done it with a 6800XT, which should be similar. It's a little trickier than with an Nvidia card (because everything is designed for CUDA) but doable.
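One non-obvious bit that makes it less tricky than it sounds: the ROCm build of PyTorch reuses the torch.cuda device API, so most CUDA-targeted code runs unmodified. A quick sanity check, assuming the ROCm build of PyTorch is installed:

    # ROCm builds of PyTorch expose AMD GPUs through the torch.cuda
    # namespace, so code written for CUDA mostly runs as-is.
    import torch

    print(torch.cuda.is_available())      # True on a working ROCm setup
    print(torch.cuda.get_device_name(0))  # reports the AMD card
    print(torch.version.hip)              # HIP version (None on CUDA builds)

    x = torch.randn(1024, 1024, device="cuda")  # "cuda" maps to the AMD GPU
    print((x @ x).sum().item())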
For LLMs, I just pulled the latest llama.cpp and built it. Haven't had any issues with it. This was quite recently though; things used to be a lot worse, as I understand it.
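If you'd rather drive it from Python than the CLI, the llama-cpp-python bindings wrap the same library. A minimal sketch, assuming the package was built with GPU offload enabled and you have a GGUF model on disk (the path is a placeholder):

    # Minimal llama-cpp-python usage. Assumes a GPU-enabled build
    # and a GGUF model at this path (placeholder, substitute your own).
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/model-q4_k_m.gguf",
        n_gpu_layers=-1,  # offload all layers to the GPU
        n_ctx=4096,       # context window size
    )
    out = llm("Q: What is the capital of France? A:", max_tokens=32)
    print(out["choices"][0]["text"])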