Tangent: is anyone using a 7900 XTX for local inference/diffusion? I finally installed Linux on my gaming PC, and about 95% of the time it's just sitting powered off, collecting dust. I'd love to put this card to work in some capacity.
I've been using it for a few years on Gentoo. There were challenges with the Python stack two years ago, but over the past year it's stabilized, and I can even do img2video, which is the most demanding local inference task I've tried so far.
Performance-wise, the 7900 XTX is still the most cost-effective way of getting 24 GB of VRAM that isn't a sketchy VRAM mod. And VRAM is the main performance barrier, since any LLM worth running will barely fit in memory.
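For a rough sense of why the 24 GB matters, some back-of-the-envelope math (the 1.2x overhead factor for KV cache and runtime is my own guess, not a measured number):

    # Rough VRAM estimate for a quantized LLM. Ballpark only:
    # real usage adds KV cache and runtime overhead, folded here
    # into a guessed fudge factor.

    def approx_vram_gb(params_billions, bits_per_weight, overhead=1.2):
        """Weight memory times a fudge factor for KV cache/overhead."""
        weight_bytes = params_billions * 1e9 * bits_per_weight / 8
        return weight_bytes * overhead / 1e9

    # A 24B model at 4-bit quantization fits comfortably in 24 GB;
    # at 8-bit it's already over budget, and fp16 is out of reach.
    for bits in (4, 8, 16):
        print(f"24B @ {bits}-bit: ~{approx_vram_gb(24, bits):.1f} GB")

That prints roughly 14 GB, 29 GB, and 58 GB, which is why 4-bit quants are the sweet spot for a 24 GB card.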
Highly suggest checking out TheRock. There's been a big rearchitecting of ROCm to improve the UX/quality.
I've only played with using a 7900 XT for locally hosting LLMs via ollama (this is on Windows, mind you) and things worked fine - e.g. devstral:24b was decently fast. I haven't had time to use it for anything even semi-serious though, so I can't comment on how useful it actually is.
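If anyone wants to poke at it programmatically, ollama exposes a local HTTP API on port 11434. A minimal Python sketch, assuming the daemon is running and the model has already been pulled:

    # Minimal query against a local ollama daemon's REST API.
    # Assumes ollama is running and the model was pulled
    # (e.g. `ollama pull devstral:24b`).
    import json
    import urllib.request

    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({
            "model": "devstral:24b",
            "prompt": "Write a Python one-liner to reverse a string.",
            "stream": False,  # one JSON blob instead of streamed chunks
        }).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])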
I bought one when they were pretty new and had issues with ROCm (IIRC I was getting kernel oopses due to GPU OOMs) when running LLMs. It worked mostly fine with ComfyUI unless I tried to do especially esoteric stuff. From what I've heard lately, though, it should work just fine.
I've done it with a 6800XT, which should be similar. It's a little trickier than with an Nvidia card (because everything is designed for CUDA) but doable.
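One non-obvious bit that makes it less tricky than it sounds: the ROCm build of PyTorch reuses the torch.cuda device API, so most CUDA-targeted code runs unmodified. A quick sanity check, assuming the ROCm build of PyTorch is installed:

    # ROCm builds of PyTorch expose AMD GPUs through the torch.cuda
    # namespace, so code written for CUDA mostly runs as-is.
    import torch

    print(torch.cuda.is_available())      # True on a working ROCm setup
    print(torch.cuda.get_device_name(0))  # reports the AMD card
    print(torch.version.hip)              # HIP version (None on CUDA builds)

    x = torch.randn(1024, 1024, device="cuda")  # "cuda" maps to the AMD GPU
    print((x @ x).sum().item())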
For LLMs, I just pulled the latest llama.cpp and built it. Haven't had any issues with it. This was quite recently though; things used to be a lot worse, as I understand it.
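If you'd rather drive it from Python than the CLI, the llama-cpp-python bindings wrap the same library. A minimal sketch, assuming the package was built with GPU offload enabled and you have a GGUF model on disk (the path is a placeholder):

    # Minimal llama-cpp-python usage. Assumes a GPU-enabled build
    # and a GGUF model at this path (placeholder, substitute your own).
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/model-q4_k_m.gguf",
        n_gpu_layers=-1,  # offload all layers to the GPU
        n_ctx=4096,       # context window size
    )
    out = llm("Q: What is the capital of France? A:", max_tokens=32)
    print(out["choices"][0]["text"])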