This is absolutely 3dfx level of screwing over consumers and all about just faking frames to get their “performance”.
They aren’t making graphics cards anymore, they’re making AI processors that happen to do graphics using AI.
What if I’m buying a graphics card to run Flux or an LLM locally? Aren’t these cards good for those use cases?
Oh yeah for sure, I’ve run Llama 3.2 on my RTX 4080 and it struggles but it’s not obnoxiously slow. I think they are betting more software will ship with integrated LLMs that run locally on users’ PCs instead of relying on cloud compute.
Welcome to the future
Except you cannot use them for AI commercially, or at least not in a data center setting.
Data centres want the even beefier cards anyhow, but I think Nvidia envisions everyone running local LLMs on their PCs, because the models will be integrated into software instead of relying on cloud compute. My RTX 4080 can struggle through Llama 3.2.
“T-BUFFER! MOTION BLUR! External power supplies! Wait, why isn’t anyone buying this?”