Trying something new, going to pin this thread as a place for beginners to ask what may or may not be stupid questions, to encourage both the asking and answering.
Depending on activity level, I’ll either make a new one once in a while or I’ll just leave this one up forever to be a place to learn and ask.
When asking a question, try to make it clear what your current knowledge level is and where you may have gaps. That should help people provide more useful, concise answers!
I have two 3090 Turbo GPUs and it seems like oobabooga doesn’t split the load between the two cards when I try to run TheBloke/dolphin-2.7-mixtral-8x7b-AWQ.
Does anyone know how to make text generation webui use both cards? Do I need an NVLink bridge between the two cards?
You shouldn’t need NVLink. I’m wondering if it’s something to do with AWQ, since I know that exllamav2 and llama.cpp both support splitting in oobabooga.
I think you’re right. Saw a post on Reddit basically mentioning the same things I’m seeing.
It looks like autoawq supports it but it might be an issue with how oobabooga implements it or something…
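For anyone who wants to experiment with the split outside the webui, here is a rough sketch of how a per-GPU memory budget is usually written down. It assumes two 24 GB cards (the 3090s from the question) and a 2 GiB headroom per card, which is a guess and not a recommendation. The helper names `build_max_memory` and `to_gpu_split` are made up for illustration. The dictionary has the shape that Hugging Face's `from_pretrained(..., device_map="auto", max_memory=...)` accepts, and the comma-separated string has the shape the ExLlamav2 loader in text generation webui takes for its GPU split setting, as far as I know. Whether the AWQ loader honors either one is exactly the open question in this thread, so treat this as a starting point only.

```python
def build_max_memory(vram_gib_per_gpu, headroom_gib=2):
    """Map each GPU index to a usable-memory string such as '22GiB'.

    headroom_gib is an assumed safety margin for activations and the
    KV cache; tune it for your own setup.
    """
    if headroom_gib < 0:
        raise ValueError("headroom_gib must not be negative")
    max_memory = {}
    for index, vram_gib in enumerate(vram_gib_per_gpu):
        usable = vram_gib - headroom_gib
        if usable <= 0:
            raise ValueError(f"GPU {index} has no memory left after headroom")
        max_memory[index] = f"{usable}GiB"
    return max_memory


def to_gpu_split(vram_gib_per_gpu, headroom_gib=2):
    """Build a comma-separated per-GPU budget in GB, e.g. '22,22'."""
    return ",".join(str(vram_gib - headroom_gib) for vram_gib in vram_gib_per_gpu)


# Two RTX 3090 cards, 24 GB each.
max_memory = build_max_memory([24, 24])
gpu_split = to_gpu_split([24, 24])
print(max_memory)
print(gpu_split)

# Hypothetical usage with the transformers library (not run here):
# model = AutoModelForCausalLM.from_pretrained(
#     model_id, device_map="auto", max_memory=max_memory
# )
```

If the load still lands entirely on one card with a budget like this, that points at the loader rather than at your hardware, and trying a GPTQ/EXL2 or GGUF version of the same model with exllamav2 or llama.cpp is probably the quicker fix.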