Well, I run my own OpenWebUI + Ollama setup, installed with Docker Compose and running locally on my home server with an NVIDIA GPU, and I'm pretty happy with the overall result.

I have only installed local open-source models like gpt-oss, DeepSeek-R1, Llama (3.2, 4), Qwen3…

My use case is mostly asking questions about documentation while developing (details of programming-language syntax and such).

I have been running it for months now, and it occurred to me that it would be useful for the following tasks as well:

  • audio transcribing (voice messages to text)
  • image generation (logos, small art for my games and such)

I fiddled around a bit, but got nowhere.

How do you do that from the OpenWebUI web interface?

(I have never used Ollama directly, only through the OpenWebUI GUI.)
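For what it's worth, both features live in OpenWebUI's Admin Panel rather than per-chat: speech-to-text is under Settings → Audio (a local Whisper engine is available for transcribing voice messages), and image generation is under Settings → Images, where you point OpenWebUI at a separate backend such as AUTOMATIC1111 or ComfyUI, since Ollama itself only serves text models. A minimal sketch of what the compose addition might look like, where the service name and image tag are placeholders you would swap for whatever Stable Diffusion container you actually run:

```yaml
services:
  # Hypothetical Stable Diffusion backend alongside OpenWebUI/Ollama;
  # the image tag below is a placeholder, not a real published image.
  sd-webui:
    image: your-sd-webui-image:latest   # placeholder
    ports:
      - "7860:7860"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

With something like this running, the AUTOMATIC1111 base URL in OpenWebUI's Images settings would point at `http://sd-webui:7860` (assuming both services share a compose network).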

  • yellow [she/her]@lemmy.blahaj.zone · 13 days ago

    The entire point of an MoE is that the experts aren’t all activated on every single token. The only parts that run for every token are (AFAIK) the expert router and the attention layers.
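    To make that concrete, here is a toy top-k routing sketch (plain Python, illustrative names and shapes only, not any real model's code): the router scores all experts for each token, but only the k highest-gated expert FFNs actually execute; the rest stay idle.

    ```python
    import math
    import random

    def softmax(xs):
        # Numerically stable softmax over a list of scores.
        m = max(xs)
        exps = [math.exp(x - m) for x in xs]
        s = sum(exps)
        return [e / s for e in exps]

    def moe_layer(token, experts, router_weights, k=2):
        """Route one token through only its top-k experts.

        `experts` is a list of callables (the expert FFNs); only k of them
        are invoked per token -- the rest do no work, which is the point.
        """
        # Router: one score per expert (here a dot product with router weights).
        scores = [sum(t * w for t, w in zip(token, wvec)) for wvec in router_weights]
        gates = softmax(scores)
        # Pick the k highest-gated experts.
        topk = sorted(range(len(experts)), key=lambda i: gates[i], reverse=True)[:k]
        # Only these experts compute; output is their gate-weighted sum.
        out = [0.0] * len(token)
        total = sum(gates[i] for i in topk)
        for i in topk:
            y = experts[i](token)
            out = [o + (gates[i] / total) * yi for o, yi in zip(out, y)]
        return out, topk

    # Toy setup: 8 experts, each a simple elementwise scaling.
    random.seed(0)
    dim, n_experts = 4, 8
    experts = [(lambda s: (lambda x: [s * v for v in x]))(i + 1) for i in range(n_experts)]
    router_weights = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n_experts)]
    out, active = moe_layer([0.1, -0.2, 0.3, 0.5], experts, router_weights, k=2)
    print(f"{len(active)} of {n_experts} experts ran for this token")
    ```

    So the parameter count covers all experts, but the per-token compute only covers the router, attention, and the k selected experts.
    
    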