I find myself really appreciating what LLMs can do when it comes to helping with software and tech support. I am a pretty adept PC power user who is not a programmer and (until recently) had only a modest amount of experience with GNU/Linux. However, I have started to get into self-hosting my own FOSS apps and servers (started with OpenWebUI, now Jellyfin/Sonarr via Docker Compose, etc.). I'm also reading a book about the Linux command line and trying to decipher the world of black magic that is networking myself.
I have found that LLMs can really help with comprehension and troubleshooting. That said, lately I've been struggling to get good troubleshooting advice out of them, specifically for Docker container setups and networking issues.
I had been using Qwen3 Coder 480b, but tried out Claude Sonnet 4 recently and both have let me down a bit. They don’t seem to think systematically when offering troubleshooting tips (Qwen at least). I was hoping Claude would be better since it is an order of magnitude more expensive on OpenRouter, but so far it has not seemed so.
So, what LLM do you use for this type of work? Any other tips for using models as a resource for troubleshooting? I have been providing access to full logs etc and being as detailed as possible and still struggling to get good advice lately. I’m not talking full vibe coding here but just trying to figure out why my docker container is throwing errors etc. Thanks!
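For what it's worth, this is roughly the first-pass diagnostic dump I've been pasting in before asking anything (the service and network names are just from my own stack, swap in yours):

```sh
# first-pass diagnostics I paste into the chat (service/network names are from my stack)
docker compose ps                        # which containers are up, restarting, or exited
docker compose logs --tail=200 sonarr    # recent logs for the failing service
docker inspect sonarr --format '{{json .NetworkSettings.Networks}}'   # attached networks + IPs
docker network ls                        # what networks exist
docker network inspect bridge            # who's on a network, subnets, gateways
```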
Note: I did search and found a somewhat similar post from 6 months ago or so but it wasn’t quite as specific and because 6 months is half a lifetime in LLM development, I figured I’d post as well. Here’s the post in question in case anyone is curious to see that one.
Claude is fine but you REALLY have to hold its hand in order to get any sort of decent solution. If you don't, 9 times out of 10 it'll just make something up based purely on a forum or repo post. You have to tell it to provide sources so as to prevent the usual BS it'll spew out.
I found once you hold its hand and scold it a few times it will provide decent solutions, but by that point you've essentially turned it into a fancy search engine.
Appreciate you sharing your experience. With this being the case, and it being an order of magnitude more $$$ than Qwen3 Coder, I think I'll mostly steer clear for now. Not sure why this model seems to have such mindshare and dominance with programmers these days, honestly, other than that many in the West seem somewhat biased against Chinese models.
Mainly because of Claude Code.
CC is better than the web-based Claude, especially when it comes to actual coding, since it's embedded with whatever project you're working on.
Claude really excels when it's right in the thick of it with you. Thus, again, you REALLY have to hold its hand. I personally don't think it's as great as others make it out to be.
What you really want is a locally hostable 'researching' front end that gets the LLM to go out and search the web for documentation. Without good context, they're always 'guessing'.
I’m a little bit behind on these actually. But I do know Open Web UI’s research plugin has a bad reputation.
I definitely have been looking out for this for a while. Wanting to replicate GPT deep research but not seeing a great way to do this. I did see that there was an OWUI tool for this but it didn't seem particularly battle-tested, so I hadn't checked it out yet. I've been curious about how the new Tongyi Deep Research might be…
That said, specifically for troubleshooting somewhat esoteric (or at least quite bespoke in terms of configuration) software problems, I was hoping the larger coder focused models would have enough built-in knowledge to suss out the issues. Maybe I should be having them consistently augment their responses with web searches if this isn’t the case? I have not been clicking that button typically.
I do generally try to paste in or link as much of the documentation for whatever software I’m troubleshooting though.
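If augmenting with web searches is the way to go, I guess I could flip on Open WebUI's built-in search rather than a plugin. Something like this, though I gather the env var names have shifted between releases, so treat it as a sketch and check the docs for your version:

```sh
# sketch: Open WebUI with built-in web search pointed at a local SearXNG instance
# (env var names have changed across Open WebUI releases; verify for your version)
docker run -d -p 3000:8080 \
  -e ENABLE_WEB_SEARCH=true \
  -e WEB_SEARCH_ENGINE=searxng \
  -e SEARXNG_QUERY_URL="http://searxng:8080/search?q=<query>" \
  ghcr.io/open-webui/open-webui:main
```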
Prompt formatting (and the system prompt) is a huge thing, especially with models trained for 'tool use' a specific way, so be sure to keep that in mind. For example, if you want a long chain of steps, be sure to explicitly ask (though Qwen uses its thinking block quite gratuitously).
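As a rough example, a system prompt along these lines tends to nudge models toward stepwise troubleshooting (the wording is just a sketch to adapt, not anything canonical):

```sh
# sketch of a troubleshooting-oriented system prompt (wording illustrative; tune per model)
cat > system-prompt.txt <<'EOF'
You are a Linux/Docker troubleshooting assistant. Work systematically:
1. Restate the symptom and list the most plausible causes, ranked.
2. Propose ONE diagnostic command at a time; wait for its output before concluding.
3. Cite the docs or source you are relying on, and say "not sure" instead of guessing.
EOF
```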
I find GLM 4.5's default formatting to be really good though: be sure to give that a shot. It's also awesome because the full 350B model (with some degradation) is locally runnable on a 128GB RAM + 24GB VRAM gaming rig, and the 'Air' version is quite fast and accurate on lesser hardware.
Local hosting, if you can swing it, is particularly nice because the calls are literally free, and prompt ingestion is cached, so you can batch them and spam the heck out of them for testing and such.
Yes, I do host several models locally. Mostly the Qwen3 family stuff like 30B A3B, etc. Have been trying GLM 4.5 a bit through OpenRouter and I've been liking the style pretty well. Interesting to know I could potentially just pop in some larger RAM DIMMs and run even larger models locally. The thing is, OR is so cheap for many of these models, and with zero data retention policies, I feel a bit stupid for even buying a 24 GB VRAM GPU to begin with.
Yeah, the APIs are super cheap. It doesn’t make a ton of sense unless you already have the GPU lying around.
With the right settings, GLM will actually work fine in 16GB, 12GB, or even 11GB VRAM + 128GB RAM. I can even make a custom quant if you want, since I already got that set up. 24 GB just gives it a bit of ‘breathing room’ for longer context and relaxed quantization for the dense parts of the model.
GLM Air will work on basically any modernish Nvidia GPU + like 26GB of free RAM. Its dense part is really small.
But to be clear, you have to get into the weeds to run them efficiently this way. There's no simple `ollama run` here.
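To give a flavor of the weeds: the usual trick is to offload everything to the GPU and then pin the MoE expert tensors back to system RAM, roughly like this (the file name, context size, and tensor regex are illustrative, so verify the flags against your llama.cpp build):

```sh
# sketch of running GLM Air via llama.cpp with MoE offload
# --n-gpu-layers 99 offloads all layers to the GPU by default, then
# -ot/--override-tensor pins the MoE expert tensors back to CPU/system RAM
llama-server -m GLM-4.5-Air-Q4_K_M.gguf \
  --n-gpu-layers 99 \
  -ot ".ffn_.*_exps.=CPU" \
  -c 16384
```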
I'm surprised you're getting disappointing results with Qwen3 Coder 480B. I run Qwen 2.5 Coder 14B locally (Open WebUI + Ollama) on my 3060 12GB and I've been pretty pleased with its answers so far relating to Python code, Django documentation/settings, and quirks with my reverse proxy.
I assume you aren't hosting the 480B locally, right? Are you using Open WebUI with an OpenRouter API key?
Honestly it has been good enough until recently, when I've been struggling specifically with Docker networking stuff, and it's been on the struggle bus with that. Yes, I'm using OpenRouter via Open WebUI. I used to run a lot of stuff locally (mostly 4-bit quants of 32B and smaller models, since I only have a single 3090), but lately I've been trying more larger models out on OpenRouter since many of the non-proprietary ones are super cheap. Like fractions of a penny for a response… Many are totally free up to a point as well.
I lately tried ChatGPT for some networking stuff, and occasionally I'll use AI Studio (Google) for similar things. And let's say they're all not great. They can do the relatively common (and somewhat easy) Linux stuff; I think they should be able to tell you how to manage your Docker containers and volumes at the command line. But I had ChatGPT massively struggle with networking, and the systemd service files it wrote had problematic stuff in them… My local LLMs are way too tiny to try. But there might just not be any properly good AI out there as of today. And their "reasoning" modes aren't like human reasoning or systematic approaches either. They just make up a lot of stuff, and that makes them a bit better; it's not logic, though.

What I end up doing is either fall back on my own brain, learn the stuff, and do it myself, or something like "vibe-coding": ask it 10-20 times, scold it, paste in the error messages, and eventually I'll get something that runs.
Btw, there’s still a human Linux community around. So maybe find your favorite Linux forum and ask there once it gets too complicated for AI.
Qwen3 or Qwen3 Coder? Qwen3 comes in 235B, 30B, and smaller sizes; Qwen3 Coder comes in 30B and 480B sizes.
OpenRouter has multiple quant options and, for coding, I'd try to only use 8-bit int or higher.
Claude also has a ton of sizes and deployment options with different capabilities.
As far as reasoning, the newest Deepseek V3.1 Terminus should be pretty good.
Honestly, all of these models should be able to help you up to a certain level with Docker. I would double-check how you connect to OpenRouter, make sure your hyperparameters are good, and make sure thinking/reasoning is enabled. Maybe try duck.ai and see if the models there match up to whatever you're doing in OpenRouter.
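For reference, a request with reasoning switched on and a quantization floor looks roughly like this (the model slug and exact field names are from memory, so double-check them against OpenRouter's current docs):

```sh
# rough OpenRouter request with reasoning enabled and a quantization floor
# (model slug and parameter names illustrative; verify against OpenRouter's docs)
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen3-coder",
    "reasoning": { "effort": "high" },
    "provider": { "quantizations": ["fp8", "bf16", "fp16"] },
    "temperature": 0.2,
    "messages": [{ "role": "user", "content": "Why does my container exit with code 137?" }]
  }'
```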
Finally, not being a hater, but LLMs are not intelligent. They cannot actually reason or think. They can probabilistically align with answers you want to see. Sometimes your issue might be too weird or new for them to be able to give you a good answer. Even today models will give you docker compose files with a version number at the top, a feature which has been deprecated for over a year.
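To illustrate (placeholder service; the commented-out line is what models keep generating):

```sh
# minimal sketch: the top-level `version:` key LLMs love to emit is obsolete in the Compose spec
cat > docker-compose.yml <<'EOF'
# version: "3.8"   <- what models keep generating; no longer needed
services:
  jellyfin:
    image: jellyfin/jellyfin
    ports:
      - "8096:8096"
EOF
docker compose config   # validates the file; warns if an obsolete `version:` is present
```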
Edit: gpt-oss 120 should be cheap and capable enough. Available on duck.ai
The Coder model (480B). I initially mistakenly said the 235B one but edited that. I didn't know you could customize quant on OpenRouter (and I thought the differences between most modern 4-bit quants and 8-bit were minimal as well…). I have tried gpt-oss 120 a bunch of times, and though it seems 'intelligent' enough, it is just too talkative and verbose for me (plus I can't remember the last time it responded without somehow working an elaborate comparison table into the response), and that makes it too hard to parse through things.
Totally. I think OSS is outright annoying with its verbosity. A system prompt will get around that.
I tried that! I literally told it to be concise and to limit its response to a certain number of words unless strictly necessary and it seemed to completely ignore both.
I think it’s more important how you run it.
I have Copilot in VS Code, and since I use it to SSH into things, the bot has access to all the files and my terminal output. It's also easy to switch from one model to another.
Even if this isn’t going to solve the issue of the quality of the LLM’s advice and help, it would massively simplify my current workflow which is copy/pasting logs and command responses and everything into the OWUI window. I’ll check it out. Can you use OpenRouter with VSCode to have access to more models or?
Yup, OpenRouter is one of the options, as well as Ollama and all the major APIs.
I pay $10 a month, so I get unlimited GPT-4.1, GPT-5 mini, and Grok. I also have OpenAI and Gemini through the API.
Surprisingly, Grok feels the best because it tends to make small changes at a time and will verify by running your scripts if you let it. It picks up on its own mistakes way more often, and it's also fast. Not the smartest but definitely the funnest.
You can probably get similar behavior by modifying the prompts for the other ones.
Is this Grok Code Fast 1? I've noticed it's been topping the programming charts on OR recently. I was going to try it out, but it won't respect my zero-data-retention preference, unsurprisingly.
Yup, precisely. It's easily the best free model on the $10 a month plan.