Everybody’s talking about Mistral, an upstart French challenger to OpenAI

daredevil@kbin.social · 11 months ago

Everybody’s talking about Mistral, an upstart French challenger to OpenAI

cheese_greater@lemmy.world · edit-2 11 months ago

I wonder if it can be coaxed to talk shit about L’académie…🤔 That would be absolutement l’hilarité

rigatti@lemmy.world · 11 months ago

I’m not talking about Mistral. Wait… crap.

Taleya@aussie.zone · 11 months ago

As an Australian, i’m a fan

TheMurphy@lemmy.world · 11 months ago

This is the inevitable future of AI. In a few years there will be an AI model for almost anything in the world created by various companies.

This tool is too powerful to ignore.

Praise Idleness@sh.itjust.works · 11 months ago

They already got the holy AI you sons of a silly person!

AutoTL;DR@lemmings.world · 11 months ago

This is the best summary I could come up with:

Mistral, based in Paris and founded by Arthur Mensch, Guillaume Lample, and Timothée Lacroix, has seen a rapid rise in the AI space recently.

It has been quickly raising venture capital to become a sort of French anti-OpenAI, championing smaller models with eye-catching performance.

Mistral claims that it outperforms Meta’s much larger LLaMA 2 70B (70 billion parameter) large language model and that it matches or exceeds OpenAI’s GPT-3.5 on certain benchmarks, as seen in the chart below.

The speed at which open-weights AI models have caught up with OpenAI’s top offering a year ago has taken many by surprise.

It feels like the capability / reasoning power has made major strides, lagging behind is more the UI/UX of the whole thing, maybe some tool use finetuning, maybe some RAG databases, etc."

In the case of Mixtral 8x7B, the name implies that the model is a mixture of eight 7 billion-parameter neural networks, but as Karpathy pointed out in a tweet, the name is slightly misleading because, "it is not all 7B params that are being 8x’d, only the FeedForward blocks in the Transformer are 8x’d, everything else stays the same.

The original article contains 705 words, the summary contains 190 words. Saved 73%. I’m a bot and I’m open source!

Ashyr@sh.itjust.works · edit-2 8 months ago

Removed by mod

joneskind@lemmy.world · edit-2 11 months ago

I run it fine on a base model MacBook Air with 8Gb of RAM and absolutely crazy on a 30 GPU cores M2 Max. Didn’t try on my company’s M1 Pro but I will tomorrow.

I use the LMStudio app and download Mistral from there. The heavier model for my beefy Mac and a 3Gb one for the Air. GPU acceleration with Metal enabled.

I tried a lot of models for development purposes and this one blew my mind.

cheese_greater@lemmy.world · edit-2 11 months ago

Seriously? Might have to try it

Can you, like, “have” or keep it?

joneskind@lemmy.world · 11 months ago

You download the model and it’s on your computer for as long as you want. The whole point is to be able to use it locally.

cheese_greater@lemmy.world · edit-2 11 months ago

So it is entirely local? Schweet! How large is it (3GB for Air or something?)

joneskind@lemmy.world · 11 months ago

So it is entirely local? Absolutely

How large is it? 12 models of quantization, from 3.08GB to 7.70GB

I use mistral-7b-instruct-v0.1.Q3_K_L.gguf 3.82GB on the MBA

Note that it might crash sometimes during computation. Just push the button “reload” then “continue” and the model finish its sentence as if nothing happened. I don’t know if its related to MLStudio (the app using the model) or the model itself though.

bioemerl@kbin.social · 11 months ago

Mixtral GPTQ can run on a 3090

Mistral 7b can run on most modern gpus

joneskind@lemmy.world · edit-2 11 months ago

Oh boy, I missed Mixtral GPTQ and only tried Mistral 7b

Currently downloading mixtral-8x7b-v0.1.Q4_K_M.gguf

Thank you!

EDIT: mixtral-8x7b-v0.1.Q4_K_M.gguf was to heavy for my Mac but mixtral-8x7b-v0.1.Q3_K_M.gguf runs fine AF

bioemerl@kbin.social · 11 months ago

Be warned, prompt processing is slow

joneskind@lemmy.world · 11 months ago

It is indeed. I’m switching to the instruct model to see if I can get better results for code and documentation.

daredevil@kbin.social · 11 months ago

I’m looking forward to the day where these tools will be more accessible, too. I’ve tried playing with some of these models in the past, but my setup can’t handle them yet.

joneskind@lemmy.world · 11 months ago

You should definitely try Mistral. It runs on a potato

daredevil@kbin.social · edit-2 11 months ago

I’ll give it a shot later today, thanks

edit: Tried out mistral-7b-instruct-v0.1.Q4_K_M.ggufvia the LM Studio app. it runs smoother than I expected – I get about 7-8 tokens/sec. I’ll definitely be playing around with this some more later.

GBU_28@lemm.ee · 11 months ago

Are you running llama.cpp and a gguf format of the model?

daredevil@kbin.social · 11 months ago

I believe I was when I tried it before, but it’s possible I may have misconfigured things

GBU_28@lemm.ee · 11 months ago

Have you checked out llama-cpp-python? The API is very simple, from the readme

daredevil@kbin.social · 11 months ago

I haven’t, but I’ll keep this in mind for the future – thanks.

RmDebArc_5@lemmy.ml · 11 months ago

deleted by creator

iopq@lemmy.world · 11 months ago

For this one, you should be able to run it on anything with 8GB of VRAM. That said, it may not be fast. You will probably want a Turing or newer card with as much VRAM bandwidth as possible.

daredevil@kbin.social · 11 months ago

That’s good to know. I do have 8GB VRAM, so maybe I’ll look into it eventually.

ichbinjasokreativ@lemmy.world · 7 months ago

Something like mistral-dolphin (4GB) and mixtral-dolphin (26GB) are running very smoothly on my 6900xt on rocm 6

Everybody’s talking about Mistral, an upstart French challenger to OpenAI

Everybody’s talking about Mistral, an upstart French challenger to OpenAI

Mixture of experts