There’s a big difference between training a model, running a model, and running a model at scale.
A small, self hosted setup will have lower accuracy and queries per second, and it will have a cost, but the cost will be no more than playing a videogame. You’ll still have something surprisingly accurate and responsive for some tasks, like being a wiki interface or something.
Remember that some of these models can run on a standard smartphone, and all the hoopla when people found that chrome was downloading models onto people’s devices.
There’s a big difference between training a model, running a model, and running a model at scale.
A small, self hosted setup will have lower accuracy and queries per second, and it will have a cost, but the cost will be no more than playing a videogame. You’ll still have something surprisingly accurate and responsive for some tasks, like being a wiki interface or something.
Remember that some of these models can run on a standard smartphone, and all the hoopla when people found that chrome was downloading models onto people’s devices.