1-bit LLMs Could Solve AI’s Energy Demands

@floofloof@lemmy.ca · 30 days ago


@kromem@lemmy.world · edit-2 29 days ago

The network seems to build a virtualized, higher-dimensional network on top of its actual nodes, so the precision of each individual node really doesn’t matter much, as long as the quantization happens during pretraining.

If it’s done post-training, you’re degrading the precision of an already-encoded network, which is sometimes acceptable but always lossy. Done during pretraining, though, it actually seems to be a net improvement over higher-precision weights, even if you throw efficiency concerns out the window.

You can see this in the perplexity graphs in the BitNet b1.58 paper.
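
For anyone who wants to see the lossy post-training case concretely, here is a minimal sketch of rounding already-trained weights to the three values {-1, 0, +1} (three states is where "1.58 bits" comes from, since log2(3) ≈ 1.58). It scales by the mean absolute weight, then rounds and clips, which is my reading of the absmean scheme rather than the paper's actual code. The function names and the sample weights are made up for illustration, and it does not show the training-time version at all.

```python
def absmean_ternary(weights, eps=1e-8):
    """Quantize a flat list of floats to {-1, 0, +1} with one shared scale.

    Assumption: scale by the mean absolute value, then round and clip.
    """
    gamma = sum(abs(w) for w in weights) / len(weights)  # shared scale
    scale = gamma + eps                                  # avoid divide-by-zero
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, gamma


def dequantize(ternary, gamma):
    """Map ternary values back to floats using the shared scale."""
    return [t * gamma for t in ternary]


# Hypothetical "already trained" weights, invented for this example.
weights = [0.9, -0.05, 0.4, -1.2, 0.02, -0.6]

q, gamma = absmean_ternary(weights)
approx = dequantize(q, gamma)

# Mean squared error between the original and the reconstructed weights.
mse = sum((w - a) ** 2 for w, a in zip(weights, approx)) / len(weights)

print(q)    # ternary codes
print(mse)  # nonzero: information was lost by rounding after the fact
```

The nonzero error is the "always lossy" part: the network encoded something in those float values and rounding throws some of it away. When the quantization is in place during training, the weights are learned under that constraint from the start, so there is no higher-precision encoding to degrade.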

@lunar17@lemmy.world · 29 days ago

None of those words are in the bible

@kromem@lemmy.world · edit-2 28 days ago

No, but some alarmingly similar ideas are in the heretical stuff actually.