People are talking about the new Llama 3.3 70B release, which generally outperforms Llama 3.1 70B and approaches the performance of Llama 3.1 405B: https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_3
However, something to note:
Llama 3.3 70B is provided only as an instruction-tuned model; a pretrained version is not available.
Is this the end of open-weight pretrained models from Meta, or is Llama 3.3 70B Instruct just a Llama 3.1 pretrained model with better instruction tuning?
Comparing the model cards:
3.1: https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md
3.3: https://github.com/meta-llama/llama-models/blob/main/models/llama3_3/MODEL_CARD.md
The same knowledge cutoff, the same amount of training data, and the same training time give me hope that it’s just a better fine-tune, maybe of Llama 3.1 405B.