Announcement_022

OLMo 2 is here! In our latest paper, 🚗 2 OLMo 2 Furious 🔥, we discuss everything we’ve learned since OLMo 1, with deep dives into 🚖 stable pretraining and 🚔 mid-training, which uses learning rate annealing, data curricula, and model checkpoint averaging. Our training recipe is state-of-the-art in performance per training FLOP! 📈 Check out the blog post and download our 7B and 13B model weights, data, and more on HuggingFace!
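Of the mid-training ingredients above, model checkpoint averaging is the easiest to illustrate: the parameter tensors of several checkpoints of the same architecture are averaged elementwise. A minimal sketch (not the OLMo 2 implementation; the `average_checkpoints` helper and the toy NumPy state dicts are illustrative assumptions):

```python
import numpy as np

def average_checkpoints(state_dicts):
    """Elementwise average of parameter tensors across checkpoints.

    Assumes every state dict shares the same keys and tensor shapes,
    i.e. all checkpoints come from the same architecture.
    """
    keys = state_dicts[0].keys()
    return {k: np.mean([sd[k] for sd in state_dicts], axis=0) for k in keys}

# Toy "checkpoints": same layout, different weights.
ckpts = [
    {"w": np.array([1.0, 2.0]), "b": np.array([0.0])},
    {"w": np.array([3.0, 4.0]), "b": np.array([2.0])},
]
averaged = average_checkpoints(ckpts)
# averaged["w"] → [2.0, 3.0], averaged["b"] → [1.0]
```

In practice the same idea is applied to full model state dicts (e.g. PyTorch tensors) from checkpoints taken near the end of annealing, trading a little extra bookkeeping for a free accuracy bump with no inference-time cost.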



