Announcement_022

OLMo 2 is here! In our latest paper, 🚗 2 OLMo 2 Furious 🔥, we discuss everything we’ve learned since OLMo 1, with deep dives into 🚖 stable pretraining and 🚔 mid-training, which uses learning rate annealing, data curricula, and model checkpoint averaging. Our training recipe is state-of-the-art in performance per training FLOP! 📈 Check out the blog post and download our 7B and 13B model weights, data, and more on HuggingFace!
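Of the mid-training ingredients above, model checkpoint averaging is the easiest to illustrate: the parameter tensors of several checkpoints of the same architecture are averaged elementwise. A minimal sketch (not the OLMo 2 implementation; the `average_checkpoints` helper and the toy NumPy state dicts are illustrative assumptions):

```python
import numpy as np

def average_checkpoints(state_dicts):
    """Elementwise average of parameter tensors across checkpoints.

    Assumes every state dict shares the same keys and tensor shapes,
    i.e. all checkpoints come from the same architecture.
    """
    keys = state_dicts[0].keys()
    return {k: np.mean([sd[k] for sd in state_dicts], axis=0) for k in keys}

# Toy "checkpoints": same layout, different weights.
ckpts = [
    {"w": np.array([1.0, 2.0]), "b": np.array([0.0])},
    {"w": np.array([3.0, 4.0]), "b": np.array([2.0])},
]
averaged = average_checkpoints(ckpts)
# averaged["w"] → [2.0, 3.0], averaged["b"] → [1.0]
```

In practice the same idea is applied to full model state dicts (e.g. PyTorch tensors) from checkpoints taken near the end of annealing, trading a little extra bookkeeping for a free accuracy bump with no inference-time cost.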



