Kyle Lo


research

I’m a research scientist at the Allen Institute for AI where I co-lead the OLMo project on open language modeling.

My current research focuses on large-scale pretraining of language models, with an emphasis on data curation and efficient experimentation. I’m also interested in methods for specializing language models to domains. Within AI for science and education, I gravitate towards human-AI interaction problems like sensemaking over large collections or augmented reading interfaces. Finally, I like building useful artifacts that support research, like open corpora and OCR tools.

me

I live in Seattle. When not working, I hang with my cat Belphegor and play board games (Robinson Crusoe, Aeon's End, Cthulhu: Death May Die, Hanabi) and video games (Baldur's Gate 3, Valheim, Slay the Spire, Noita, Vampire Survivors). I love D&D and just finished a four-year campaign in Eberron; I'm now embarking on a homebrew West Marches campaign while trying out other systems like Blades in the Dark. I'm a boba enthusiast, and my favorites in Seattle are Xing Fu Tang, TP Tea, and Mustache Milk Tea.

news

Oct 01, 2025 Presenting three works at COLM 2025! 🎉 Fluid Language Model Benchmarking on language model evaluation, LLMs as Research Tools with results from a large-scale survey, and our latest recipe for training open language models, 2 OLMo 2 Furious. We’re handing out Blu-ray discs with our OLMo 2 model weights and hosting an Ai2 event! Ready for some bagels 🥯. Also, I have a shiny new backpack to show off…
Jun 01, 2025 Molmo and PixMo received a Best Paper Honorable Mention at CVPR 2025! 🏆 Congrats to Matt, Chris and our Ai2 team!
May 15, 2025 Excited to welcome our 2025 interns: Mayee Chen, Yapei Chang, Amanda Bertsch, and Alexis Ross! 🎉
May 01, 2025 Organize the Web accepted to ICML 2025! Data isn’t just about “quality”: slice and dice by “topic” and “format” domains. Congrats Alex & see everyone in Vancouver 🇨🇦!
May 01, 2025 DrawEduMath won an Outstanding Paper Award at NAACL 2025! 🏆 Congrats to Lucy and the team!
Mar 13, 2025 We released our largest and best model yet! OLMo 2 32B is trained using the same recipe from 2 OLMo 2 Furious, with base model performance comparable to some of the best open-weight models like Qwen and Gemma. After instruction tuning, it’s the best fully open model to reach GPT-3.5/GPT-4o mini performance. Our blog post has more details. As always, download the model weights, data, and everything else on HuggingFace!
Jan 15, 2025 OLMoE accepted as an Oral (top 1.8% of 11.6K submissions) at ICLR 2025! Congrats Niklas. I’m also going to be giving a talk on data curation for OLMo 2 at the Data Problems for Foundation Models workshop. See you all in Singapore 🇸🇬!
Dec 01, 2024 Giving a tutorial on Opening the Language Model Pipeline at NeurIPS 2024 with my colleagues Akshita Bhagia and Nathan Lambert! We’ll cover data preparation, model training, and adaptation methods using open software and data. Excited to share tips, tricks, and otherwise inaccessible details from building OLMo!
Nov 26, 2024 OLMo 2 is here! In our latest paper 🚗 2 OLMo 2 Furious 🔥, we discuss everything we’ve learned since OLMo 1, with deep dives into 🚖 stable pretraining and 🚔 mid-training, which uses learning rate annealing, data curricula, and model checkpoint averaging. Our training recipe is state-of-the-art in performance per training FLOP! 📈 Check out the blog post and download our 7B and 13B model weights, data, and more on HuggingFace!
Oct 01, 2024 Excited that our Semantic Reader paper is published in Communications of the ACM! 🥳 This paper synthesizes our five years of AI and HCI research (50 researchers, 12 institutions) aimed at understanding reading challenges faced by scholars and how AI-powered intelligent interfaces can help. Check out the paper here!
Sep 25, 2024 Molmo is out! Molmo is our family of open, late-fusion image 👀 + text 💬 language models trained using a really high-quality dataset of images + dense captions / task demonstrations! ✅ Read the paper here, ✅ play with the model here, ✅ download the weights here, and ✅ look forward to our dataset release soon!
Sep 03, 2024 OLMoE is out! Our first mixture-of-experts model in the OLMo family 🎉 OLMoE has only 1B active params but matches the performance of larger dense models 🫨 and comes with: ✅ weights ✅ data ✅ code ✅ ckpts ✅ logs ✅ detailed paper! Download the weights here and read the paper here!
Aug 14, 2024 So proud to see both our OLMo and Dolma papers win 🏆 Best Paper awards 🏆 at ACL 2024 🇹🇭
Jul 25, 2024 Excited to be speaking at the Gen Law workshop at ICML 2024 in 🇦🇹! I’ll be sharing fun pretraining data curation stories from OLMo, and my slides have cats! 🐈
Jun 01, 2024 Welcome Summer 2024 interns! Excited to be working with Alex Wettig, Chaitanya Malaviya, Lucy Li, Rose Wang, and Vishakh Padmakumar!
May 16, 2024 Four papers accepted to ACL 2024! 🎉 Two papers on open language models: OLMo for models and Dolma for data. Two papers on evaluating long-text generation: InfoLossQA for omissions in medical summaries and KIWI for long-form QA over science papers. See y’all in Thailand! 🇹🇭
May 01, 2024 omg attending back-to-back conferences. ICLR 2024 in Vienna 🇦🇹 presenting BooookScore (Oral; top 1.2% of 7.2K submissions), evaluating discourse coherence in book-length summarization. CHI 2024 in Hawaii 🇺🇸 presenting two works on helping non-expert audiences understand research papers through AI: Paper Plain, an augmented reading interface over medical papers, and Know Your Audience, a large-scale user study on the benefits and pitfalls of plain language summarization.
Feb 01, 2024 Excited to release our first set of artifacts from the OLMo project 🥳 Want models? Download our open-source weights at 1B scale, plus a pair of 7B-scale weights trained on different hardware, on HuggingFace. We also open-source all our training and inference code. Learn more from our paper. Want data? Download all 3T tokens on HuggingFace. We also open-source all our dataset construction tools. Learn more from our paper.
Dec 12, 2023 Happy to be rounding out the year with a Best Paper Award 🏆 in the EMNLP 2023 System Demo track for PaperMage! Also presenting papers accepted to the EMNLP 2023 Main Conference and Findings on Decontextualizing Scientific Document Snippets, Tip-of-the-Tongue Retrieval, and Evaluating Multidocument Summarization with Retrieved Documents. Excited to see all my co-authors in Singapore!
Jun 15, 2023 Welcome Summer 2023 interns! Excited to be working directly with Orion Weller, Hyunji Lee, Fangyuan Xu, and Hang Jiang!
Apr 30, 2023 Having a pretty good April :) Best Paper Award at CHI 2023 🏆 (CiteSee) and Outstanding Paper Award at EACL 2023 🏆 (LongEval). Thanks and congrats to all my co-authors!
Jan 01, 2023 New year, new site!