Kyle Lo


research

I’m a research scientist at the Allen Institute for AI where I co-lead the OLMo project on open language modeling.

My current research focuses on large-scale pretraining of language models, with an emphasis on data curation and efficient experimentation. I’m also interested in methods for specializing language models to domains. Within AI for science and education, I gravitate towards human-AI interaction problems like sensemaking over large collections or augmented reading interfaces. Finally, I like building useful artifacts that support research, like open corpora and OCR tools.

me

I live in Seattle. When not working, I hang with my cat Belphegor and play board games (Robinson Crusoe, Aeon's End, Cthulhu: Death May Die, Hanabi) and video games (Baldur's Gate 3, Valheim, Slay the Spire, Noita, Vampire Survivors). I love D&D and just finished a four-year campaign in Eberron; I'm now embarking on a homebrew West Marches campaign while trying out other systems like Blades in the Dark. I'm a boba enthusiast, and my favorites in Seattle are Xing Fu Tang, TP Tea, and Mustache Milk Tea.

news

Oct 01, 2025 Presenting three works at COLM 2025! 🎉 Fluid Language Model Benchmarking on language model evaluation, LLMs as Research Tools with results from a large-scale survey, and our latest recipe for training open language models, 2 OLMo 2 Furious. We’re handing out Blu-ray discs with our OLMo 2 model weights and hosting an Ai2 event! Ready for some bagels 🥯. Also, I have a shiny new backpack to show off…
Jun 01, 2025 Molmo and PixMo received a Best Paper Honorable Mention at CVPR 2025! 🏆 Congrats to Matt, Chris and our Ai2 team!
May 15, 2025 Excited to welcome our 2025 interns: Mayee Chen, Yapei Chang, Amanda Bertsch, and Alexis Ross! 🎉
May 01, 2025 Organize the Web accepted to ICML 2025! Data isn’t just about “quality”: slice and dice by “topic” and “format” domains. Congrats Alex & see everyone in Vancouver 🇨🇦!
May 01, 2025 DrawEduMath won an Outstanding Paper Award at NAACL 2025! 🏆 Congrats to Lucy and the team!
Mar 13, 2025 We released our largest and best model yet! OLMo 2 32B is trained using the same recipe from 2 OLMo 2 Furious, with base model performance comparable to some of the best open-weight models like Qwen and Gemma. After instruction tuning, it’s the best fully open model to reach GPT-3.5/GPT-4o mini performance. Our blog post has more details. As always, download the model weights, data, and everything else on HuggingFace!
Jan 15, 2025 OLMoE accepted as an Oral (top 1.8% of 11.6K submissions) at ICLR 2025! Congrats Niklas. I’m also going to be giving a talk on data curation for OLMo 2 at the Data Problems for Foundation Models workshop. See you all in Singapore 🇸🇬!
Dec 01, 2024 Giving a tutorial on Opening the Language Model Pipeline at NeurIPS 2024 with my colleagues Akshita Bhagia and Nathan Lambert! We’ll cover data preparation, model training, and adaptation methods using open software and data. Excited to share tips, tricks, and otherwise inaccessible details from building OLMo!
Nov 26, 2024 OLMo 2 is here! In our latest paper 🚗 2 OLMo 2 Furious 🔥, we discuss everything we’ve learned since OLMo 1, with deep dives into 🚖 stable pretraining and 🚔 mid-training, which uses learning rate annealing, data curricula, and model checkpoint averaging. Our training recipe is state-of-the-art in performance per training FLOP! 📈 Check out the blog post and download our 7B and 13B model weights, data, and more on HuggingFace!
Oct 01, 2024 Excited that our Semantic Reader paper is published in Communications of the ACM! 🥳 This paper synthesizes our five years of AI and HCI research (50 researchers, 12 institutions) aimed at understanding reading challenges faced by scholars and how AI-powered intelligent interfaces can help. Check out the paper here!
Sep 25, 2024 Molmo is out! Molmo is our family of open, late-fusion image 👀 + text 💬 language models trained using a really high-quality dataset of images + dense captions / task demonstrations! ✅ Read the paper here, ✅ play with the model here, ✅ download the weights here, and ✅ look forward to our dataset release soon!
Sep 03, 2024 OLMoE is out! Our first mixture-of-experts model in the OLMo family 🎉 OLMoE has only 1B active params but matches the performance of larger dense models 🫨 and comes with: ✅ weights ✅ data ✅ code ✅ ckpts ✅ logs ✅ detailed paper! Download the weights here and read the paper here!
Aug 14, 2024 So proud to see both our OLMo and Dolma papers win 🏆 Best Paper awards 🏆 at ACL 2024 🇹🇭
Jul 25, 2024 Excited to be speaking at the Gen Law workshop at ICML 2024 in 🇦🇹! I’ll be sharing fun pretraining data curation stories from OLMo, and my slides have cats! 🐈
Jun 01, 2024 Welcome Summer 2024 interns! Excited to be working with Alex Wettig, Chaitanya Malaviya, Lucy Li, Rose Wang, and Vishakh Padmakumar!
May 16, 2024 Four papers accepted to ACL 2024! 🎉 Two papers on open language models: OLMo for models and Dolma for data. Two papers on evaluating long-text generation: InfoLossQA for omissions in medical summaries and KIWI for long-form QA over science papers. See y’all in Thailand! 🇹🇭
May 01, 2024 omg attending back-to-back conferences. ICLR 2024 in Vienna 🇦🇹 presenting BooookScore (Oral; top 1.2% of 7.2K submissions), evaluating discourse coherence in book-length summarization. CHI 2024 in Hawaii 🇺🇸 presenting two works on helping non-expert audiences understand research papers through AI: Paper Plain, an augmented reading interface over medical papers, and Know Your Audience, a large-scale user study on the benefits and pitfalls of plain language summarization.
Feb 01, 2024 Excited to release our first set of artifacts from the OLMo project 🥳 Want models? Download our open-source weights at 1B scale, plus a pair of 7B-scale weights trained on different hardware, on HuggingFace. We also open-source all our training and inference code. Learn more from our paper. Want data? Download all 3T tokens on HuggingFace. We also open-source all our dataset construction tools. Learn more from our paper.
Dec 12, 2023 Happy to be rounding out the year with a Best Paper Award 🏆 in the EMNLP 2023 System Demo track for PaperMage! Also presenting papers accepted to the EMNLP 2023 Main Conference and Findings on Decontextualizing Scientific Document Snippets, Tip-of-the-Tongue Retrieval, and Evaluating Multidocument Summarization with Retrieved Documents. Excited to see all my co-authors in Singapore!
Jun 15, 2023 Welcome Summer 2023 interns! Excited to be working directly with Orion Weller, Hyunji Lee, Fangyuan Xu, and Hang Jiang!
Apr 30, 2023 Having a pretty good April :) Best Paper Award at CHI 2023 🏆 (CiteSee) and Outstanding Paper Award at EACL 2023 🏆 (LongEval). Thanks and congrats to all my co-authors!
Jan 01, 2023 New year, new site!