Kyle Lo
research
I'm a research scientist at the Allen Institute for AI, where I co-lead the OLMo project on open language modeling.
My current research focuses on large-scale pretraining of language models, with an emphasis on data curation and efficient experimentation. I'm also interested in methods for specializing language models to domains, as well as AI for science and education, where I gravitate toward human-AI interaction problems like sensemaking over large collections and augmented reading interfaces. Finally, I like building useful artifacts that support research, like open corpora and OCR tools.
me
I live in Seattle. When not working, I hang out with my cat Belphegor and play board games (Robinson Crusoe, Aeon's End, Cthulhu: Death May Die, Hanabi) and video games (Baldur's Gate 3, Valheim, Slay the Spire, Noita, Vampire Survivors). I love D&D and just finished a four-year campaign in Eberron; now I'm embarking on a homebrew West Marches campaign while trying out other systems like Blades in the Dark. I'm a boba enthusiast, and my favorites in Seattle are Xing Fu Tang, TP Tea, and Mustache Milk Tea.
news
| Date | News |
|---|---|
| Oct 01, 2025 | Presenting three works at COLM 2025: Fluid Language Model Benchmarking on language model evaluation, LLMs as Research Tools with results from a large-scale survey, and our latest recipe for training open language models, 2 OLMo 2 Furious. We're handing out Blu-ray discs with our OLMo 2 model weights and hosting an Ai2 event! Ready for some bagels 🥯. Also, I have a shiny new backpack to show off… |
| Jun 01, 2025 | Molmo and PixMo received a Best Paper Honorable Mention at CVPR 2025! Congrats to Matt, Chris, and our Ai2 team! |
| May 15, 2025 | Excited to welcome our 2025 interns: Mayee Chen, Yapei Chang, Amanda Bertsch, and Alexis Ross! |
| May 01, 2025 | Organize the Web accepted to ICML 2025! Data isn't just about "quality"; slice and dice by "topic" and "format" domains. Congrats, Alex, and see everyone in Vancouver 🇨🇦! |
| May 01, 2025 | DrawEduMath won an Outstanding Paper Award at NAACL 2025! Congrats to Lucy and the team! |
| Mar 13, 2025 | We released our largest and best model yet! OLMo 2 32B is trained with the same recipe as 2 OLMo 2 Furious, with base model performance comparable to some of the best open-weight models like Qwen and Gemma. After instruction tuning, it's the best fully open model to reach GPT-3.5/GPT-4o mini performance. Our blog post says more. As always, download the model weights, data, and everything else on HuggingFace (a minimal loading sketch appears after the news table below)! |
| Jan 15, 2025 | OLMoE accepted as an Oral (top 1.8% of 11.6K submissions) at ICLR 2025! Congrats, Niklas! I'm also giving a talk on data curation for OLMo 2 at the Data Problems for Foundation Models workshop. See you all in Singapore 🇸🇬! |
| Dec 01, 2024 | Giving a tutorial on Opening the Language Model Pipeline at NeurIPS 2024 with my colleagues Akshita Bhagia and Nathan Lambert! We'll cover data preparation, model training, and adaptation methods using open software and data. Excited to share tips, tricks, and otherwise inaccessible details from building OLMo! |
| Nov 26, 2024 | OLMo 2 is here! In our latest paper, 2 OLMo 2 Furious 🔥, we discuss everything we've learned since OLMo 1, with deep dives into stable pretraining and mid-training, which uses learning rate annealing, data curricula, and model checkpoint averaging (sketched after the news table below). Our training recipe is state-of-the-art in performance per training FLOP! Check out the blog post and download our 7B and 13B model weights, data, and more on HuggingFace! |
| Oct 01, 2024 | Excited that our Semantic Reader paper is published in Communications of the ACM! 🥳 This paper synthesizes five years of our AI and HCI research (50 researchers, 12 institutions) aimed at understanding the reading challenges scholars face and how AI-powered intelligent interfaces can help. Check out the paper here! |
| Sep 25, 2024 | Molmo is out! Molmo is our family of open, late-fusion image + text language models, trained on a really high-quality dataset of images with dense captions and task demonstrations! Read the paper here, play with the model here, download the weights here, and look forward to our dataset release soon! |
| Sep 03, 2024 | OLMoE is out! Our first mixture-of-experts model in the OLMo family: OLMoE has only 1B active parameters but matches the performance of larger dense models (top-k routing, sketched after the news table below), and it comes released with weights, data, code, checkpoints, logs, and a detailed paper! Download the weights here and read the paper here! |
| Aug 14, 2024 | So proud to see both our OLMo and Dolma papers win 🏆 Best Paper awards 🏆 at ACL 2024 🇹🇭! |
| Jul 25, 2024 | Excited to be speaking at the GenLaw workshop at ICML 2024 in 🇦🇹! I'll be sharing fun pretraining data curation stories from OLMo, and my slides have cats! 🐱 |
| Jun 01, 2024 | Welcome Summer 2024 interns! Excited to be working with Alex Wettig, Chaitanya Malaviya, Lucy Li, Rose Wang, and Vishakh Padmakumar! |
| May 16, 2024 | Four papers accepted to ACL 2024! Two papers on open language models: OLMo for models and Dolma for data. Two papers on evaluating long-text generation: InfoLossQA for omissions in medical summaries and KIWI for long-form QA over science papers. See y'all in Thailand 🇹🇭! |
| May 01, 2024 | omg, attending back-to-back conferences. ICLR 2024 in Vienna 🇦🇹, presenting Booookscore (Oral; top 1.2% of 7.2K submissions), which evaluates discourse coherence in book-length summarization. CHI 2024 in Hawaii 🇺🇸, presenting two works on helping non-expert audiences understand research papers through AI: Paper Plain, an augmented reading interface over medical papers, and Know Your Audience, a large-scale user study on the benefits and pitfalls of plain language summarization. |
| Feb 01, 2024 | Excited to release our first set of artifacts from the OLMo project 🥳 Want models? Download our open-source weights on HuggingFace: one model at 1B scale and a pair at 7B scale trained on different hardware. We also open-source all our training and inference code. Learn more from our paper. Want data? Download all 3T tokens on HuggingFace. We also open-source all our dataset construction tools. Learn more from our paper. |
| Dec 12, 2023 | Happy to be rounding out the year with a Best Paper Award 🏆 in the EMNLP 2023 System Demo track for PaperMage! Also presenting EMNLP 2023 Main Conference and Findings papers on Decontextualizing Scientific Document Snippets, Tip-of-the-Tongue Retrieval, and Evaluating Multidocument Summarization with Retrieved Documents. Excited to see all my co-authors in Singapore! |
| Jun 15, 2023 | Welcome Summer 2023 interns! Excited to be working directly with Orion Weller, Hyunji Lee, Fangyuan Xu, and Hang Jiang! |
| Apr 30, 2023 | Having a pretty good April :) Best Paper Award at CHI 2023 🏆 (CiteSee) and an Outstanding Paper Award at EACL 2023 🏆 (LongEval). Thanks and congrats to all my co-authors! |
| Jan 01, 2023 | New year, new site! |
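
A few items above promise sketches, so they're collected here. First, for the OLMo 2 releases: a minimal sketch of loading a checkpoint with HuggingFace transformers. The model id below is illustrative and may not match the exact hub name, so check the Ai2 page on HuggingFace for current names.

```python
# Minimal sketch of loading an OLMo 2 checkpoint with HuggingFace transformers.
# The model id is an assumption based on the release naming; check the Ai2
# page on HuggingFace for the exact name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-0325-32B"  # illustrative id; verify on the hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation to sanity-check the download.
inputs = tokenizer("Language modeling is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```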
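Next, the model checkpoint averaging idea from the OLMo 2 mid-training recipe: a minimal sketch assuming plain PyTorch state dicts with floating-point parameters. The paths and helper name are illustrative, not our actual training code.

```python
# Minimal sketch of checkpoint averaging: take the element-wise mean of the
# parameters from several checkpoints of the same model. Illustrative only.
import torch

def average_checkpoints(paths):
    """Return a state dict whose tensors are the mean over the given checkpoints."""
    avg = None
    for path in paths:
        state = torch.load(path, map_location="cpu")
        if avg is None:
            # Copy the first checkpoint in float32 to accumulate precisely.
            avg = {k: v.float().clone() for k, v in state.items()}
        else:
            for k, v in state.items():
                avg[k] += v.float()
    return {k: v / len(paths) for k, v in avg.items()}

# e.g., average the last few checkpoints saved during a learning-rate anneal:
# model.load_state_dict(average_checkpoints(["step100.pt", "step200.pt", "step300.pt"]))
```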
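Finally, for OLMoE: a minimal sketch of top-k expert routing, which is why a mixture-of-experts model has far fewer active parameters per token than total parameters. The dimensions and the 2-of-8 routing below are illustrative, not OLMoE's actual configuration.

```python
# Minimal sketch of a top-k mixture-of-experts layer: a router scores every
# token against all experts, but only the top k experts run per token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_hidden=2048, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # per-token expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model)
            )
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (n_tokens, d_model)
        scores, idx = self.router(x).topk(self.k, dim=-1)  # route each token to k experts
        weights = F.softmax(scores, dim=-1)                # normalize the k routing weights
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e  # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Only k of n_experts run per token, so the active parameter count is a small
# fraction of the total parameter count.
x = torch.randn(4, 512)
print(TopKMoE()(x).shape)  # torch.Size([4, 512])
```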