Davide Baldelli

MILA
Polytechnique Montréal
Chandar Research Lab
LAMA-WeST Lab

Hi, I’m Davide 😊. I am a second-year PhD student at MILA and Polytechnique Montréal, advised by Sarath Chandar and Amal Zouaq.

I am currently participating in MATS 10.0, an AI alignment fellowship, in the LawZero stream, where I am mentored by Damiano Fornasiere and Mirko Bronzi. We are working on introspection capabilities in language models.

Since the beginning of my PhD, I have worked on language models for computer-aided design, including building a large annotated CAD dataset and fine-tuning open-source code LLMs for text-driven sequential CAD generation. I then focused on memory for language-model agents, studying why standard chat-based agents cannot reliably generate and retain private state across interactions (for example, why they cannot properly play Hangman). More recently, I have worked on probabilistic calibration, using fine-tuning methods to address the problem that language models are often poorly calibrated random samplers.

Previously, I was a researcher at the National Institute of Informatics (Tokyo) under the supervision of Akiko Aizawa. I received an MSc in Artificial Intelligence from the University of Bologna in October 2023, where I was supervised by Paolo Torroni.

NEWS

Jun 27, 2026	Our paper “LLMs Can’t Play Hangman: On the Necessity of a Private Working Memory for Language Agents” has been accepted to CoLLAs 2026 (Conference on Lifelong Learning Agents).
May 20, 2026	I have been accepted to MATS 10.0 in the LawZero stream, where I will do research in Berkeley with Damiano Fornasiere and Mirko Bronzi.
May 12, 2026	Our paper “Probabilistic Calibration Is a Trainable Capability in Language Models” is now on arXiv.
Jan 11, 2026	Our paper “LLMs Can’t Play Hangman: On the Necessity of a Private Working Memory for Language Agents” is now on arXiv.
Dec 20, 2025	Our paper “CADmium: Fine-Tuning Code Language Models for Text-Driven Sequential CAD Design” has been accepted at TMLR!
Aug 15, 2025	I have been selected to join ARENA 7.0 (Alignment Research Engineer Accelerator), which will be held at LISA (London Initiative for Safe AI), London, UK, in January 2026.
Jul 13, 2025	Our paper “CADmium: Fine-Tuning Code Language Models for Text-Driven Sequential CAD Design” is now on arXiv: arXiv:2507.09792.
Jun 10, 2025	I was selected to participate in the Y Combinator AI Startup School in San Francisco.
Jan 15, 2025	I started my PhD at MILA and Polytechnique Montréal with Sarath Chandar and Amal Zouaq.
Dec 10, 2023	Our paper “TWOLAR: a TWO-step LLM-Augmented distillation method for passage Reranking” has been accepted at ECIR 2024!
Oct 01, 2023	I obtained my MSc in AI with the maximum grade from the University of Bologna.

SELECTED PUBLICATIONS

Probabilistic Calibration Is a Trainable Capability in Language Models

Davide Baldelli, Sruthi Kuriakose, Maryam Hashemzadeh, and 2 more authors

arXiv preprint arXiv:2605.11845, 2026

arXiv Code

LLMs Can’t Play Hangman: On the Necessity of a Private Working Memory for Language Agents

Davide Baldelli, Ali Parviz, Amal Zouaq, and 1 more author

In Conference on Lifelong Learning Agents (CoLLAs), 2026

arXiv

TWOLAR: a TWO-step LLM-Augmented distillation method for passage Reranking

Davide Baldelli, Junfeng Jiang, Akiko Aizawa, and 1 more author

In European Conference on Information Retrieval, 2024

arXiv Code

CADmium: Fine-Tuning Code Language Models for Text-Driven Sequential CAD Design

Prashant Govindarajan, Davide Baldelli, Jay Pathak, and 2 more authors

Transactions on Machine Learning Research, 2025

arXiv Code