Davide Baldelli

davide.baldelli@mila.quebec - davide.baldelli@etud.polymtl.ca

Hi, I’m Davide 😊. I am a second-year PhD student at MILA and Polytechnique Montréal, advised by Sarath Chandar and Amal Zouaq.

I am currently participating in MATS 10.0, an AI alignment fellowship, in the LawZero stream, where I am mentored by Damiano Fornasiere and Mirko Bronzi. We are working on introspection capabilities in language models.

Since the beginning of my PhD, I have worked on language models for computer-aided design, including building a large annotated CAD dataset and fine-tuning open-source code LLMs for text-driven sequential CAD generation. I then focused on memory for language-model agents, studying why standard chat-based agents cannot reliably generate and retain private state across interactions (for example, why they cannot properly play Hangman). More recently, I have worked on probabilistic calibration, using fine-tuning methods to address the problem that language models are often poorly calibrated random samplers.

Previously, I was a researcher at the National Institute of Informatics (Tokyo) under the supervision of Akiko Aizawa. I received an MSc in Artificial Intelligence from the University of Bologna in October 2023, where I was supervised by Paolo Torroni.

NEWS

May 20, 2026 I have been accepted to MATS 10.0 in the LawZero stream, where I will do research in Berkeley with Damiano Fornasiere and Mirko Bronzi.
May 12, 2026 Our paper “Probabilistic Calibration Is a Trainable Capability in Language Models” is now on arXiv.
Jan 11, 2026 Our paper “LLMs Can’t Play Hangman: On the Necessity of a Private Working Memory for Language Agents” is now on arXiv.
Dec 20, 2025 Our paper “CADmium: Fine-Tuning Code Language Models for Text-Driven Sequential CAD Design” has been accepted at TMLR!
Aug 15, 2025 I have been selected to join ARENA 7.0 (Alignment Research Engineer Accelerator), which will be held at LISA (London Initiative for Safe AI), London, UK, in January 2026.
Jul 13, 2025 Our paper “CADmium: Fine-Tuning Code Language Models for Text-Driven Sequential CAD Design” is now on arXiv (arXiv:2507.09792).
Jun 10, 2025 I was selected to participate in the Y Combinator AI Startup School in San Francisco.
Jan 15, 2025 I started my PhD at MILA and Polytechnique Montréal with Sarath Chandar and Amal Zouaq.
Dec 10, 2023 Our paper “TWOLAR: a TWO-step LLM-Augmented distillation method for passage Reranking” has been accepted at ECIR 2024!
Oct 01, 2023 I obtained my MSc in AI with the maximum grade from the University of Bologna.

SELECTED PUBLICATIONS

  1. Probabilistic Calibration Is a Trainable Capability in Language Models
    Davide Baldelli, Sruthi Kuriakose, Maryam Hashemzadeh, and 2 more authors
    arXiv preprint arXiv:2605.11845, 2026
  2. LLMs Can’t Play Hangman: On the Necessity of a Private Working Memory for Language Agents
    Davide Baldelli, Ali Parviz, Amal Zouaq, and 1 more author
    arXiv preprint arXiv:2601.06973, 2026
  3. TWOLAR: a two-step LLM-augmented distillation method for passage reranking
    Davide Baldelli, Junfeng Jiang, Akiko Aizawa, and 1 more author
    In European Conference on Information Retrieval, 2024
  4. CADmium: Fine-Tuning Code Language Models for Text-Driven Sequential CAD Design
    Prashant Govindarajan, Davide Baldelli, Jay Pathak, and 2 more authors
    Transactions on Machine Learning Research, 2025