Lucia Cipolina Kun

I am a Research Scientist at Meta’s FAIR lab, where I work on large language model (LLM) agents for scientific discovery. My research focuses on developing novel techniques for the self-improvement of LLM agents and their application to coding and reasoning. I develop frameworks combining reinforcement learning, multi-agent systems, and game theory to study emergent behaviors in post-trained LLMs, with a focus on designing systematic evaluation protocols that assess their strategic reasoning, planning, and behavioral stability.

My professional background includes roles as a Student Researcher at Google DeepMind, where I developed novel techniques for multi-agent decision processes, and as a Research Intern at Juelich Supercomputing Center, focusing on high-performance computing, large-scale generative models, and LLM evaluation. I have also interned at JPMorgan AI Research, optimizing reinforcement learning algorithms for trading, and at Microsoft AI, working on model-based offline reinforcement learning for Bing’s search and revenue management. Earlier in my career, I worked as a front office quant at Morgan Stanley, Bank of America, and HSBC in New York, implementing machine learning and diffusion models in C++.

My education includes an MSc in Mathematics from the Courant Institute at New York University, where I specialized in stochastic differential equations, and a PhD in Electrical Engineering from the University of Bristol.

I am also involved in a side project exploring AI-based art restoration, where I investigate how machine learning can help preserve and restore cultural heritage.

news

Aug 17, 2025	Our work on Game Reasoning Arena is out. See LLM agents compete, strategize and adapt in dynamic game environments.
Dec 15, 2024	Our paper on LLM reasoning, "Alice in Wonderland: Simple Tasks Reveal Severe Generalization and Basic Reasoning Deficits in State-Of-the-Art Large Language Models", has been accepted at the NeurIPS 2024 Workshop on Scientific Methods for Understanding Deep Learning.
Jul 15, 2024	I have started an internship at Google DeepMind in the Game Theory and Agents team. My work will focus on multi-agent coordination.
Nov 07, 2023	Received the best student poster award at the First Multimodal AI Workshop in the UK.
May 23, 2023	I have started an internship at Tate Britain, where I am applying diffusion models to a virtual restoration of the museum’s pieces. The work is part of the first rehang of the Tate’s collection in ten years.

selected publications

Game Reasoning Arena: A Framework and Benchmark for Assessing Reasoning Capabilities of Large Language Models via Game Play

Lucia Cipolina-Kun, Marianna Nezhurina, and Jenia Jitsev

2025

The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements

Bingchen Zhao, Despoina Magka, Minqi Jiang, and 20 more authors

2025

Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models

M. Nezhurina, L. Cipolina-Kun, M. Cherti, and 1 more author

Science of DL workshop, NeurIPS, 2024
