I work at the intersection of statistics, machine learning, and optimization, focusing primarily on the design and analysis of efficient statistical methods. My current research is on statistical optimal transport and the mathematical theory behind transformers.
MIT, Mathematics, 2020 -
MIT, Mathematics, 2016 - 2020
MIT, Mathematics, 2015 - 2016
Princeton, ORFE, 2008 - 2014
Georgia Tech, Mathematics, 2007 - 2008
Education & Training
Ph.D. in Mathematics
Univ. of Paris 6 (now Sorbonne Univ.) - 2006
M. Sc. in Statistics & Actuarial Science
ISUP - 2003
B. Sc. in Applied Mathematics
Univ. of Paris 6 (now Sorbonne Univ.) - 2002
Transformers and Self-Attention dynamics
In this work, we view the self-attention layers of the transformer architecture as an interacting particle system that exhibits long-time clustering behavior. Cluster locations are determined by the initial tokens, confirming that transformers learn context-aware representations. The system of ODEs and the associated PDE have a rich structure, and this is a topic we are actively working on.
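As a rough illustration of this viewpoint (a schematic sketch, not the exact equations of any specific paper), tokens can be modeled as particles $x_1,\dots,x_n$ on the sphere, each driven by an attention-weighted average of the others:

```latex
% Schematic self-attention dynamics: Q, K, V are the usual query, key,
% and value matrices, beta > 0 a temperature, and P^\perp_{x} the
% projection onto the tangent space of the sphere at x.
\dot{x}_i(t) \;=\; P^{\perp}_{x_i(t)}\!\left(
  \frac{1}{Z_i(t)} \sum_{j=1}^{n}
  e^{\beta \langle Q x_i(t),\, K x_j(t) \rangle}\, V x_j(t)
\right),
\qquad
Z_i(t) \;=\; \sum_{j=1}^{n} e^{\beta \langle Q x_i(t),\, K x_j(t) \rangle}.
```

Each token is pulled toward the tokens it attends to most strongly, which is the mechanism behind the long-time clustering described above.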
Wasserstein gradient flows
This talk gives an overview of our recent work on applications of Wasserstein gradient flows to problems arising in statistics and machine learning. The Wasserstein geometry and its extensions (notably Wasserstein-Fisher-Rao) provide a toolbox for developing particle-based optimization algorithms over probability measures. These ideas have been implemented in several examples, such as variational inference and nonparametric maximum likelihood estimation.
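To make the particle-based idea concrete, here is a minimal sketch (hypothetical names, not code from our work) of the simplest case: the Wasserstein gradient flow of the potential energy $F(\mu) = \int V\,d\mu$, under which each particle in an empirical measure simply follows the Euclidean gradient flow of $V$.

```python
import numpy as np

def wasserstein_gradient_flow(particles, grad_V, step=0.05, n_steps=200):
    """Discretize the Wasserstein gradient flow of F(mu) = int V dmu.

    For this energy, the flow acts on an empirical measure by moving
    each particle along dx/dt = -grad V(x) (forward Euler in time).
    """
    x = np.array(particles, dtype=float)
    for _ in range(n_steps):
        x = x - step * grad_V(x)
    return x

# Example: V(x) = |x|^2 / 2, so grad V(x) = x and all particles
# contract toward the minimizer of V at the origin.
x0 = np.array([[2.0, -1.0], [-3.0, 4.0], [0.5, 0.5]])
xT = wasserstein_gradient_flow(x0, lambda x: x)
```

Richer applications (variational inference, nonparametric MLE) replace this potential energy by a KL divergence or a log-likelihood functional, which adds interaction and diffusion terms to the particle update.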
Our group explores applications of novel mathematical ideas to biological data, including genomics data in collaboration with the Eric and Wendy Schmidt Center at the Broad Institute. Our past work has focused on using optimal transport and the Gromov-Wasserstein framework to combine multiple sources of data and we are currently exploring new tools for new applications, including spatial transcriptomics.
Interested in joining our group?
I cannot answer direct requests, but you are encouraged to explore the various opportunities at both the graduate and postdoc levels. Make sure to check this page regularly, especially in the Fall.
Postdoc: Foundations of Data Science Institute
I am a co-PI at FODSI, the Foundations of Data Science Institute. We are looking for postdocs starting September 2024. Please apply here and list me as one of your potential mentors.
Collaborative Research: CIF: Medium: Analysis and Geometry of Neural Dynamical Systems
TRIPODS: Foundations of Data Science Institute
BIGDATA:F: Statistical and Computational Optimal Transport for Geometric Data Analysis
The best way to contact me is via email.