Qinxin Yan (Princeton): Implicit regularization of early stopping for gradient descent: a mean field control formulation

Qinxin Yan, Princeton University

Event Date
2026-03-31
Event Time
3:30 pm – 4:30 pm
Event Location
Wachman 617

It is widely observed that overparameterized neural networks, which often have more parameters than training samples, can interpolate the training data while still generalizing well. One theoretical approach to this phenomenon studies the dynamics of gradient-based training. Empirically, and theoretically in several settings, gradient descent converges to particular "simpler" solutions among the many minimizers, a bias commonly referred to as implicit regularization. Early stopping during training can further reduce the effective model complexity and often improves generalization.
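As a minimal illustration of this phenomenon (not the speaker's setting), consider gradient descent on overparameterized least squares from zero initialization: the parameter norm grows monotonically along the trajectory, so stopping at the iterate with the best held-out error selects a smaller-norm, effectively simpler model than the interpolating end point. All names and problem sizes below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overparameterized least squares: more features (d) than samples (n).
n, d = 20, 100
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:5] = 1.0                                 # simple ground truth
y = X @ w_true + 0.5 * rng.normal(size=n)        # noisy training labels

X_val = rng.normal(size=(200, d))                # held-out data for early stopping
y_val = X_val @ w_true

w = np.zeros(d)                                  # zero initialization
lr = 1e-3
norms, val_errs = [], []
for t in range(5000):
    grad = X.T @ (X @ w - y) / n                 # gradient of the mean squared error
    w -= lr * grad
    norms.append(np.linalg.norm(w))
    val_errs.append(np.mean((X_val @ w - y_val) ** 2))

best_t = int(np.argmin(val_errs))                # early-stopping iteration
print(f"early-stopped at iteration {best_t} of 5000")
print(f"validation error at stop: {val_errs[best_t]:.3f} vs at end: {val_errs[-1]:.3f}")
print(f"parameter norm at stop: {norms[best_t]:.2f} vs at end: {norms[-1]:.2f}")
```

Since the iterates move monotonically toward the minimum-norm interpolator, the early-stopped solution has a smaller norm, mirroring the way early stopping acts as an implicit regularizer.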

In this talk, we adopt the mean-field formulation of wide neural networks, representing the network by a probability measure over parameters and viewing training as a gradient flow on Wasserstein space. Building on this viewpoint, we introduce a mean-field control formulation of the training dynamics. This control perspective, together with the dynamic programming principle, leads to a mean-field analogue of the Wasserstein-2 distance and provides a framework for analyzing early stopping and implicit regularization.
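For orientation, the standard mean-field picture referenced here can be sketched as follows (generic notation; the talk's precise formulation may differ). A wide two-layer network is represented by a measure $\mu$ over parameters, and training follows the Wasserstein-2 gradient flow of the risk:

```latex
% A two-layer network as an integral over a parameter measure \mu:
%   f(x;\mu) = \int \varphi(x;\theta)\, d\mu(\theta),
% with risk R(\mu). In the wide-network limit, gradient descent on the
% parameters corresponds to the continuity equation
\[
  \partial_t \mu_t
  \;=\;
  \nabla_\theta \cdot
  \Big( \mu_t \, \nabla_\theta \frac{\delta R}{\delta \mu}(\mu_t, \theta) \Big),
\]
% i.e., the curve of steepest descent of R(\mu) with respect to the
% Wasserstein-2 metric on probability measures over parameters.
```

The control formulation in the talk builds on this dynamics by treating the training trajectory itself as the object to be steered.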