Data-driven Optimization and Control

Weierstrass Group

DE |
EN

Diese Seite auf Deutsch

Publications

Author:

Years:

Topic:

Type:

Show as:

List BibTeX

Articles in Refereed Journals

L. Liang, Z. Xu, K.-Ch. Toh, J.-J. Zhu, An inexact Halpern iteration with application to distributionally robust optimization, Journal of Optimization Theory and Applications, 206 (2025), pp. 58/1--58/41, DOI 10.1007/s10957-025-02744-y .
R.A. Vandermeulen, R. Saitenmacher, Generalized identifiability bounds for mixture models with grouped samples, IEEE Transactions on Information Theory, 70 (2024), pp. 2746--2758, DOI 10.1109/TIT.2024.3367433 .

Contributions to Collected Editions

E. Gladin, P. Dvurechensky, A. Mielke , J.-J. Zhu, Interaction-force transport gradient flows, in: Advances in Neural Information Processing Systems, Proceedings of NeurIPS 2024, the Thirty-Eighth Annual Conference on Neural Information Processing Systems, 2024, A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, C. Zhang, eds., 37 of Proceedings of NeurIPS, Curran Associates, Inc., 2024, pp. 14484--14508.
P. Dvurechensky, J.-J. Zhu, Analysis of kernel mirror prox for measure optimization, in: International Conference on Artificial Intelligence and Statistics, 2--4 May 2024, Palau de Congressos, Valencia, Spain, S. Dasgupta, S. Mandt, Y. Li, eds., 238 of Proceedings of Machine Learning Research, 2024, pp. 2350--2358.
Abstract
Kernel mirror prox and RKHS gradient flow for mixed functional Nash equilibrium Pavel Dvurechensky , Jia-Jie Zhu Abstract The theoretical analysis of machine learning algorithms, such as deep generative modeling, motivates multiple recent works on the Mixed Nash Equilibrium (MNE) problem. Different from MNE, this paper formulates the Mixed Functional Nash Equilibrium (MFNE), which replaces one of the measure optimization problems with optimization over a class of dual functions, e.g., the reproducing kernel Hilbert space (RKHS) in the case of Mixed Kernel Nash Equilibrium (MKNE). We show that our MFNE and MKNE framework form the backbones that govern several existing machine learning algorithms, such as implicit generative models, distributionally robust optimization (DRO), and Wasserstein barycenters. To model the infinite-dimensional continuous- limit optimization dynamics, we propose the Interacting Wasserstein-Kernel Gradient Flow, which includes the RKHS flow that is much less common than the Wasserstein gradient flow but enjoys a much simpler convexity structure. Time-discretizing this gradient flow, we propose a primal-dual kernel mirror prox algorithm, which alternates between a dual step in the RKHS, and a primal step in the space of probability measures. We then provide the first unified convergence analysis of our algorithm for this class of MKNE problems, which establishes a convergence rate of O(1/N ) in the deterministic case and O(1/√N) in the stochastic case. As a case study, we apply our analysis to DRO, providing the first primal-dual convergence analysis for DRO with probability-metric constraints.

Preprints, Reports, Technical Reports

M. Liero, A. Mielke, O. Tse, J.-J. Zhu, Evolution of Gaussians in the Hellinger--Kantorovich--Boltzmann gradient flow, Preprint no. 3198, WIAS, Berlin, 2025, DOI 10.20347/WIAS.PREPRINT.3198 .
Abstract, PDF (1615 kByte)
This study leverages the basic insight that the gradient-flow equation associated with the relative Boltzmann entropy, in relation to a Gaussian reference measure within the Hellinger--Kantorovich (HK) geometry, preserves the class of Gaussian measures. This invariance serves as the foundation for constructing a reduced gradient structure on the parameter space characterizing Gaussian densities. We derive explicit ordinary differential equations that govern the evolution of mean, covariance, and mass under the HK--Boltzmann gradient flow. The reduced structure retains the additive form of the HK metric, facilitating a comprehensive analysis of the dynamics involved. We explore the geodesic convexity of the reduced system, revealing that global convexity is confined to the pure transport scenario, while a variant of sublevel semi-convexity is observed in the general case. Furthermore, we demonstrate exponential convergence to equilibrium through Polyak--Łojasiewicz-type inequalities, applicable both globally and on sublevel sets. By monitoring the evolution of covariance eigenvalues, we refine the decay rates associated with convergence. Additionally, we extend our analysis to non-Gaussian targets exhibiting strong log-Lambda-concavity, corroborating our theoretical results with numerical experiments that encompass a Gaussian-target gradient flow and a Bayesian logistic regression application.
A. Mielke, J.-J. Zhu, Hellinger--Kantorovich gradient flows: Global exponential decay of entropy functionals, Preprint no. 3176, WIAS, Berlin, 2025, DOI 10.20347/WIAS.PREPRINT.3176 .
Abstract, PDF (508 kByte)
We investigate a family of gradient flows of positive and probability measures, focusing on the Hellinger--Kantorovich (HK) geometry, which unifies transport mechanism of Otto--Wasserstein, and the birth-death mechanism of Hellinger (or Fisher--Rao). A central contribution is a complete characterization of global exponential decay behaviors of entropy functionals under Otto--Wasserstein and Hellinger-type gradient flows. In particular, for the more challenging analysis of HK gradient flows on positive measures---where the typical log-Sobolev arguments fail---we develop a specialized shape-mass decomposition that enables new analysis results. Our approach also leverages the Polyak--Łojasiewicz-type functional inequalities and a careful extension of classical dissipation estimates. These findings provide a unified and complete theoretical framework for gradient flows and underpin applications in computational algorithms for statistical inference, optimization, and machine learning.
CH. Bayer, L. Pelizzari, J.-J. Zhu, Pricing American options under rough volatility using deep-signatures and signature-kernels, Preprint no. 3172, WIAS, Berlin, 2025, DOI 10.20347/WIAS.PREPRINT.3172 .
Abstract, PDF (650 kByte)
We extend the signature-based primal and dual solutions to the optimal stopping problem recently introduced in [Bayer et al.: Primal and dual optimal stopping with signatures, to ap- pear in Finance & Stochastics 2025], by integrating deep-signature and signature-kernel learning methodologies. These approaches are designed for non-Markovian frameworks, in particular en- abling the pricing of American options under rough volatility. We demonstrate and compare the performance within the popular rough Heston and rough Bergomi models.

Talks, Poster

J.-J. Zhu, Inclusive KL minimization: A Wasserstein-Fisher-Rao gradient flow perspective, AABI 2025 - 7th Symposium on Advances of Approximate Bayesian Inference, Singapur, Singapore, April 29, 2025.
J.-J. Zhu, Minimizing relative entropy with Hellinger-Kantorovich (a.k.a. Wasserstein-Fisher-Rao) gradient flows, Understanding Generalization in Deep Learning, February 19 - 21, 2025, Technische Universität München, Department of Mathematics, Burghausen, February 19, 2025.
J.-J. Zhu, Minimizing relative entropy with Hellinger-Kantorovich (a.k.a. Wasserstein-Fisher-Rao) gradient flows, March 13 - 14, 2025, Ludwig Maximilian University of Munich, Department of Mathematics, March 13, 2025.
J.-J. Zhu, P. Dvurechensky, Analysis of kernel mirror prox for measure optimization, 27th Conference on Artificial Intelligence and Statistics (AISTATS), Valencia, Spain, May 2 - 4, 2024.
J.-J. Zhu, Approximation and kernelization of gradient flow geometry: Fisher--Rao and Wasserstein, The Mathematics of Data: Workshop on Optimal Transport and PDEs, January 17 - 23, 2024, National University of Singapore, Institute for Mathematical Sciences, Singapore, January 22, 2024.
J.-J. Zhu, Flow and transport: Modern mathematical foundation for statistical machine learning, University of St. Gallen, Department of Economics, Switzerland, October 17, 2024.
J.-J. Zhu, From distributional ambiguity to gradient flows: Wasserstein, Fisher--Rao, and kernel approximation, Seminar, College of Management of Technology, École Polytechnique Fédérale de Lausanne, Switzerland, November 28, 2024.
J.-J. Zhu, Gradient flows and kernelization in the Hellinger--Kantorovich (a.k.a. Wasserstein--Fisher--Rao) space, EUROPT 2024, 21st Conference on Advances in Continuous Optimization, June 26 - 28, 2024, Lund University, Department of Automatic Control, Sweden, June 28, 2024.
J.-J. Zhu, Kernel approximation of Wasserstein and Fisher--Rao gradient flows, The Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024), December 10 - 15, 2024, Neural Information Processing Systems Foundation, Vancouver, Canada, December 16, 2024.
J.-J. Zhu, Kernel approximation of Wasserstein and Fisher--Rao gradient flows, Probability and Stochastic Analysis Seminar, École Polytechnique Fédérale de Lausanne, Switzerland, November 27, 2024.
J.-J. Zhu, Kernel approximation of Wasserstein--Fisher--Rao gradient flows, DFG SPP 2298 Annual Meeting, November 10 - 12, 2024, Ludwig-Maximilians-Universität München, Department Mathematik, Tutzingen, November 11, 2024.
J.-J. Zhu, Kernelization, approximation, and entropy dissipation of gradient flows, January 24 - 26, 2024, RIKEN Center for Advanced Intelligence Project (AIP), Saitama, Japan.
J.-J. Zhu, Transport and flow: The modern mathematics of distributional learning and optimization, Universität des Saarlandes, Saarland Informatics Campus, Saarbrücken, July 5, 2024.

External Preprints

E. Gladin, P. Dvurechensky, A. Mielke , J.-J. Zhu, Interaction-force transport gradient flows, Preprint no. arXiv:2405.17075, Cornell University, 2024, DOI 10.48550/arXiv.2405.17075 .
Abstract
This paper presents a new type of gradient flow geometries over non-negative and probability measures motivated via a principled construction that combines the optimal transport and interaction forces modeled by reproducing kernels. Concretely, we propose the interaction-force transport (IFT) gradient flows and its spherical variant via an infimal convolution of the Wasserstein and spherical MMD Riemannian metric tensors. We then develop a particle-based optimization algorithm based on the JKO-splitting scheme of the mass-preserving spherical IFT gradient flows. Finally, we provide both theoretical global exponential convergence guarantees and empirical simulation results for applying the IFT gradient flows to the sampling task of MMD-minimization studied by Arbel et al. [2019]. Furthermore, we prove that the spherical IFT gradient flow enjoys the best of both worlds by providing the global exponential convergence guarantee for both the MMD and KL energy.

Data-driven Optimization and Control

Weierstrass Group

Publications

Articles in Refereed Journals

Contributions to Collected Editions

Preprints, Reports, Technical Reports

Talks, Poster

External Preprints

Flexible Research Platform