Research Group "Stochastic Algorithms and Nonparametric Statistics"

Research Seminar "Mathematical Statistics" Summer Semester 2026

22.04.2026 Eddie Aamari (École Normale Supérieure, Paris) & Arthur Stéphanovich (ENSAE-CREST, Paris)
3rd part of Mini-Course: "Flow-based generative models: Regularity, stability, and minimax rates"
29.04.2026 Nicola Gnecco (Imperial College London)
Extremes of structural causal models
The behavior of extreme observations is well-understood for time series or spatial data, but little is known if the data generating process is a structural causal model (SCM). We study the behavior of extremes in this model class, both for the observational distribution and under extremal interventions. We show that under suitable regularity conditions on the structure functions, the extremal behavior is described by a multivariate Pareto distribution, which can be represented as a new SCM on an extremal graph. Importantly, the latter is a sub-graph of the graph in the original SCM, which means that causal links can disappear in the tails. We further introduce a directed version of extremal graphical models and show that an extremal SCM satisfies the corresponding Markov properties. Based on a new test of extremal conditional independence, we propose two algorithms for learning the extremal causal structure from data. The first is an extremal version of the PC-algorithm, and the second is a pruning algorithm that removes edges from the original graph to consistently recover the extremal graph. The methods are illustrated on river data with known causal ground truth.
Organiser: Katarzyna Reluga
06.05.2026 Vladimir Spokoiny (WIAS Berlin)
Estimation of a smooth functional for inverse problems
13.05.2026 Holger Dette (Ruhr University Bochum)
Multiple change point detection in functional data with applications to biomechanical fatigue data
Injuries to the lower extremity joints are often debilitating, particularly for professional athletes. Understanding the onset of stressful conditions on these joints is therefore important in order to ensure prevention of injuries as well as individualised training for enhanced athletic performance. We study the biomechanical joint angles from the hip, knee and ankle for runners who are experiencing fatigue. The data is cyclic in nature and densely collected by body-worn sensors, which makes it ideal to work with in the functional data analysis (FDA) framework. We develop a new method for multiple change point detection for functional data, which improves the state of the art with respect to at least two novel aspects. First, the curves are compared with respect to their maximum absolute deviation, which leads to a better interpretation of local changes in the functional data compared to classical $L^2$-approaches. Second, as slight aberrations are often to be expected in human movement data, our method will not detect arbitrarily small changes but hunts for relevant changes, where the maximum absolute deviation between the curves exceeds a specified threshold, say $\Delta > 0$. We recover multiple changes in a long functional time series of biomechanical knee angle data, which are larger than the desired threshold $\Delta$, allowing us to identify changes purely due to fatigue. In this work, we analyse data from both a controlled indoor and an uncontrolled outdoor (marathon) setting.
20.05.2026 Johannes Schmidt-Hieber (University of Twente)
A new neural network architecture for learning convex functions
For small covariate dimension, shape constrained inference is a well-established topic within nonparametric statistics. For large covariate dimensions, one naturally wants to introduce machine learning based methods. Input convex neural networks (ICNNs) were designed as a network architecture to learn convex functions. In this talk, we introduce Hyper Input Convex Neural Networks (HyCNNs). HyCNNs combine the principles of Maxout networks with ICNNs to create a neural network that is always convex in the input, theoretically capable of leveraging depth, and performs reliably when trained at scale compared to ICNNs. Concretely, we prove that HyCNNs require exponentially fewer parameters than ICNNs to approximate quadratic functions up to a given precision. Through a series of synthetic experiments, we demonstrate that HyCNNs outperform existing ICNNs and MLPs in terms of predictive performance for convex regression and interpolation tasks. We further apply HyCNNs to learn high-dimensional optimal transport maps for synthetic examples and for single-cell RNA sequencing data, where they oftentimes outperform ICNN-based neural optimal transport methods and other baselines across a wide range of settings. For more details, see arxiv.org/pdf/2604.26942. This is joint work with Shayan Hundrieser and Insung Kong.
27.05.2026 Alexander Meister (Universität Rostock)
Asymptotic equivalence for nonparametric additive regression
We prove asymptotic equivalence of nonparametric additive regression and an appropriate Gaussian white noise experiment in which a multidimensional shifted Wiener process is observed, whose dimension equals the number of additive components. The shift depends on the additive components of the regression function and solely on the one- and two-dimensional marginal distributions of the covariates via an explicitly specified bounded but non-compact linear operator. The number of additive components is allowed to increase moderately with respect to the sample size. In the special case of pairwise independent components of the covariates, the white noise model decomposes into independent univariate processes. Moreover, we study approximation in some semiparametric setting where the operator splits into a multiplication operator and an asymptotically negligible Hilbert-Schmidt operator. This talk is based on joint work with Moritz Jirak (Universität Wien) and Angelika Rohde (Albert-Ludwigs-Universität Freiburg).
03.06.2026 N.N.

10.06.2026 Michael Sørensen (University of Copenhagen)
Recent developments in likelihood inference for stochastic differential equations
The complexity of likelihood inference for stochastic differential equations based on discrete time samples often necessitates the use of approximations or computational techniques. Approximate likelihood methods for high frequency data have often been used in financial econometrics, but these methods usually do not perform well for strongly nonlinear models. New developments of approximate likelihood methods based on splitting schemes are presented. These methods perform well also for strongly nonlinear models and at moderate sampling frequencies. Splitting schemes were originally introduced to solve ODEs and SDEs numerically, but in Pilipovic, Samson and Ditlevsen (2024) it was proposed to use them for statistical inference. In the talk, a more general approach is presented that is applicable to a broad class of diffusion models. The theory is developed in the framework of approximate martingale estimating functions, which provide approximations to the score function and estimators that are efficient for high frequency data. For Strang splitting, an approximate martingale estimating function of order 3 is obtained. Sometimes useful models with an explicit likelihood function can be found. This enables exact likelihood inference, which works at all sampling frequencies. As an example of this, a class of stochastic differential equation models on the torus is presented, which can be used to analyse time series of angular data. These diffusion processes are ergodic and time-reversible and can be constructed for any pre-specified stationary distribution on the torus. If time permits, applications to biological data will be briefly presented. The lecture is based on joint work with Susanne Ditlevsen, Adeline Samson and Eduardo García-Portugués. Reference: Pilipovic, P., Samson, A. and Ditlevsen, S. (2024): Efficient estimation for ergodic diffusion processes sampled at high frequency. Ann. Statist., 52, 842-867.
17.06.2026 Anna Calissano (University College London)
Statistical analysis of spatial graphs
A spatial graph is a specific type of graph with spatial attributes associated with the nodes and the edges. It is a smart modelling choice for capturing the skeleton of a shape, a blood vessel network, a porous tissue, and many other data objects with intrinsically complex geometry. In this talk, we describe how spatial graphs can be analysed using a specific metric (the Fused Gromov–Wasserstein metric). We extend a testing procedure between distributions of spatial graphs, a depth measure to describe the distribution of spatial graphs, and a dimensionality reduction procedure based on preserving key topological features. We present this variety of methods on a dataset of cardiac fibrosis tissue and on a dataset of fungus mycelium networks.
24.06.2026 Bertrand Even (Université Paris-Saclay)

01.07.2026
HVP 11 a, R.313
08.07.2026 Gilles Blanchard (Université Paris-Saclay)
HVP 11 a, R.313
15.07.2026



last reviewed: May 26, 2026 by Christine Schneider