Research Group "Stochastic Algorithms and Nonparametric Statistics"
Research Seminar "Mathematical Statistics" WS 2022/23
|19.10.2022||Prof. Otmar Cronie (Chalmers University of Technology & University of Gothenburg)|
|10:00 am|| Point process learning: A cross-validation-based approach to statistics for point processes (hybrid talk)
Point processes are random sets which generalise the classical notion of a random (iid) sample by allowing i) the sample size to be random and/or ii) the sample points to be dependent. Point processes have therefore become ubiquitous in the modelling of spatial and/or temporal event data, e.g. earthquakes and disease cases. Motivated by cross-validation's general ability to reduce overfitting and mean squared error, in this talk we present a new cross-validation-based statistical theory for general point processes. It is based on the combination of two novel concepts for general point processes: cross-validation and prediction errors. Our cross-validation approach uses thinning to split a point process/pattern into pairs of training and validation sets, while our prediction errors measure the discrepancy between two point processes. The new statistical approach exploits the prediction errors to measure how well a given model predicts validation sets using the associated training sets. Due to its connection to the general idea of empirical risk minimisation, it is referred to as Point Process Learning. We discuss properties of the proposed approach and its components, and we illustrate how it may be applied in different spatial statistical settings. In (at least) one of these settings, we numerically show that it outperforms the state of the art.
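The thinning-based train/validation split can be sketched as follows. This is a minimal illustration, not the authors' exact construction: the pattern, the retention probability p and the number of splits are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for an observed point pattern: a homogeneous Poisson
# process on the unit square.
n = rng.poisson(200)
points = rng.uniform(0.0, 1.0, size=(n, 2))

def thinning_split(pattern, p, rng):
    """Independent p-thinning: each point is retained for the training
    set with probability p and otherwise goes to the validation set."""
    keep = rng.uniform(size=len(pattern)) < p
    return pattern[keep], pattern[~keep]

# Several independent thinning-based training/validation splits
p = 0.8
splits = [thinning_split(points, p, rng) for _ in range(10)]
train, valid = splits[0]
```

A prediction error would then compare a model fitted on `train` against the held-out `valid` pattern, averaged over the splits.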
|19.10.2022||Prof. David Frazier (Monash University, Australia)|
|11:00 am|| Guaranteed robustness via semi-modular posterior inference (hybrid talk)
Even in relatively simple settings, model misspecification can cause Bayesian inference methods to fail spectacularly. In situations where the underlying model is built by combining different modules, an approach to guard against misspecification is to employ cutting feedback methods. These methods modify conventional Bayesian posterior inference algorithms by artificially limiting the information flows between the (potentially) misspecified and correctly specified modules. By artificially limiting the flow of information when updating our prior beliefs, we essentially "cut" the link between these modules, and ultimately produce a posterior that differs from the exact posterior. However, it is generally unknown when one should prefer this "cut posterior" over the exact posterior. Rather than choosing a single posterior on which to base our inferences, we propose a new Bayesian method that combines both posteriors in such a way that we can guard against misspecification, and decrease posterior uncertainty. We derive easily verifiable conditions under which this new posterior produces inferences that are guaranteed to be more accurate than using either posterior by itself. We demonstrate this new method in a host of applications.
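A toy caricature of the cut-versus-exact trade-off (invented conjugate model, not the combination rule proposed in the talk): tempering the suspect module's likelihood by eta interpolates between the cut posterior (eta = 0) and the exact posterior (eta = 1), in the spirit of semi-modular inference.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy two-module model: module 1 informs phi via y1 ~ N(phi, 1);
# module 2 is misspecified and informs phi via y2 ~ N(phi + bias, 1).
phi_true, bias = 0.0, 2.0
y1 = rng.normal(phi_true, 1.0, size=50)
y2 = rng.normal(phi_true + bias, 1.0, size=50)

def posterior_mean_var(y1, y2, eta):
    """Gaussian conjugate posterior for phi under a flat prior, with the
    suspect module's likelihood tempered by eta in [0, 1]:
    eta = 1 gives the exact posterior, eta = 0 the cut posterior."""
    prec = len(y1) + eta * len(y2)
    mean = (y1.sum() + eta * y2.sum()) / prec
    return mean, 1.0 / prec

cut_mean, cut_var = posterior_mean_var(y1, y2, 0.0)
exact_mean, exact_var = posterior_mean_var(y1, y2, 1.0)
semi_mean, semi_var = posterior_mean_var(y1, y2, 0.3)
```

The cut posterior ignores the biased module and stays near the truth at the cost of a larger variance; the exact posterior is pulled toward the bias but is tighter. The talk's method chooses how to combine the two so as to guarantee accuracy.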
|02.11.2022||Johannes Schmidt-Hieber (University of Twente)|
| Overparametrization and the bias-variance dilemma
For several machine learning methods such as neural networks, good generalisation performance has been reported in the overparametrized regime. In view of the classical bias-variance trade-off, this behaviour is highly counterintuitive. The talk summarizes recent theoretical results on overparametrization and the bias-variance trade-off. This is joint work with Alexis Derumigny (Delft).
|09.11.2022||Claudia Schillings (FU Berlin)||HVP 11a in room 3.13||The convergence of the Laplace approximation and noise-level-robust computational methods for Bayesian inverse problems
The Bayesian approach to inverse problems provides a rigorous framework for the incorporation and quantification of uncertainties in measurements, parameters and models. We are interested in designing numerical methods which are robust w.r.t. the size of the observational noise, i.e., methods which behave well in case of concentrated posterior measures. The concentration of the posterior is a highly desirable situation in practice, since it relates to informative or large data. However, it can pose a computational challenge for numerical methods based on the prior measure. We propose to use the Laplace approximation of the posterior as the reference measure for the numerical integration and analyze the efficiency of Monte Carlo methods based on it.
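A minimal sketch of the idea on a one-dimensional linear-Gaussian toy problem (all numbers invented), where the Laplace approximation happens to be exact and therefore makes an ideal reference measure for importance sampling:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy Bayesian inverse problem: y = theta + noise with a small noise
# level and a standard normal prior, so the posterior concentrates.
sigma = 0.05          # observational noise level
y = 0.8               # observed data

def log_post(theta):
    # unnormalised log posterior: N(0, 1) prior + Gaussian likelihood
    return -0.5 * theta**2 - 0.5 * ((y - theta) / sigma) ** 2

# Laplace approximation: Gaussian centred at the MAP, covariance given
# by the inverse Hessian of the negative log posterior.
map_est = y / (1.0 + sigma**2)
lap_var = 1.0 / (1.0 + 1.0 / sigma**2)

# Importance sampling with the Laplace approximation as the reference
m = 20_000
theta = rng.normal(map_est, np.sqrt(lap_var), size=m)
log_q = -0.5 * (theta - map_est) ** 2 / lap_var
log_w = log_post(theta) - log_q
w = np.exp(log_w - log_w.max())
post_mean = np.sum(w * theta) / np.sum(w)

# Sampling from the prior instead would waste nearly all samples in this
# concentrated regime; the Laplace reference keeps the effective sample
# size close to m.
ess = np.sum(w) ** 2 / np.sum(w**2)
```

In this linear-Gaussian case the weights are constant, so the estimator is noise-level robust by construction; the talk analyses how much of this robustness survives for nonlinear forward maps.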
|16.11.2022||Aila Särkkä (Chalmers University of Technology and University of Gothenburg)|
|Anisotropy analysis and modelling of spatial point patterns
In the early spatial point process literature, observed point patterns were typically small and no repetitions were available. It was natural to assume that the patterns were realizations of stationary and isotropic point processes. Nowadays, large data sets with repetitions have become more and more common, and it is important to think about the validity of these assumptions. Non-stationarity has received quite a lot of attention in recent years, and it is straightforward to include it in many point process models. Isotropy, on the other hand, is often still assumed without further checking, and even though several tools have been suggested to detect anisotropy and test for isotropy, they have not been widely used. This talk will give an overview of nonparametric methods for anisotropy analysis of (stationary) point processes (see the references below). Methods based on nearest-neighbour and second-order summary statistics as well as on spectral and wavelet analysis will be discussed. The techniques will be illustrated on both a clustered and a regular example. In the second part of the talk, one of the methods will be used to estimate the deformation history in polar ice using the measured anisotropy of air inclusions from deep ice cores. In addition, an anisotropic point process model for nerve fiber data will be presented.
References:
Konstantinou K and Särkkä A (2022). Pairwise interaction Markov model for 3D epidermal nerve fiber endings. To appear in Journal of Microscopy.
Rajala T, Särkkä A, Redenbach C, and Sormani M (2016). Estimating geometric anisotropy in spatial point patterns. Spatial Statistics 15, 100--114.
Rajala T, Redenbach C, Särkkä A, and Sormani M (2018). A review on anisotropy analysis of spatial point patterns. Spatial Statistics 28, 141--168.
Rajala T, Redenbach C, Särkkä A, and Sormani M (2022). Tests for isotropy in spatial point patterns. Under revision.
Sormani M, Redenbach C, Särkkä A and Rajala T (2020). Second order directional analysis of point processes revisited. Spatial Statistics 38, 100456.
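One nonparametric anisotropy diagnostic of the kind surveyed in the talk can be illustrated numerically. This is a Fry-type orientation analysis on a synthetic cluster pattern; the pattern, the compression factor and the range r are all invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(12)

# Clustered pattern with geometric anisotropy: isotropic Gaussian
# offspring displacements around uniform parents, compressed vertically.
parents = rng.uniform(size=(50, 2))
disp = rng.normal(0.0, 0.03, size=(500, 2)) * np.array([1.0, 0.3])
pts = np.repeat(parents, 10, axis=0) + disp

# Fry-type directional analysis: orientations of difference vectors of
# point pairs closer than r are non-uniform under geometric anisotropy.
r = 0.1
diffs = pts[:, None, :] - pts[None, :, :]
d = np.linalg.norm(diffs, axis=-1)
i, j = np.where((d < r) & (d > 0))
ang = np.abs(np.arctan2(diffs[i, j, 1], diffs[i, j, 0]))
axis_ang = np.minimum(ang, np.pi - ang)        # angle to horizontal axis
horiz = np.mean(axis_ang < np.pi / 4)          # approx 0.5 under isotropy
```

Under isotropy the fraction `horiz` fluctuates around 1/2; for this vertically compressed pattern the short pair vectors align with the horizontal axis, so the fraction is markedly larger.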
|23.11.2022||Alexey Kroshnin (WIAS Berlin)|
|Robust k-means clustering in Hilbert and metric spaces
In this talk, we consider robust algorithms for the k-means clustering (quantization) problem, where a quantizer is constructed based on N independent observations. While the well-known asymptotic result by Pollard shows that the existence of two moments is sufficient for strong consistency of an empirically optimal quantizer in R^d, non-asymptotic bounds are usually obtained under the assumption of bounded support. We discuss a robust k-means procedure in Hilbert and metric spaces based on trimming, and prove non-asymptotic bounds on the excess distortion which depend on the probability mass of the lightest cluster and the second moment of the distribution.
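The trimming idea can be sketched with a Lloyd-style iteration in R^2. All constants, the deterministic initialisation and the trimming rule are invented for the illustration; the talk's estimator and its guarantees are more general.

```python
import numpy as np

rng = np.random.default_rng(3)

# Two well-separated Gaussian clusters plus a few gross outliers
X = np.vstack([
    rng.normal(0.0, 0.5, size=(100, 2)),
    rng.normal(5.0, 0.5, size=(100, 2)),
    rng.uniform(50, 60, size=(5, 2)),   # heavy contamination
])

def trimmed_kmeans(X, centres, trim=0.05, n_iter=50):
    """Lloyd-style iterations that discard the trim-fraction of points
    farthest from their nearest centre before updating the centres."""
    centres = centres.copy()
    n_keep = int((1 - trim) * len(X))
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        nearest = d2.argmin(1)
        keep = np.argsort(d2.min(1))[:n_keep]   # trim the farthest points
        for j in range(len(centres)):
            idx = keep[nearest[keep] == j]
            if len(idx):
                centres[j] = X[idx].mean(0)
    return centres

# One deterministic initial centre in each bulk region for this toy run
centres = trimmed_kmeans(X, X[[0, 100]], trim=0.05)
```

Without the trimming step, the outliers at distance ~50 would drag one centre far away from its cluster; with trimming, both centres settle near the true cluster means.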
|30.11.2022||Mikolaj Kasprzak (University of Luxembourg)|
|How good is your Laplace approximation? Finite-sample error bounds for a variety of useful divergences
The Laplace approximation is a popular method of approximating an intractable Bayesian posterior by a suitably chosen Gaussian distribution. But can we trust this approximation for practical use? Its theoretical justification comes from the celebrated Bernstein-von Mises theorem (also known as the Bayesian CLT or BCLT). However, an obstacle to its wider use is the lack of widely applicable post-hoc checks on its quality. Our work provides closed-form, finite-sample quality bounds for the Laplace approximation that simultaneously (1) do not require knowing the true parameter, (2) control posterior means and variances, and (3) apply generally to models that satisfy the conditions of the asymptotic BCLT. In fact, our bounds work even in the presence of misspecification. We compute exact constants in our bounds for a variety of standard models, including logistic regression, and numerically demonstrate their utility. We also provide a framework for the analysis of more complex models. This is joint work with Ryan Giordano (MIT) and Tamara Broderick (MIT). A preliminary version of the work is available here: https://arxiv.org/abs/2209.14992.
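The quantities the bounds control (posterior means and variances) can be checked numerically in a toy one-dimensional logistic regression. This is a brute-force quadrature sanity check with invented data, not the paper's bounds:

```python
import numpy as np

rng = np.random.default_rng(4)

# 1-D logistic regression: y_i ~ Bernoulli(sigmoid(theta * x_i)),
# with a standard normal prior on theta.
n = 200
x = rng.normal(size=n)
theta_true = 1.5
y = rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-theta_true * x))

def log_post(theta):
    z = np.outer(np.atleast_1d(theta), x)            # shape (m, n)
    ll = (y * z - np.log1p(np.exp(z))).sum(axis=1)
    return ll - 0.5 * np.atleast_1d(theta) ** 2

# Laplace approximation: Newton iterations for the MAP, then the
# inverse negative Hessian as the approximate posterior variance.
theta = 0.0
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-theta * x))
    grad = np.sum((y - p) * x) - theta
    hess = -np.sum(p * (1 - p) * x**2) - 1.0
    theta -= grad / hess
lap_mean, lap_var = theta, -1.0 / hess

# "Ground truth" posterior moments by quadrature on a fine grid
sd = np.sqrt(lap_var)
grid = np.linspace(lap_mean - 8 * sd, lap_mean + 8 * sd, 4001)
lp = log_post(grid)
w = np.exp(lp - lp.max())
w /= w.sum()
true_mean = np.sum(w * grid)
true_var = np.sum(w * (grid - true_mean) ** 2)
```

For higher-dimensional models such quadrature checks are infeasible, which is exactly why computable finite-sample bounds are useful.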
|07.12.2022||Matthias Vetter (Universität Kiel)|
|Cancelled due to unforeseen circumstances!|| On goodness-of-fit testing for point processes
Typical models for point processes like Hawkes processes or inhomogeneous Poisson processes are often of a parametric form, where the intensity function or an additional self-exciting component is known up to an unspecified parameter. A lot of research since the seminal paper by Ogata (1978) has been devoted to the estimation of these unknown parameters, but even in these rather standard models a consistent goodness-of-fit test has been missing. This talk aims to fill this gap. We will show how to formally set up a bootstrap procedure to allow for goodness-of-fit testing, and we will discuss how to prove consistency of the test in the (already quite involved) case of an inhomogeneous Poisson process.
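The shape of such a parametric bootstrap test can be sketched for a fitted homogeneous model (all parameters invented; for an inhomogeneous fit one would first time-rescale the events by the estimated cumulative intensity):

```python
import numpy as np

rng = np.random.default_rng(5)

# Observed pattern: a homogeneous Poisson process on [0, 1], so the
# fitted model is correct and the test should not reject.
rate = 100
times = np.sort(rng.uniform(0, 1, size=rng.poisson(rate)))

def ks_stat(t):
    """KS distance between the event times and the uniform law, i.e. the
    residual test statistic for a fitted constant intensity on [0, 1]."""
    n = len(t)
    ecdf_hi = np.arange(1, n + 1) / n
    ecdf_lo = np.arange(0, n) / n
    return max(np.max(ecdf_hi - t), np.max(t - ecdf_lo))

obs = ks_stat(times)

# Parametric bootstrap: refit the model, simulate from the fitted
# model, and recompute the statistic to approximate its null law.
rate_hat = len(times)
boot = []
for _ in range(500):
    sim = np.sort(rng.uniform(0, 1, size=rng.poisson(rate_hat)))
    boot.append(ks_stat(sim))
p_value = np.mean(np.array(boot) >= obs)
```

Proving that this bootstrap is consistent, i.e. that the simulated null distribution is valid when parameters are estimated, is the technical heart of the talk.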
|14.12.2022||Jovanka Lili Matic (HU, IRTG 1792)|
|Postponed to 15.02.23!||Global sensitivity analysis in the presence of missing values
|04.01.2023||Franz Besold (WIAS Berlin)|
| Adaptive weights community detection
Due to the technological progress of the last decades, Community Detection has become a major topic in machine learning. However, there is still a huge gap between practical and theoretical results, as theoretically optimal procedures often lack a feasible implementation and vice versa. This paper aims to close this gap and presents a novel algorithm that is both numerically and statistically efficient. Our procedure uses a test of homogeneity to compute adaptive weights describing local communities. The approach was inspired by the Adaptive Weights Community Detection (AWCD) algorithm by Adamyan et al. (2019). This algorithm delivered some promising results on artificial and real-life data, but our theoretical analysis reveals its performance to be suboptimal on a stochastic block model. In particular, the involved estimators are biased and the procedure does not work for sparse graphs. We propose significant modifications, addressing both shortcomings and achieving a nearly optimal rate of strong consistency on the stochastic block model. Our theoretical results are illustrated and validated by numerical experiments.
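A deliberately naive stand-in for the weight-based idea can be sketched on a stochastic block model. The homogeneity statistic and the spectral recovery step below are simplifications invented for the sketch; the actual procedure iterates test-based weight updates.

```python
import numpy as np

rng = np.random.default_rng(6)

# Stochastic block model: two communities, within-probability p,
# between-probability q.
n, p, q = 100, 0.5, 0.1
labels = np.repeat([0, 1], n // 2)
P = np.where(labels[:, None] == labels[None, :], p, q)
A = (rng.uniform(size=(n, n)) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                           # symmetric adjacency, no loops

# Crude "homogeneity" weights: nodes whose neighbourhoods overlap
# strongly are likely members of the same local community.
deg = A.sum(1)
overlap = A @ A / np.sqrt(np.outer(deg, deg))   # normalised common neighbours
w = (overlap > overlap.mean()).astype(float)    # thresholded weight matrix

# Recover communities from the leading eigenvector of the centred
# weight matrix (a stand-in for the iterative weight updates).
vals, vecs = np.linalg.eigh(w - w.mean())
guess = (vecs[:, -1] > 0).astype(int)
accuracy = max(np.mean(guess == labels), np.mean(guess != labels))
```

On this dense, well-separated instance even the naive weights recover the partition; the talk's contribution is a modified procedure with near-optimal guarantees, including for sparse graphs where such naive statistics fail.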
|18.01.2023||Tim Jahn (Universität Bonn)|
|Discretisation-adaptive regularisation of statistical inverse problems
We consider linear inverse problems under white (non-Gaussian) noise. We introduce a discretisation scheme to apply the discrepancy principle and the heuristic discrepancy principle, which require a bounded data norm. Choosing the discretisation dimension in an adaptive fashion yields convergence, without further restrictions on the operator, the distribution of the white noise, or the unknown ground truth. We discuss connections to Lepski's method and apply the technique to ill-posed integral equations with noisy point evaluations. We show that, in this setting, discretisation-adaptive regularisation can be used to reduce the numerical complexity.
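The discrepancy principle itself can be illustrated on a toy discretised problem, here with truncated SVD in place of the talk's adaptive discretisation (the operator, noise level and parameter tau are all invented):

```python
import numpy as np

rng = np.random.default_rng(7)

# Discretised ill-posed problem: random orthogonal factors with
# polynomially decaying singular values.
n = 200
U, _ = np.linalg.qr(rng.normal(size=(n, n)))
V, _ = np.linalg.qr(rng.normal(size=(n, n)))
s = 1.0 / np.arange(1, n + 1) ** 2
A = U * s @ V.T

x_true = V[:, 0] + 0.5 * V[:, 1]                 # smooth ground truth
delta = 1e-4                                     # noise level
y = A @ x_true + delta * rng.normal(size=n) / np.sqrt(n)

# Discrepancy principle for truncated SVD: increase the truncation
# dimension until the residual falls below tau * delta.
tau = 1.5
coeffs = U.T @ y
for m in range(1, n + 1):
    x_m = V[:, :m] @ (coeffs[:m] / s[:m])
    if np.linalg.norm(A @ x_m - y) <= tau * delta:
        break
err = np.linalg.norm(x_m - x_true)
```

The rule stops as soon as the data are fitted down to the noise level, which here selects a very small dimension and avoids amplifying noise through the tiny singular values.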
|25.01.2023||Maria Grith (Erasmus University Rotterdam)|
|The block-autoregressive model in non-standard bases
We propose a new modeling approach for univariate time series with component dynamics that correspond to different aggregation schemes. The model is called the block-autoregressive (BAR) model, as it is based on the application of a vector autoregressive model to univariate data that is partitioned into 'blocks' of observations. These blocks of observations are transformed into linear combinations by an orthonormal basis change, which can unveil linear dynamics that would otherwise be obscured in the time domain. Model selection and estimation are intertwined and involve a sequential regression and testing procedure. We assess the goodness of fit of the estimated model with a parametric bootstrap and an asymptotic chi-squared test. We rely upon a relationship between the covariance of the innovations in the time and transformed spaces to identify and estimate the latent orthonormal basis of a process. We discuss several simulated and empirical examples which demonstrate both the flexibility and the explanatory potential of the BAR model.
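A caricature of the block-and-transform construction, with block size 2 and a Haar-type basis (the data-generating process and all constants are invented; the talk's model class and the basis estimation step are more general):

```python
import numpy as np

rng = np.random.default_rng(8)

# Univariate series whose within-block averages follow an AR(1) at the
# block level, while within-block differences are pure noise.
T, b = 4000, 2                                   # series length, block size
x = np.zeros(T)
for t in range(2, T, 2):
    avg = 0.8 * (x[t - 2] + x[t - 1]) / 2
    x[t] = avg + rng.normal(0, 0.1)
    x[t + 1] = avg + rng.normal(0, 0.1)

# Partition into blocks and apply an orthonormal basis change
blocks = x.reshape(-1, b)
Q = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
z = blocks @ Q.T                                 # transformed block series

# Least-squares VAR(1) fit on the transformed blocks
Z0, Z1 = z[:-1], z[1:]
Phi = np.linalg.lstsq(Z0, Z1, rcond=None)[0].T
```

In the transformed coordinates the dynamics are diagonal: the (scaled) block average is AR(1) with coefficient 0.8 and the difference component is white noise, which a VAR fit on the raw blocks would express far less transparently.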
|01.02.2023||Radu Stoica (Université de Lorraine, Nancy)|
| Random structures and patterns in spatio-temporal data: probabilistic modelling and statistical inference
The useful information carried by spatio-temporal data is often outlined by geometric structures and patterns. Filaments or clusters induced by galaxy positions in our Universe are one such example. Two situations are to be considered. First, the pattern of interest is hidden in the data set, and hence has to be detected. Second, the structure to be studied is observed, and a relevant characterization of it should be provided. This talk is structured in four parts. The first part presents the construction of different marked point processes together with their properties, such that the characteristics of the patterns of interest are modelled by these processes. Second, MCMC dynamics tailored to simulating these models are presented, together with a discussion of the performance of these algorithms and a comparison with exact simulation methods. Third, on this basis, inference procedures are derived; these include level-set estimators, global optimisation and Approximate Bayesian Computation. Finally, applications to real cosmological and geological data are shown.
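Tailored MCMC dynamics of the kind discussed here can be illustrated with a textbook birth-death Metropolis-Hastings sampler for a Strauss-type pairwise interaction process (parameter values invented; the talk's models and dynamics are considerably richer):

```python
import numpy as np

rng = np.random.default_rng(9)

# Strauss-type model on [0,1]^2: density proportional to
# beta^n(x) * gamma^s(x), with s(x) the number of r-close pairs.
beta, gamma, r = 100.0, 0.5, 0.05

def n_close_pairs(pts, p):
    """Number of points of pts within distance r of the point p."""
    if len(pts) == 0:
        return 0
    return int(np.sum(np.linalg.norm(pts - p, axis=1) < r))

# Birth-death Metropolis-Hastings: propose adding or deleting a point
pts = rng.uniform(size=(0, 2))
for _ in range(20_000):
    n = len(pts)
    if rng.uniform() < 0.5:                      # birth proposal
        p = rng.uniform(size=2)
        ratio = beta * gamma ** n_close_pairs(pts, p) / (n + 1)
        if rng.uniform() < min(1.0, ratio):
            pts = np.vstack([pts, p])
    elif n > 0:                                  # death proposal
        i = rng.integers(n)
        others = np.delete(pts, i, axis=0)
        ratio = n * gamma ** (-n_close_pairs(others, pts[i])) / beta
        if rng.uniform() < min(1.0, ratio):
            pts = others
```

The interaction term gamma < 1 penalises close pairs, so the chain settles on a pattern that is more regular than a Poisson process with the same intensity parameter.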
|08.02.2023||Vanessa Didelez (Universität Bremen)|
|Causal reasoning and causal discovery with applications in epidemiology
Many data analyses ultimately aim at answering causal research questions: We may want to assess and quantify the potential effects of certain decisions, interventions or policies, e.g. will a sugar tax or more playgrounds reduce childhood obesity? Will participation in a special training programme for the unemployed increase the chances of finding employment? Is a national mammography screening programme actually helpful in preventing deaths from breast cancer? Such questions are about causal relations and go beyond mere prediction; indeed, methods that are optimised for prediction will often give biased results for causal targets. Especially when we use non-experimental, i.e. observational, data to try and answer questions about causal relations, tailored methods relying on specific assumptions are called for. The talk will review the main concepts, fundamental assumptions and basic principles for causal learning and focus on methods of causal discovery (aka structure learning). The latter have their roots in probabilistic approaches to artificial intelligence (AI) and, together with broader methods of causal inference in general, have recently seen a great revival in AI. This increased activity might be due to the realization "that many hard open problems of machine learning and AI are intrinsically related to causality" (Schölkopf, 2019). However, applications in epidemiology still pose a number of practical challenges; these include, for instance, handling incomplete, mixed, heterogeneous and temporal data. I will illustrate some of the methods, their issues and proposed solutions with the analysis of data from a children's cohort study.
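The point that prediction-optimal methods can be biased for causal targets can be demonstrated in a tiny simulated structural causal model (all coefficients invented): the naive regression coefficient absorbs the confounding path, while adjusting for the confounder recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(10)

# Confounded toy SCM: Z -> X, Z -> Y, and a true effect X -> Y of 1.0.
n = 100_000
Z = rng.normal(size=n)
X = Z + rng.normal(size=n)
Y = 1.0 * X + 2.0 * Z + rng.normal(size=n)

# Naive regression of Y on X is excellent for prediction but biased
# for the causal effect (population value 2.0 instead of 1.0) ...
naive = np.cov(X, Y)[0, 1] / np.var(X)

# ... while adjusting for the confounder Z (backdoor criterion)
# recovers the true effect of 1.0.
D = np.column_stack([X, Z, np.ones(n)])
coef = np.linalg.lstsq(D, Y, rcond=None)[0]
adjusted = coef[0]
```

Causal discovery methods address the harder upstream problem: learning from data which adjustment sets, such as {Z} here, are admissible in the first place.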
|15.02.2023||Jovanka Lili Matic (HU, IRTG 1792)||Global sensitivity analysis in the presence of missing values
We investigate global variable importance in the presence of missing values. We assume that missing values are present in the random input vector. In this context, we analyse the dependence structure in the input vector under three missingness mechanisms: Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not at Random (MNAR) (Rubin and Mealli, 2015). In the case of global independence, we define the Hoeffding-Sobol/functional ANOVA decomposition and Sobol indices in the presence of missing values. We also show that no dependent input vector satisfies the requirements for the Stone-Hooker dependent-variable ANOVA. First-order Sobol indices of observed variables provide bounds for Shapley effects in the presence of missing values. Lastly, we analyse the consequences of two common missing-value management techniques: complete-case analysis (CC) and imputation. The study is concluded with a numerical simulation.
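First-order Sobol indices and the effect of a complete-case analysis under MCAR can be sketched with a standard pick-freeze estimator on an invented additive test model (this is textbook sensitivity analysis, not the talk's decomposition for missing values):

```python
import numpy as np

rng = np.random.default_rng(11)

# Additive test model Y = X1 + 2*X2 with independent N(0, 1) inputs;
# the first-order Sobol indices are analytically 1/5 and 4/5.
def model(x):
    return x[:, 0] + 2.0 * x[:, 1]

n = 200_000
A = rng.normal(size=(n, 2))
B = rng.normal(size=(n, 2))

def first_order_sobol(A, B, i):
    """Pick-freeze estimator of S_i: correlate runs sharing column i."""
    ABi = B.copy()
    ABi[:, i] = A[:, i]
    yA, yABi = model(A), model(ABi)
    cov = np.mean(yA * yABi) - yA.mean() * yABi.mean()
    return cov / np.var(yA)

S_full = [first_order_sobol(A, B, i) for i in range(2)]

# MCAR missingness in the inputs: complete-case analysis keeps the
# fully observed rows and still targets the same indices (at reduced
# efficiency); under MAR/MNAR this is no longer guaranteed.
keep = ~(rng.uniform(size=(n, 2)) < 0.2).any(axis=1)
S_cc = [first_order_sobol(A[keep], B[keep], i) for i in range(2)]
```

Here both estimates agree with the analytic values 0.2 and 0.8; the interesting regimes of the talk are exactly those (MAR, MNAR, dependent inputs) where this agreement breaks down.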
last reviewed: February 10, 2023 by Christine Schneider