Research Group "Stochastic Algorithms and Nonparametric Statistics"

Research Seminar "Mathematical Statistics" SS 2022

  • Place: The seminar will be hybrid and realized via Zoom. Please note that the so-called "3G rule" applies at the Weierstrass Institute. In line with hygiene recommendations, our lecture room ESH has a capacity of only 24 people. If you intend to participate, you must register for our mailing list with Andrea Fiebig. Prior to each talk, a Doodle poll will be sent by e-mail; those who want to attend in person must put their name in the list before the lecture. If 16 guests have already registered, please follow the streamed talk instead.
  • Time: Wednesdays, 10.00 a.m. - 12.30 p.m.
20.04.2022 Prof. Dr. Wolfgang Karl Härdle (BRC Blockchain Research Center)
代 DAI the Digital Art Index (hybrid talk)
The 代 DAI Digital Art Index has been developed to reflect the increasing activity on the digital art market. Based on the most liquid exchanges, NFT data and prices are collected in cooperation with, NYC. The NFT art market has risen sharply recently and is competing with the traditional art market. The observed transactions are analysed and an index is developed within a hedonic regression framework. We present an introduction to NFTs, explain their construction and "huberize" the hedonic regression context.
27.04.2022 Dr. Nazgul Zakiyeva (Zuse-Institut Berlin/National University of Singapore)
Modeling and forecasting the dynamics of the natural gas transmission network in Germany with the demand and supply balance constraint (hybrid talk)
We develop a novel large-scale Network Autoregressive model with balance Constraint (NAC) to predict hour-ahead gas flows in the gas transmission network, where the total in- and out-flows of the network are balanced over time. By integrating recent advances in optimization and statistical modeling, the NAC model can provide an accurate hour-ahead forecast of the gas flow at all of the distribution points in the network. By detecting the influential nodes of the dynamic network, taking into account that demand and supply have to be balanced, the forecast can be used to compute an optimized schedule and resource allocation. We demonstrate an application of our model in forecasting hour-ahead gas in- and out-flows at 128 nodes in the German high-pressure natural gas transmission network over a time frame of 22 months. Link to the paper:
04.05.2022 Dr. Carlos Amendola (MPI Leipzig and TU Berlin)
Likelihood geometry of correlation models (hybrid talk)
Correlation matrices are standardized covariance matrices. They form an affine space of symmetric matrices defined by setting the diagonal entries to one. We study the geometry of maximum likelihood estimation for this model and for linear submodels that encode additional symmetries. We also consider the problem of minimizing two closely related functions of the covariance matrix: Stein's loss and the symmetrized Stein's loss. Unlike the Gaussian log-likelihood, these two functions are convex and hence admit a unique positive definite optimum. This is joint work with Piotr Zwiernik (University of Toronto).
11.05.2022 Prof. Dr. Vladimir Spokoiny (WIAS & HU Berlin)
Laplace approximation in high dimension with applications to statistical inference (hybrid talk)
This note revisits the classical results on Laplace approximation in a modern non-asymptotic and dimension-free form. Such an extension is motivated by uncertainty quantification for high-dimensional statistical models. The established results provide explicit non-asymptotic bounds on the quality of a Gaussian approximation of the posterior distribution in total variation distance in terms of the so-called effective dimension p_G, defined as the interplay between the information contained in the data and in the prior distribution. In contrast to prominent Bernstein-von Mises results, the impact of the prior is not negligible, and it allows one to keep the effective dimension small or moderate even if the true parameter dimension is huge or infinite. We also address the important issue of using a Laplace approximation with the posterior mean in place of the maximum a posteriori (MAP) estimate.
18.05.2022 Dr. Zdeněk Hlávka (Charles University)
Testing dependencies in functional time series (hybrid talk)
We discuss tests of serial independence for a sequence of functional observations and tests of independence between two (or more) time series of functional observations. The tests are based on characteristic functions which are appropriately estimated from functional observations. The limit distribution of the new test statistic is obtained under the null hypothesis, while under alternatives it is shown that the test statistics almost surely diverge as the sample size increases. In a Monte Carlo study, we investigate appropriate resampling methods and the tests' performance in finite samples. Finally, an application illustrates the use of the method with real data from financial markets, including cumulative intraday returns of Bitcoin and Ethereum.
25.05.2022 N. N.

01.06.2022 N. N.

08.06.2022 Mari Myllymäki (Natural Resources Institute Finland)
Global envelopes with applications to spatial statistics and functional data analysis (online talk)
Global envelopes are nowadays quite often used in testing null models for spatial processes by means of different summary functions, because they provide a formal test and suggest alternative models through graphical interpretation of the test results. Global envelopes are, however, a rather general tool that can be used in various applications. Namely, they can be employed for central regions of functional or multivariate data, for graphical Monte Carlo and permutation tests where the test statistic is multivariate or functional, and for global confidence and prediction bands. In this talk, I describe the global envelopes, illustrate the methodology on different applications including the functional general linear model, and show examples of the usage of the R package GET (Myllymäki and Mrkvička, 2020) that implements global envelopes. Further, I discuss the multiple testing correction in the global envelope tests for functional test statistics, which are discretized to m highly correlated hypotheses. While the global envelopes were first developed to control the family-wise error rate, control of the false discovery rate can also be introduced. Myllymäki and Mrkvička (2020). GET: Global envelopes in R. arXiv:1911.06583 [stat.ME]
15.06.2022 Alexandra Carpentier (Universität Potsdam)
Optimal ranking for crowd-sourcing (hybrid talk)
22.06.2022 Stanislav Minsker (University of Southern California)
canceled! U-statistics of growing order and sub-Gaussian mean estimators (hybrid talk)
Since their introduction by Halmos and Hoeffding in the mid-twentieth century, U-statistics have been a source of interesting questions in probability theory and have found many applications in mathematical statistics. Most of the existing body of research is devoted to the properties of U-statistics of fixed order. In this talk, we will present examples where U-statistics of order that grows with the sample size naturally appear. We will introduce a version of Bernstein's inequality for such U-statistics that exhibits optimal dependence on the variance parameter, and discuss the main technical tools required to prove it as well as its application to the estimation of a univariate mean. In the second part of the talk, we will discuss the Bahadur-Kiefer representation of the geometric median in high dimensions and its implications for robust estimation of the multivariate mean. Finally, we will establish a connection between these topics and some open questions in the theory of U-processes.
29.06.2022 Anuj Srivastava (Florida State University)
Statistical shape analysis of complex natural structures (hybrid talk)
Shape analysis of structured data is a fast-growing field with broad applications. Advances in imaging techniques have led to a rich data source for analyzing shapes across many scientific disciplines. Examples include shapes of brain structures; morphological analysis of cancer cells, leaves, or botanical trees; shapes of geographical objects; human biometrics; shapes of the human genome; and so on. Shapes are relevant even in non-imaging data contexts, e.g., the shapes of COVID rate curves or the shapes of breathing patterns in sleep studies. What makes shape analysis both fascinating and challenging? The main difficulty stems from the fact that shape is generally an abstract notion that is hard to quantify. Imposing statistical models and inferences on shapes seems even more daunting. I will outline developments in a particular approach called elastic shape analysis. This approach defines shape as a quantity left after the effects of so-called nuisance transformations (often translations, rotations, and parameterizations) are removed. This removal requires representations and metrics that are invariant to these transformations. Elastic Riemannian metrics provide desired invariances but can be cumbersome in practical implementations. A family of square-root type transformations has helped simplify these computations and make this approach feasible. These geometrical tools help us compute average shapes of 3D (botanical) trees, develop a shape alphabet for representing chromosomes as letter sequences, perform principal component analysis of arterial brain networks, and model dynamics of Entamoeba histolytica in different liquid media. Furthermore, they help us address some fundamental scientific questions: Are the shapes of mitochondria in muscle tissues affected by the lifestyle (active versus sedentary)? Can we predict the onset of cognitive disorders using subcortical structures in the human brain? How does aging affect morphologies of arterial networks in human brains? I will present some of these applications of elastic shape analysis.
06.07.2022 Vincent Rivoirard (Université Paris Dauphine)
canceled! Nonparametric Bayesian estimation of nonlinear Hawkes process
Multivariate point processes are widely applied to model event-type data such as natural disasters, online message exchanges, financial transactions or neuronal spike trains. One very popular point process model, in which the probability of occurrence of new events depends on the past of the process, is the Hawkes process. In this work we consider the nonlinear Hawkes process, which notably models excitation and inhibition phenomena between dimensions of the process. In a nonparametric Bayesian estimation framework, we obtain concentration rates of the posterior distribution on the parameters, under mild assumptions on the prior distribution and the model. These results also lead to convergence rates of Bayesian estimators. Another object of interest in event-data modelling is to recover the graph of interaction of the phenomenon. We provide consistency guarantees on Bayesian methods for estimating this quantity; in particular, we prove that the posterior distribution is consistent on the graph adjacency matrix of the process, as is a Bayesian estimator based on an adequate loss function. Joint work with Judith Rousseau and Deborah Sulem.
13.07.2022 N. N.
HVP 11a in room 3.04
20.07.2022 N. N.

last reviewed: June 23, 2022 by Christine Schneider