## Program Details

The workshop will start on Wednesday, November 6, 2019, at 13:50 and end on Friday, November 8, 2019, at 13:00.

The conference dinner will take place at Restaurant Via Nova II, Universitätsstraße 2.

## Wednesday, November 6, 2019:

13:00 Registration & Lunch

13:50-14:00 Opening remarks by Enno Mammen (University Heidelberg)

14:00-14:50 Alexandre Tsybakov (CREST)

14:50-15:40 Sara van de Geer (ETH Zürich)

15:40-16:10 Coffee break

16:10-17:00 Joel Horowitz (Northwestern University)

Non-Asymptotic Inference in Instrumental Variables Estimation

This paper presents a simple method for carrying out inference in a wide variety of possibly nonlinear IV models under weak assumptions. The method is non-asymptotic in the sense that it provides a finite-sample bound on the difference between the true and nominal probabilities of rejecting a correct null hypothesis. The method is a non-Studentized version of the Anderson-Rubin test but is motivated and analyzed differently. In contrast to the conventional Anderson-Rubin test, the method proposed here does not require restrictive distributional assumptions, linearity of the estimated model, or simultaneous equations. Nor does it require knowledge of whether the instruments are strong or weak. It does not require testing or estimating the strength of the instruments. The method can be applied to quantile IV models that may be nonlinear and can be used to test a parametric IV model against a nonparametric alternative. The results presented here hold in finite samples, regardless of the strength of the instruments.

17:00-17:50 Xiaohong Chen (Yale University)

Hilda Geiringer Lecture: Adaptive Testing in Instrumental Variables Models

This paper is concerned with adaptive inference on a structural function in the semiparametric or nonparametric instrumental variables (NPIV) model. We propose a direct test statistic for hypothesis testing based on a leave-one-out, sieve NPIV estimator. Our test is applicable to identified and partially identified models. We analyze a class of alternative models which are separated from the null hypothesis by a *rate of testing* which is sensitive to the form of identification. This rate of testing is shown to be minimax: the type I and type II errors of our test, uniformly over the class of alternative models, cannot be improved by any other test. We also propose an adaptive test statistic that provides a data-driven choice of tuning parameters and attains the minimax optimal rate of testing within a $\log\log n$ term. This paper concludes with a finite sample analysis of the testing procedure and empirical illustrations.

## Thursday, November 7, 2019:

9:00-9:50 Wolfgang Härdle (Humboldt University Berlin)

9:50-10:40 Arkadi Nemirovski (Georgia Institute of Technology)

Near-optimal recovery of linear and N-convex functions on unions of convex sets

In the talk, based on joint research with Anatoli Iouditski, we present an efficiently computable recovery, provably near-optimal in the minimax sense, of a linear form of an unknown signal varying in a finite union of known convex compact sets, from indirect noisy observations of the signal. The allowed observation schemes are: (1) Gaussian, where one observes a linear image of the signal corrupted by white Gaussian noise, (2) Poisson, where an observation has mutually independent Poisson entries, the parameters of the respective Poisson distributions being known affine functions of the signal, and (3) Discrete, where an observation is an i.i.d. sample from a distribution, affinely parameterized by the signal, on a finite set. The results are then extended to the situation where the function to be recovered is N-convex rather than linear, N-convexity meaning that the upper and the lower Lebesgue sets of the function are unions of at most N convex compact sets.

10:40-11:10 Coffee break

11:10-12:00 Denis Belomestny (University Duisburg-Essen)

Reinforcement Learning via Reinforced Regression

In this talk we propose a novel simulation-based regression approach for approximate value iteration and approximate policy iteration in reinforcement learning. The main idea of the method is to reinforce the standard linear regression algorithms in each backward induction step by adding new basis functions based on the previously estimated Q functions. The proposed methodology is illustrated by several numerical examples.

12:00-12:50 David Donoho (Stanford University)

The Statistical Significance of Perfect Linear Separation

Suppose we have a set of $n$ i.i.d. normally distributed points in dimension $d$. A specific subset of $k$ points happens to be perfectly linearly separable from the other $n-k$. What is the exact probability of such an occurrence? With $k=1$ and $d=2$ or $3$, this was solved in Brad Efron’s 1965 Biometrika paper, which evaluated the expected number of vertices of the convex hull of a random set of $n$ points in dimensions $d=2$ and $d=3$. We will describe exact formulas for evaluating such quantities for arbitrary $n$ and $d$. These formulas involve a new distribution, which we call the underdispersed Binomial, and a novel problem in integral geometry, involving the intersection of a hyperplane with a certain highly symmetric but seemingly little-studied cone. Exact formulas for general problems of this type were derived by Affentranger and Schneider and by Vershik and Sporyshev. Such formulas are typically not easy to work with, but in our case we are able to get explicit results for all $n$ and $d$ and produce software. We are also able to complete a large deviations analysis and identify a number of interesting phase transitions. This is joint work with Hatef Monajemi (Stanford) and Jared Tanner (Oxford).

12:50-14:00 Lunch buffet

14:00-14:50 Anatoly Juditsky (University Grenoble Alpes)

Polyhedral signal recovery from indirect observations

We consider the problem of recovering a linear image of an unknown signal belonging to a given convex compact signal set from a noisy observation of another linear image of the signal. We develop a simple, generic, efficiently computable "polyhedral" estimate, nonlinear in the observations, along with computation-friendly techniques for its design and risk analysis. We demonstrate that under favorable circumstances the resulting estimate is provably near-optimal in the minimax sense, the "favorable circumstances" being less restrictive than the weakest assumptions known so far that ensure near-optimality of estimates which are linear in observations. Joint work with Arkadi Nemirovski.

14:50-15:40 Peter Bühlmann (ETH Zürich)

Causal Regularization for Better Generalization

The common notion of statistical inference deals with generalization from a data set to a new unobserved population from the same data-generating distribution. We discuss the problem when the new population comes from a different distribution than the one generating the observed data. We propose an approach which builds on distributional robustness and borrows ideas from causality. So-called anchor regression with a simple, yet effective "causal regularization" provides a novel methodology leading to predictive robustness and improved generalization and replicability. We provide some illustrations on bio-medical data. This is joint work with Dominik Rothenhäusler, Niklas Pfister, Nicolai Meinshausen and Jonas Peters.

15:40-16:10 Coffee break

16:10-17:00 Yurii Nesterov (CORE, Catholic University of Louvain)

Relative Smoothness: New Paradigm in Convex Optimization

The development and computational abilities of optimization methods crucially depend on the auxiliary tools provided to them by the method’s designers. During the first decades of Convex Optimization, the methods were based either on the proximal setup, allowing Euclidean projections onto the basic feasible sets, or on the linear minimization framework, which assumes a possibility to minimize a linear function over the feasible set. However, recently it was realized that any possibility of simple minimization of an auxiliary convex function leads to efficient minimization methods for some family of more general convex functions, which are compatible with the first one. This compatibility condition, called relative smoothness, was first exploited for smooth convex functions (Bauschke, Bolte and Teboulle, 2016) and smooth strongly convex functions (Lu, Freund and Nesterov, 2018). In this talk we make the final step and show how to extend this framework to the class of nonsmooth functions. We also discuss possible consequences and applications.

17:00-17:50 Alexander Goldenshluger (University of Haifa)

Density deconvolution under general assumptions

The subject of the talk is the problem of density deconvolution under general assumptions on the measurement error distribution. Typically deconvolution estimators are constructed using Fourier transform techniques, and it is assumed that the characteristic function of the measurement errors does not have zeros on the real line. This assumption is rather strong and is not fulfilled in many cases of interest. We develop a methodology for constructing optimal density deconvolution estimators in the general setting that covers both vanishing and non-vanishing characteristic functions of the measurement errors. We derive upper bounds on the risk of the proposed estimators and provide sufficient conditions under which zeros of the corresponding characteristic function have no effect on estimation accuracy. Moreover, we show that the derived conditions are also necessary in some specific problem instances.

From 19:00 Conference Dinner

## Friday, November 8, 2019:

9:00-9:50 Gilles Blanchard (University of Potsdam)

Sketched learning using random moments

We introduce and analyze a general framework for resource-efficient large-scale statistical learning by data sketching: a training data collection is compressed in one pass into a low-dimensional sketch (a vector of random empirical generalized moments) that should capture the information relevant to the considered estimation task. The estimation target is the minimizer of the population risk for a given loss function. An approximate minimizer of the empirical risk is computed from the sketch information only, using a constrained moment matching principle. Sufficient sketch sizes to control the statistical error of this procedure are investigated. This principle is applied to different setups: PCA, clustering, and Gaussian mixture modeling. (Joint work with R. Gribonval, N. Keriven and Y. Traonmilin.)

9:50-10:40 Ying Chen (National University of Singapore)

Regularized partially functional autoregressive model

We propose a partially functional autoregressive model (pFAR) to describe the dynamic evolution of serially correlated functional data. This model provides a unified framework to depict both the serial dependence on multiple lagged functional covariates and the associated relation with ultrahigh-dimensional exogenous scalar covariates. Estimation is conducted under a two-layer sparsity assumption, where only a small number of groups and elements are supposed to be active, yet their number and location are unknown in advance. We establish the asymptotic properties of the estimator and perform simulation studies to investigate its finite sample performance. We demonstrate the application of the pFAR model using daily natural gas flow curve data from the high-pressure pipelines of the German gas transmission network. The gas demand and supply are influenced by their historical values and 85 scalar covariates varying from price to temperature. The model provides insightful interpretation and good out-of-sample forecast accuracy compared to several popular alternative models. This is joint work with Thorsten Koch and Xiaofei Xu.

10:40-11:10 Coffee break

11:10-12:00 Oleg Lepski (Aix-Marseille University)

Structural adaptation in the density model

This paper deals with non-parametric density estimation on $\mathbb{R}^2$ from i.i.d. observations. It is assumed that after an unknown rotation of the coordinate system the coordinates of the observations are independent random variables whose densities belong to a Hölder class with unknown parameters. The minimax and adaptive minimax theories for this structural statistical model are developed.