Validation in Statistics and Machine Learning - Abstract

Dickhaus, Thorsten

How normal can the t-statistic possibly be?

(joint work with Helmut Finner)

Revisiting the fundamental articles by Hsu (1945) and Chung (1946), we are concerned with Edgeworth expansions for self-normalized sums $S_n$ of iid real-valued random variables ($X_n$ : $n$ integer) with mean zero and variance 1. In Chung's (1946) original paper, only the terms contributing to the first polynomial (appearing in the first-order Edgeworth expansion) are given correctly. After isolating and fixing a well-hidden error, we extend Chung's method to compute the expansion up to arbitrary order.
Regarding rates of convergence, the cumulative distribution function (cdf) of $S_n$ converges to the cdf of the standard normal distribution at rate $O(n^{-1/2})$ if the skewness of $X_1$ does not vanish and at rate $O(n^{-1})$ otherwise. We show that convergence rates can be improved by replacing the norming sequence in the denominator of $S_n$, thereby introducing "generalized self-normalized sums" $T_n$. It turns out that using $T_n$ instead of $S_n$ can lead to a rate of convergence of up to $O(n^{-2})$ for appropriate choices of the norming constants, which depend on the moments of $X_1$.
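The dependence of the rate on the skewness can be illustrated with a small Monte Carlo sketch. It is not part of the talk and rests on assumptions: it takes $S_n = \sum_i X_i / (\sum_i X_i^2)^{1/2}$ as the definition of the self-normalized sum (the abstract does not spell it out), and the two example distributions (centred exponential, uniform) as well as all function names are chosen here purely for illustration. The sketch estimates the sup-distance between the cdf of $S_n$ and the standard normal cdf.

```python
import math
import random


def normal_cdf(x):
    """Standard normal cdf, computed via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def self_normalized_sum(xs):
    """Assumed definition: S_n = sum(x_i) / sqrt(sum(x_i^2))."""
    return sum(xs) / math.sqrt(sum(x * x for x in xs))


def sup_distance_to_normal(sampler, n, reps, rng):
    """Monte Carlo estimate of sup_x |P(S_n <= x) - Phi(x)|."""
    values = sorted(
        self_normalized_sum([sampler(rng) for _ in range(n)])
        for _ in range(reps)
    )
    dist = 0.0
    for i, v in enumerate(values):
        phi = normal_cdf(v)
        # Compare Phi with the empirical cdf just before and at the jump.
        dist = max(dist, abs((i + 1) / reps - phi), abs(i / reps - phi))
    return dist


def skewed(rng):
    """Centred Exp(1): mean 0, variance 1, skewness 2 (illustrative choice)."""
    return rng.expovariate(1.0) - 1.0


def symmetric(rng):
    """Uniform on [-sqrt(3), sqrt(3)]: mean 0, variance 1, skewness 0."""
    return rng.uniform(-math.sqrt(3.0), math.sqrt(3.0))


rng = random.Random(0)
for n in (8, 32):
    d_skew = sup_distance_to_normal(skewed, n, 10000, rng)
    d_sym = sup_distance_to_normal(symmetric, n, 10000, rng)
    print(f"n={n:3d}  skewed: {d_skew:.4f}  symmetric: {d_sym:.4f}")
```

With a finite number of replications the estimates carry simulation noise of order `reps ** -0.5`, so the sketch can only make the qualitative gap between the skewed and the symmetric case visible; it does not verify the exact rates.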
Finally, we study Edgeworth-type expansions for $T_n$ in which we replace the standard normal distribution by Student's t-distribution with $n-1$ degrees of freedom and analyze rates of convergence in this case.
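As background on why Student's t-distribution with $n-1$ degrees of freedom is a natural reference distribution, the following sketch checks numerically the classical algebraic identity $t_n = S_n \sqrt{(n-1)/(n - S_n^2)}$ between the one-sample t-statistic and the self-normalized sum. This identity is standard but is not stated in the abstract; the sketch again assumes $S_n = \sum_i X_i / (\sum_i X_i^2)^{1/2}$, and the function names are illustrative. The classical statistic `t_statistic` below is not the generalized sum $T_n$ of the abstract.

```python
import math
import random
import statistics


def self_normalized_sum(xs):
    """Assumed definition: S_n = sum(x_i) / sqrt(sum(x_i^2))."""
    return sum(xs) / math.sqrt(sum(x * x for x in xs))


def t_statistic(xs):
    """Classical one-sample t-statistic: sqrt(n) * mean / sample std (n - 1)."""
    n = len(xs)
    return math.sqrt(n) * statistics.mean(xs) / statistics.stdev(xs)


def t_from_s(s, n):
    """Map a self-normalized sum to the t-statistic via the identity above."""
    return s * math.sqrt((n - 1) / (n - s * s))


rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(12)]
s_n = self_normalized_sum(sample)
print(t_statistic(sample), t_from_s(s_n, len(sample)))
```

Both printed values agree up to floating-point error, because the map from $S_n$ to the t-statistic is a deterministic, monotone transformation.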

References:

  1. Chung, K.-L. (1946). The approximate distribution of Student's statistics. Ann. Math. Stat. 17(5), 447-465.
  2. Finner, H., Dickhaus, T. (2010). Edgeworth expansions and rates of convergence for normalized sums: Chung's 1946 method revisited. Statistics and Probability Letters. In press.
  3. Hsu, P. L. (1945). The approximate distributions of the mean and variance of a sample of independent variables. Ann. Math. Stat. 16(5), 1-29.