Validation in Statistics and Machine Learning - Abstract

Hothorn, Torsten

Reproducible Statistical Analyses Today

Reproducibility of results reported in scientific publications is a central requirement of good scientific practice, or, as Wikipedia has it: "Reproducibility is one of the main principles of the scientific method, and refers to the ability of a test or experiment to be accurately reproduced, or replicated, by someone else working independently." (wiki page) In our own profession, reproducibility refers to the ability to reanalyse data using potentially very complex and computer-intensive statistical methods. Although quality assurance is an important aspect of reproducibility, a reproducible statistical analysis also helps to transfer state-of-the-art statistical methodology into practice and facilitates further methodological research. What are the standards, techniques, and problems in reproducible statistics today? I'll comment on these issues based on my experiences as an author of both methodological and application papers and of two books that aim to be reproducible, and as Associate Editor for Reproducibility of the "Biometrical Journal".