September 3, 2018 - September 7, 2018
Experimental or observational data of high or infinite dimensionality are becoming common in institutes of all sections of the Leibniz Association. This creates an increasing demand for adequate modern data analysis techniques. At the same time, the need for reproducibility of experiments and their statistical analyses leads to new requirements for good scientific practice and to calls for open source and open science.
Both topics were addressed in a way that provides knowledge transfer from mathematical and applied statistics into the various scientific communities and helps to develop skills in R programming, statistical modeling and reproducible data analysis. Conceptually, the Summer School was problem-oriented, and the program included both lectures and programming training sessions in the afternoons.
The plenary lectures provided the necessary background knowledge for the work in the project groups and focused on three subject areas:
- an introduction to the R statistical environment, working with R and RStudio, writing dynamic documents with RMarkdown, version control using Git and good scientific practice in general,
- methods and models for dimension reduction, both in classical statistics and when dimensionality is large compared to sample size, as well as strategies for multiple testing and variable selection,
- an introduction to modern functional data analysis models.
For the practical application part, participants had been asked to provide problems and data sets prior to the Summer School. After the five selected topics had been presented in a plenary session, participants chose their project groups according to their interests and developed analysis strategies and skills in R programming, statistical modeling and reproducible data analysis. In the final plenary session, the project groups reported on their results based on the RMarkdown documents created during the week.
We thank all 24 PhD students for their very engaged and active participation. According to their feedback, they were able to appreciably improve their skills in data processing and analysis using R and in doing reproducible research with dynamic documents and version control, and they received further training in interdisciplinary communication with researchers from other disciplines. They gained insights into modern statistical concepts and new ideas about how to proceed in their own research (see quoted statements below).
Particular thanks go to the speakers:
- Dr. Clara Happ (LMU Munich, AG Biostatistics): Functional Data Analysis
- Dr. Joerg Polzehl (WIAS): Modeling High-dimensional Data; person in charge of the scientific program
- Dr. Heidi Seibold (LMU Munich, Institute for Medical Information Processing, Biometry, and Epidemiology): R, Open Science, Reproducible Research
- Almond Stöcker (LMU Munich, AG Biostatistics): Functional Data Analysis
- Dr. Alexandra Suvorikova (University of Potsdam): Mathematical Statistics, Multiple Testing
Last, but not least, we thank the Mathematical Research Institute Oberwolfach (MFO) for providing the venue, supplying the group with rooms, equipment, coffee and assistance in any form necessary.
Opinions of Summer School participants and tutors:
“I really enjoyed my experience at the summer school. The mountains, the valley, and the forest are breathtaking to make your mind fresh. I truly enjoyed every single moment of my stay in Oberwolfach. The dynamics of our group was highly diversified. We had very friendly, talented, and young doctoral students coming from a broad range of academic backgrounds. The course facilitators and the tutors were excellent to simplify the complex statistical modeling concepts. The group works we did on machine learning methods, for example on clustering algorithms, and exercises on RMarkdown, Git repository, good scientific practice, and other advanced techniques were very helpful to understand the theoretical concepts. I found the course very helpful to my research project. Do you want to learn cutting-edge statistical modeling with a deep insight being at a very beautiful place? Then, you should definitely consider going to Oberwolfach MFO.”
“The MMS Summer School was a fantastic statistics workshop. The beautiful atmosphere at the MFO and the location made it possible to work concentrated on the workshop and at the same time find recovery in the nature.”
“The MMS Summer School was a good way to improve my R skills and my knowledge about statistics. It was nice to meet people from many different fields in a beautiful setting. I would recommend this to anyone interested in statistics. I had a great time and was very happy with the course.”
“I enjoyed the summer school very much. It was a fun, enthusiastic, diverse and bright group. During the day we had tutorials, lectures and group work. In the evenings many engaged in bowling, hikes in the beautiful nature of the Black Forest, games or conversations. Some students presented their project at the summer school and I was surprised how difficult they are in terms of the statistical methodology needed. It showed me how important it is for researchers from a variety of subjects (projects ranged from cell research to research on air pollution) to have access to statistical methods and support. I hope that we - the tutors - were able to help with solving some of the issues and empowered the students to tackle further projects. My part at the summer school was to teach R, Git, RMarkdown and good scientific practice. Also I supported one group with their project on clustering of cells.”
“The MMS Summer School was the first summer school during my PhD phase. It was a great opportunity to get started with R and Git in an early phase of the PhD. They are both very useful tools for my research. Furthermore, it was great to meet many people from different fields. The summer school was perfectly organized and took place in the beautiful location of MFO in the middle of the Black Forest.”
“The MMS Summer School provided an excellent opportunity to learn a lot of statistics and R programming in a wonderful setting with smart, friendly people. I hope there will be more opportunities like it in the future.”
“The MMS Summer School is probably one of the most useful programs I have ever attended. I can say there was a significant knowledge transfer that occurred there for me as I was lucky enough to have my data chosen to be worked on during the workshop. The strategy of working on real life data from the participants also took the learning up a notch as it tailored the learning according to our needs. Not only did I learn a wide variety of skills and statistical methods from our awesome lecturers, I was also able to connect with people and start project collaborations. I definitely enjoyed the MMS Summer School, not only because of the program itself, but also because of the entire group, lecturers and participants alike, which was so much fun to work and hang out with.”
Overview of the investigated data problems provided by the participants:
- Clustering of cells - mass cytometry data (problem and data provided by Marie Urbicht from DRFZ),
- Preprocessing and analysis of Raman spectroscopy data (provided by Jing Huang from IPHT),
- Modeling simultaneous measurements of indoor and outdoor particle concentrations (provided by Jiangyue Zhao from TROPOS),
- Sensitivity of high latitude winds to equatorial ionospheric dynamics (provided by Jerry Czarnecki from IAP),
- Air pollution data collected while walking the streets of Leipzig (provided by Honey Dawn Alas from TROPOS).