Simultaneous Bayesian analysis of contingency tables in genetic association studies
- Dickhaus, Thorsten
2010 Mathematics Subject Classification
- 62J15 62C10
Genetic association studies lead to simultaneous categorical data analysis. The sample for every genetic locus consists of a contingency table containing the numbers of observed genotype-phenotype combinations. Under a case-control design, the row counts of every table are identical and fixed, while column counts are random. The aim of the statistical analysis is to test independence of the phenotype and the genotype at every locus. We present an objective Bayesian methodology for these association tests, utilizing the Bayes factor proposed by Good (1976) and Crook and Good (1980). It relies on the conjugacy of Dirichlet and multinomial distributions, where the hyperprior for the Dirichlet parameter is log-Cauchy. Being based on the likelihood principle, the Bayesian tests avoid looping over all tables with given marginals. Hence, their computational burden does not increase with the sample size, in contrast to frequentist exact tests. Making use of data generated by The Wellcome Trust Case Control Consortium (2007), we illustrate that the ordering of the Bayes factors shows good agreement with that of frequentist p-values. Furthermore, we address the specification of prior probabilities for the validity of the null hypotheses by taking linkage disequilibrium structure into account and exploiting the concept of effective numbers of tests. Application of a Bayesian decision theoretic multiple test procedure to The Wellcome Trust Case Control Consortium (2007) data illustrates the proposed methodology. Finally, we discuss two methods for reconciling frequentist and Bayesian approaches to the multiple association test problem for contingency tables in genetic association studies.
- Stat. Appl. Genet. Mol. Biol., 14:4 (2015), pp. 347–360.
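
The conjugacy argument in the abstract can be illustrated with a small Python sketch. It is not the Bayes factor of Good (1976) or Crook and Good (1980): it assumes a symmetric Dirichlet prior with a fixed parameter k instead of integrating k against a log-Cauchy hyperprior, and the function names (log_multivariate_beta, log_bayes_factor) are hypothetical. Under these assumptions, the marginal likelihoods for a table with fixed row totals have closed forms, so no enumeration of tables with given marginals is needed.

```python
from math import lgamma


def log_multivariate_beta(alpha):
    """Log of the multivariate beta function B(alpha) = prod Gamma(a_j) / Gamma(sum a_j)."""
    return sum(lgamma(a) for a in alpha) - lgamma(sum(alpha))


def log_bayes_factor(table, k=1.0):
    """Log Bayes factor of association versus independence for one table.

    Assumptions (illustrative, not taken from the paper):
      * rows are phenotype groups with fixed totals, columns are genotypes;
      * genotype probabilities have a symmetric Dirichlet(k, ..., k) prior;
      * k is held fixed rather than given a log-Cauchy hyperprior.
    """
    n_cols = len(table[0])
    log_b_prior = log_multivariate_beta([k] * n_cols)
    col_totals = [sum(row[j] for row in table) for j in range(n_cols)]

    # Alternative: every row has its own genotype probability vector.
    log_m1 = sum(
        log_multivariate_beta([k + count for count in row]) - log_b_prior
        for row in table
    )
    # Null: all rows share one genotype probability vector.
    log_m0 = log_multivariate_beta([k + c for c in col_totals]) - log_b_prior

    # The multinomial coefficients are identical under both hypotheses and cancel.
    return log_m1 - log_m0


if __name__ == "__main__":
    # Hypothetical 2 x 3 case-control table (rows: cases, controls).
    example = [[30, 50, 20], [45, 40, 15]]
    print(log_bayes_factor(example))
```

The cost of this computation depends on the number of cells, not on the sample size, which matches the computational point made in the abstract. A positive value favours association and a negative value favours independence.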