Topics in Statistical Modelling

Course Description: 

Traditional (frequentist) approaches to statistical analysis are dominant in the social sciences, yet they are often remarkably poorly suited to the aim of learning about the world with the help of quantitative data. The fundamental ideas of frequentist statistics are so alien to human intuition that even some of the best-trained social scientists routinely make mistakes when interpreting p-values. Frequentist statistics also struggles to offer valuable insights when the number of observations is low, and it makes quantifying uncertainty across multiple levels such an arduous exercise that most researchers simply choose to ignore it.

Recent advances in Bayesian statistics address these limitations so effectively that in many areas (e.g. in industry and in STEM fields), Bayes is becoming the dominant approach to data analysis. This course seeks to offer a gentle introduction to Bayesian statistical theory and data analysis. While Bayesian statistics is famous for making incredibly complex models possible (e.g. with more parameters than observations), the course does not seek to teach highly advanced models, but to redefine how we think about and perform statistical analysis. This means we will ponder the mechanics of (Bayesian) data analysis, sampling, and inference before we fit our first linear model. The good news is that this preparatory work gives us an intuitive framework in which advancing from OLS to generalized linear models (e.g. logistic, Poisson) to multilevel models becomes (dare I say) easy.

The course should be valuable to anyone who hopes to get a better intuitive sense of basic quantitative data analysis; who struggles with the problem of small sample sizes; who seeks to broaden their analytical repertoire with generalized or multilevel regression models; or who wants to get serious about quantifying uncertainty.

The course will closely follow Richard McElreath’s phenomenal book, Statistical Rethinking. Using a blended learning approach, students are expected to read a book chapter and watch the corresponding lecture by McElreath prior to class. Class activities will thus be devoted to clarifying points of confusion, discussing implications, and practicing applications of the content. The course will use the “rethinking” software package in R, although translations to alternative packages and programming languages (e.g. PyMC3, brms, and Julia) are freely available on the internet. The course offers an introduction to applied Bayesian data analysis, so more advanced problems, such as building custom models in Stan or the computational intricacies of MCMC simulation, will not be covered. At the same time, we will see how Bayesian ideas are applied in prominent political science publications.

Prerequisites: 

The course is open to all MA and PhD students who have completed at least one MA-level course in statistics. Prior experience with R is not necessary but definitely helpful. If you plan to follow the course without prior R experience or in another language, please be aware that you are taking on an extra burden.