Graduate Program (& Advanced Certificate) Status
Data Analysis 3 covers the fundamentals of data analysis with the aim of prediction, also called predictive analytics. This course equips students with the knowledge and skills necessary to carry out and evaluate predictions in business and policy environments. We focus on a select few applications with tabular data: those that offer good performance and are widely used in industry. The course starts with the fundamentals of predictive analytics and covers topics such as variable selection with LASSO, prediction with regressions, and probability prediction and classification with binary targets. We cover one area of machine learning in depth: tree-based models (CART, random forest and gradient boosting). The course will also discuss model-independent issues such as sample design and the external validity of results.
Key outcomes. By the end of the course, students will be able to:
- Carry out reasonably good predictions and evaluate their performance;
- Evaluate the predictive performance of all kinds of models;
- Build machine learning models with some of the most widely used methods, such as random forest and boosting;
- Discuss and evaluate the results of predictive analysis;
- Present the results of predictive analytics and write short reports;
- Evaluate the merits of presentations and reports that carry out predictive analytics.
Grading will be based on the total score out of 100, in line with CEU Department of Economics and Business grading guidelines. In particular,
- The median student can expect to get a B+.
- Probably no more than 1/3 of the students can expect to get an A or A-.
- To pass, students will need to get at least 50% of the overall grade.
The final grade is based on:
- start-of-class and in-class quizzes/exercises [10%];
- three assignments [90%].
Assignments are data exercises in which students may need to gather and clean data, carry out analysis and interpret results. One of the assignments is a small group project.
Prerequisites. Data Analysis 1 and 2, or Introductory Econometrics. Good coding knowledge is expected in R, Python, or Stata plus some Python.