Machine Learning II
SURV 753

Summer 2020

Prerequisites: SURV 751 and familiarity with R.


Social scientists and survey researchers are confronted with an increasing number of new data sources, such as apps and sensors, that often result in (para)data structures that are difficult to handle with traditional modeling methods. At the same time, advances in machine learning (ML) have produced an array of flexible methods and tools that can be applied to a wide variety of modeling problems. Against this background, this course discusses advanced ML concepts such as cross-validation, class imbalance, boosting, and stacking, as well as key approaches for facilitating model tuning and performing feature selection. The course also introduces additional machine learning methods, including support vector machines, Extra-Trees, and LASSO, among others. It aims to illustrate these concepts, methods, and approaches from a social science perspective. Furthermore, the course covers techniques for extracting patterns from unstructured data as well as for interpreting and presenting results from machine learning algorithms. Code examples will be provided in the statistical programming language R.