Big Data for Federal Agencies: Lab




The amount of digital data generated as a by-product in society is growing fast, for example data from satellites, sensors, transactions, administrative processes, social media, and smartphones. Data of this type is characterized by high volume, high velocity, and high variety, and is often called big data. The hope is to gain insights from this data in areas such as health and crime prevention, infrastructure planning, and business decisions. Big data is of interest to agencies that produce statistics as an alternative data source that may reduce cost, improve estimates, or produce estimates in a more timely fashion. On the economic statistics side in particular, this interest is growing rapidly. The changes in the nature of the new types of data, their availability, and the way in which they are collected and disseminated are fundamental. They constitute a paradigm shift for agencies that in the past relied primarily on survey research. However, the data quality frameworks that are well established in statistics production still hold. Paired with a specially designed lecture (SURV699Y), this lab session will allow students to apply the techniques they have learned through a worked example relevant to the core work of the Federal Statistical Agencies. In addition, students will work in group projects on topics relevant to their individual agencies. These projects are prepared specifically for this course together with agency partners. The two core instructors will bridge computer science and the economic and social science applications.