Joshua Blumenstock

INFO251 Spring 2021
(all readings and lecture recordings are available on bCourses)

January 19: Introduction

Introductions
Nuts and bolts of the class: structure, homework, policies, learning objectives

Required Readings (students are already expected to have this level of familiarity with Python):

Chapters 3-5 and Chapter 9 of McKinney (2013): Python for Data Analysis. O’Reilly Media, Inc.
Install Python, IPython, and the numerical analysis libraries on your laptop and bring it to class. I highly recommend the Anaconda distribution; if you prefer to assemble the packages yourself, make sure you have Python, the IPython notebook, NumPy, SciPy, and matplotlib.
Read and complete at least the "Introduction" to the following Python tutorial: http://interactivepython.org/courselib/static/pythonds/index.html
Watch 10-minute tour of pandas: https://www.youtube.com/watch?v=dcqPhpY7tWk
Strongly recommended: Read and complete lessons 1-7 of Learn Pandas (https://bitbucket.org/hrojas/learn-pandas)

January 20 (Lab): NO LAB TODAY

There is no lab section on Jan 20, please do not show up!

January 21: Experimental Methods for Causal Inference

A-B testing, Business Experiments, Randomized Control Trials
Counterfactuals and Control Groups
Correlation and Causation
Experimental design and statistical power

Required Readings:

Chapters 2-3 of Khandker et al. (2010), “Handbook on Impact Evaluation”
Introduction (pp. 263-269) to: Bertrand et al. (2010) “What's Advertising Content Worth? Evidence from a Consumer Credit Marketing Field Experiment”, Quarterly Journal of Economics, 125(1)

Optional Readings:

Pages 1-47 of: E. Duflo, M. Kremer and R. Glennerster (2006). "Using Randomization in Development Economics Research: A Toolkit"
Athey & Imbens (2016) The Econometrics of Randomized Experiments
Lin, M., Lucas, H.C., Shmueli, G., 2013. Research Commentary—Too Big to Fail: Large Samples and the p-Value Problem. Information Systems Research 24, 906–917. doi:10.1287/isre.2013.0480
Anderson & Simester (2011). “A Step-By-Step Guide to Smart Business Experiments”, Harvard Business Review, pp. 99-105
Ariely (2004). “Why Businesses Don’t Experiment”, Harvard Business Review, p. 34
Kohavi, R., Longbotham, R., Sommerfield, D. and Henne, R. Controlled experiments on the Web: Survey and practical guide. Data Mining and Knowledge Discovery 18 (2009), 140–181.
Reiley, D., Rao, J.M. & Lewis, R.A. (2011) Here, There, and Everywhere: Correlated Online Behaviors Can Lead to Overestimates of the Effects of Advertising. WWW 2011.

January 26: Impact Evaluation

Research designs for impact evaluation
Identifying assumptions
Difference-in-Differences
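
As a preview of the difference-in-differences idea listed above, here is a minimal sketch. The group means and names are made up for illustration and are not course data. The estimate is the change in the treated group minus the change in the control group, which is only credible if the two groups would have followed parallel trends without the program.

```python
# Hypothetical mean outcomes for each group, before and after a program starts.
means = {
    ("treated", "before"): 10.0,
    ("treated", "after"): 16.0,
    ("control", "before"): 9.0,
    ("control", "after"): 12.0,
}


def diff_in_diff(m):
    """Change in the treated group minus change in the control group."""
    treated_change = m[("treated", "after")] - m[("treated", "before")]
    control_change = m[("control", "after")] - m[("control", "before")]
    return treated_change - control_change


did_estimate = diff_in_diff(means)
print(did_estimate)  # (16 - 10) - (12 - 9) = 3.0
```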

Required Readings:

Sections 1-3 of Schultz: School subsidies for the poor
Varian, H.R., 2016. Causal inference in economics and marketing. PNAS 113, 7310–7315. doi:10.1073/pnas.1510479113

Optional Readings:

David Albouy: Lecture notes on Differences in Differences Estimation
Lewis, R., Rao, J.M. & Reiley, D.H. (2015) Measuring the effects of advertising: The digital frontier. In: Economic Analysis of the Digital Economy. University of Chicago Press. pp. 191–218.
Jensen, R., 2007. The Digital Provide: Information (Technology), Market Performance, and Welfare in the South Indian Fisheries Sector. The Quarterly Journal of Economics 122, 879–924.

January 27 (Lab): Python and Pandas

Programming paradigms
Working with data
Crash course in python

January 28: Regression and Impact Evaluation

Regression and causal inference
Interactions and heterogeneity
Fixed and random effects

Required Readings:

Chapter 5 of Khandker et al. (2010), “Handbook on Impact Evaluation”
Lecture notes on “Fixed Effects Models”

Optional Readings:

A more systematic treatment: Gerber, A.S., Green, D.P., 2012. Field Experiments: Design, Analysis, and Interpretation. W. W. Norton & Company, New York.

February 2: Non-Experimental Methods for Causal Inference

Instrumental Variables

Required Readings:

Chapter 6 of Khandker et al. (2010), “Handbook on Impact Evaluation”

Optional Readings:

Chapter 10 of Stock & Watson (2010) on “Instrumental Variables”
Angrist, J.; Krueger, A. (2001). "Instrumental Variables and the Search for Identification: From Supply and Demand to Natural Experiments". Journal of Economic Perspectives 15(4): 69–85.
Duflo (2001). Schooling and Labor Market Consequences of School Construction in Indonesia: Evidence from an Unusual Policy Experiment
A more systematic treatment: Kennedy, P., 2008. A Guide to Econometrics, 6th ed. Wiley-Blackwell, Malden, MA.
Alexandre Belloni, Victor Chernozhukov, and Christian Hansen (2011): “LASSO Methods for Gaussian Instrumental Variables Models,” arXiv:1012.1297 [stat.ME], http://arxiv.org/abs/1012.1297
Jason Hartford, Greg Lewis, Kevin Leyton-Brown, Matt Taddy (2017): “Deep IV: A Flexible Approach for Counterfactual Prediction.” Proceedings of the 34th International Conference on Machine Learning, PMLR 70, 1414–1423

February 3 (Lab): Regression and Hypothesis Testing

T-tests and regressions with Python
Dummy variables, interactions, fixed effects
Fixed effects
Interaction terms
Instrumental variables

February 4: Non-Experimental Methods for Causal Inference, continued

Regression discontinuity

Required Readings:

Chapter 7 of Khandker et al. (2010), “Handbook on Impact Evaluation”

Optional Readings:

Read a simplified example RD analysis in Python
Buddelmeyer & Skoufias (2004). An Evaluation of the Performance of Regression Discontinuity Design on PROGRESA.
Solis, A., 2017. Credit Access and College Enrollment. Journal of Political Economy 125, 562–622. doi:10.1086/690829

February 9: Intro to Machine Learning

Supervised and unsupervised learning
Representation
Evaluation
Optimization
Generalization and overfitting
Training and test data
Cross-validation and bootstrapping
Evaluation and baselines
Features and feature selection

Required Readings:

Chapters 1 & 2 of Daume (in preparation). A course in machine learning
Chapter 5 of Witten, Frank, Hall: Data Mining

Optional Readings:

Mullainathan, S., Spiess, J., 2017. Machine Learning: An Applied Econometric Approach. Journal of Economic Perspectives 31, 87–106. https://doi.org/10.1257/jep.31.2.87
Syed, A. (2011). A review of cross validation and adaptive model selection.

February 10 (Lab): Computational Efficiency

Vectorized computation

February 11: Nearest Neighbors

Instance-based learning
Nearest neighbors
Curse of dimensionality
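
To make the nearest-neighbors idea concrete, here is a minimal standard-library sketch of k-nearest-neighbor classification. The toy points, labels, and the function name `knn_predict` are hypothetical choices for illustration. It uses Euclidean distance and a simple majority vote, which is one common setup rather than the only one.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest points."""
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k training examples closest to the query.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Two well-separated toy clusters (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction = knn_predict(points, labels, query=(0.5, 0.5), k=3)
print(prediction)
```

Note that nothing is "trained" here: all the work happens at prediction time, which is what instance-based learning refers to.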

Required Readings:

Chapter 3 of Daume (in preparation). A course in machine learning

Optional Readings:

Chapter 13 (sections 13.1 - 13.3) of Hastie, Tibshirani, Friedman, The Elements of Statistical Learning
Chapter 6 of Provost & Fawcett: Data Science for Business

February 16: Gradient Descent

Cost functions
Gradient descent
Convexity
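
A minimal sketch of gradient descent on a one-dimensional convex cost, assuming the toy cost J(w) = (w - 3)^2 and a fixed learning rate (both are illustrative choices, not taken from the course materials). Because the cost is convex, repeatedly stepping against the gradient moves w toward the single global minimum at w = 3.

```python
def cost(w):
    """Toy convex cost function with its minimum at w = 3."""
    return (w - 3.0) ** 2


def gradient(w):
    """Derivative of the cost with respect to w."""
    return 2.0 * (w - 3.0)


def gradient_descent(w0, learning_rate=0.1, n_steps=100):
    """Take n_steps steps of fixed size against the gradient, starting at w0."""
    w = w0
    for _ in range(n_steps):
        w -= learning_rate * gradient(w)
    return w


w_hat = gradient_descent(w0=0.0)
print(w_hat, cost(w_hat))
```

With this learning rate each step shrinks the distance to the minimum by a constant factor; a rate that is too large would overshoot and diverge instead.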

Required Readings:

Chapter 7 of Daume (in preparation). A course in machine learning

Optional Readings:

Chapter 5 of Schutt & O’Neill (2013): Doing Data Science

February 17 (Lab): ML Experiments in Python

Random numbers, training and test data
Built-in methods for cross validation
Comparing different measures of performance
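
The lab covers built-in cross-validation helpers; which library it uses is not stated here, so the sketch below sticks to the Python standard library and only illustrates the underlying idea. The function name `k_fold_indices` and its parameters are hypothetical. A fixed random seed makes the shuffle reproducible, and each sample lands in exactly one test fold.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded, so splits are reproducible
    folds = [indices[i::k] for i in range(k)]  # k roughly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(n_samples=10, k=5, seed=0))
for train, test in splits:
    print(sorted(test), "held out;", len(train), "used for training")
```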

February 18: Regularization and Linear Models, part 1

Regularization
Ridge and Lasso
Logistic regression
Support vector machines
Kernel methods

Required Readings:

Chapter 7 of Daume (in preparation). A course in machine learning

Optional Readings:

Chapter 6 (section 6.2) of James et al. (2016): Introduction to Statistical Learning
This post on interpreting logistic regression results
Chapter 3 (sections 3.3 and 3.4) of Hastie, Tibshirani, Friedman, The Elements of Statistical Learning

February 23: Regularization and Linear Models, part 2

Regularization
Ridge and Lasso
Logistic regression
Support vector machines
Kernel methods

Same readings as above

February 24 (Lab): Linear models and Regularization

Lasso vs. Ridge
Cross-validation to find optimal regularization parameter
Computational efficiency revisited

February 25: Naive Bayes

Probability review: Bayes rule, independence, distributions
Generative models and Naive Bayes
Maximum likelihood estimation and smoothing

Required Readings:

Chapter 4 of Schutt & O’Neill (2013): Doing Data Science
Reread section 4.2 of Witten, Frank, Hall: Data Mining
Michael Collins's lecture notes on Naïve Bayes (especially pp. 1-4)

Optional Readings:

Paul Graham (2002) on “Better Bayesian Filtering”.
Kevin Murphy's example of Bayes' Rule for medical diagnosis

March 2: Mid-Semester Quiz

Quiz #1

March 3 (Lab): Gradient descent (continued)

Gradient descent
Naive Bayes

March 4: Decision Trees

Building decision trees
Information gain
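
A minimal sketch of information gain, the quantity a decision tree can use to choose a split: the entropy of the parent node minus the size-weighted entropy of the child nodes. The toy labels and function names are hypothetical, and entropy is measured in bits (log base 2).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(parent, children):
    """Parent entropy minus the size-weighted average entropy of the children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]  # children are pure
useless_split = [["yes", "no"], ["yes", "no"]]  # children look like the parent

print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
```

The pure split recovers the full one bit of parent entropy, while the uninformative split gains nothing.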

Required Readings:

Chapter 8 of James et al. (2016): Introduction to Statistical Learning
Chapter 13 of Daume (in preparation). A course in machine learning

Optional Readings:

Chapter 9 (section 9.2) and Chapter 15 of Hastie, Tibshirani, Friedman, The Elements of Statistical Learning (10th edition)

March 9: Random Forests

Regression Trees
Random Forests
Boosting
Feature Importance

Optional Readings:

Feature importance measures for random forest: blog post
A Kaggle master explains gradient boosting

March 10 (Lab): Neural networks

Intro to TensorFlow

March 11: Neural Networks, part 1

Biological underpinnings
The perceptron
Rosenblatt's algorithm
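
A minimal sketch of the perceptron learning rule, assuming labels coded as -1/+1 and a toy dataset (logical AND) chosen here for illustration. Whenever an example is misclassified, the weights are nudged toward it (or away from it, for a negative label). On linearly separable data such as this, the updates eventually stop.

```python
def train_perceptron(examples, n_epochs=20):
    """examples: list of (features, label) pairs with label in {-1, +1}."""
    n_features = len(examples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(n_epochs):
        for features, label in examples:
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            if label * activation <= 0:  # misclassified, or exactly on the boundary
                weights = [w + label * x for w, x in zip(weights, features)]
                bias += label
    return weights, bias


def predict(weights, bias, features):
    """Return +1 if the example falls on the positive side of the boundary."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else -1


# Logical AND, which is linearly separable.
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
weights, bias = train_perceptron(data)
print(weights, bias)
print([predict(weights, bias, x) for x, _ in data])
```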

Required Readings:

Chapters 4 and 10 of Daume (in preparation). A course in machine learning

Optional Readings:

Chapter 11 (sections 11.3-11.4) of Hastie, Tibshirani, Friedman, The Elements of Statistical Learning

March 16: Neural Networks, part 2

Multilayer networks
Backpropagation

Required Readings:

Chapters 4 and 10 of Daume (in preparation). A course in machine learning

Optional Readings:

Chapter 11 (sections 11.3-11.4) of Hastie, Tibshirani, Friedman, The Elements of Statistical Learning
We will review these videos by Grant Sanderson on backpropagation in class:
- https://www.youtube.com/watch?v=Ilg3gGewQ5U
- https://www.youtube.com/watch?v=tIeHLnjs5U8

March 17 (Lab): Deep Learning

Naive Bayes

March 18: Deep Learning, part 1

What is "deep" about deep learning?
Auto-encoders
Convolutional Neural Networks
RNNs / LSTM Networks

Required Readings:

Andrew Ng's lecture notes on sparse autoencoders
UFLDL's Deep Learning tutorial

--- SPRING BREAK ---

March 30: Bias in ML

High-profile ML failures
Sources of bias
Notions of fairness

Required Readings:

Obermeyer, Powers, Vogeli and Mullainathan. 2019. Dissecting racial bias in an algorithm used to manage the health of populations. https://science.sciencemag.org/content/366/6464/447

Optional Readings:

Solon Barocas, Moritz Hardt, Arvind Narayanan. 2020. Fairness and Machine Learning: Limitations and Opportunities. https://fairmlbook.org (Chapters 1, 2, and 5)

March 31: Fair ML lab

Download lab 1 here: https://colab.research.google.com/drive/1yYHoLqbM5in4T801mQ083XGpRqHwtFhm
Download lab 2 here: https://colab.research.google.com/drive/1-kMXYl7LPX1qTnBdBi-ueYx15umPzM24

April 1: Fair ML

Formalization
Identifying bias
Fairness constraints
Technical "solutions"

Required Readings:

Reading: Mulligan, Kroll, Kohli & Wong. 2019. This Thing Called Fairness: Disciplinary Confusion Realizing a Value in Technology. https://dl.acm.org/doi/10.1145/3359221

April 6: Common practical issues

Bias-variance tradeoff
Feature engineering
Imbalanced data

Required Readings:

Chapters 5 & 6 of Daume (in preparation). A course in machine learning

Optional Readings:

A plain-English tutorial on the bias-variance tradeoff
Chapters 1-3 of Mastering Feature Engineering (early release)
Chapter 2 of James et al. (2017). An Introduction to Statistical Learning
Andrew Gelman on Missing Data Imputation
He, H., Garcia, E.A., 2009. Learning from Imbalanced Data. IEEE Transactions on Knowledge and Data Engineering 21, 1263–1284. doi:10.1109/TKDE.2008.239
Lakkaraju, H., Kleinberg, J., Leskovec, J., Ludwig, J., Mullainathan, S., 2017. The Selective Labels Problem: Evaluating Algorithmic Predictions in the Presence of Unobservables. KDD 2017, 275–284. https://doi.org/10.1145/3097983.3098066

April 7 (Lab): Supervised learning practicalities

April 8: Common practical issues (Part 2)

Imbalanced data
Missing data
Multi-class classification
Model and feature selection

Required Readings:

He, H., Garcia, E.A., 2009. Learning from Imbalanced Data. IEEE Transactions on Knowledge and Data Engineering 21, 1263–1284. doi:10.1109/TKDE.2008.239

Optional Readings:

Python tutorial on Cost-Sensitive Decision Trees for Imbalanced Classification
Python tutorial on softmax classification
Multinomial response models, from Rodríguez, G. (2007). Lecture Notes on Generalized Linear Models.

April 13: Supervised Learning Wrap-Up

Modelling Trade-Offs
Comparing classifiers
Guiding principles

Required Readings:

Chapter 13 of Daume (in preparation). A course in machine learning
Domingos, “A Few Useful Things to Know about Machine Learning.” Communications of the ACM, 55 (10), 78-87, 2012.

Optional Readings:

Wu, X., Kumar, V., Quinlan, J.R., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G.J., Ng, A., Liu, B., Yu, P.S., Zhou, Z.-H., Steinbach, M., Hand, D.J., Steinberg, D., 2008. “Top 10 Algorithms in Data Mining”. Knowledge and Information Systems 14, 1–37. doi:10.1007/s10115-007-0114-2

April 14 (Lab): TBD

April 15: Unsupervised learning

Cluster analysis
Dimensionality Reduction
Principal Component Analysis
Case study: Eigenfaces
Other methods for dimensionality reduction: SVD, NNMF, LDA

Required Readings:

Chapter 7 of Leskovec, Rajaraman, and Ullman (2014): Mining of Massive Datasets

Optional Readings:

Watch Pedro Domingos talk about the curse of dimensionality (segment 4 of week 4).
Chapter 11 (sections 11.1 – 11.3) Leskovec, Rajaraman, and Ullman (2014): Mining of Massive Datasets.
Chapter 15 of Daume (in preparation). A course in machine learning
Justin Grimmer and Gary King. 2011. “General Purpose Computer-Assisted Clustering and Conceptualization.” Proceedings of the National Academy of Sciences. Copy at http://j.mp/2qzYYj2
Chapter 6 of Provost & Fawcett: Data Science for Business
Chapter 14 (sections 14.2, 14.5 - 14.10) of Hastie, Tibshirani, Friedman, The Elements of Statistical Learning (10th edition)
Turk & Pentland (1991) “Eigenfaces for Recognition”

April 20: Recommender Systems

The Netflix challenge
Content-based methods
Learning features and parameters
Nearest-neighbor collaborative filtering

April 21 (Lab): Unsupervised learning

k-Means clustering
Dimensionality reduction: PCA

April 22: Machine learning and causal inference

ML for measurement
Inference after selection
Selecting among many controls
Selecting among many instruments
Machine learning heterogeneous treatment effects

Required Readings:

Section 4 of: Athey, S., 2018. The impact of machine learning on economics, in: The Economics of Artificial Intelligence: An Agenda. University of Chicago Press, pp. 507–547.
Athey, S., Imbens, G., 2019. Machine Learning Methods Economists Should Know About. arXiv:1903.10075.

Optional Readings:

Athey, S., Imbens, G., 2016. Recursive partitioning for heterogeneous causal effects. PNAS 113, 7353–7360. https://doi.org/10.1073/pnas.1510489113
Athey, S., M. Bayati, N. Doudchenko, G. Imbens, and K. Khosravi (2017) "Matrix Completion Methods for Causal Panel Data Models." http://arXiv.org/abs/1710.10251
Belloni, A., Chernozhukov, V., Hansen, C., 2014. High-Dimensional Methods and Inference on Structural and Treatment Effects. Journal of Economic Perspectives 28, 29–50. https://doi.org/10.1257/jep.28.2.29
Belloni, A., Chernozhukov, V., Hansen, C., 2011. “LASSO Methods for Gaussian Instrumental Variables Models,” arXiv:1012.1297 [stat.ME], http://arxiv.org/abs/1012.1297
Chernozhukov, V., Hansen, C., Spindler, M., 2015. Valid Post-Selection and Post-Regularization Inference: An Elementary, General Approach. Annual Review of Economics 7, 649–688. https://doi.org/10.1146/annurev-economics-012315-015826
Künzel, S.R., Sekhon, J.S., Bickel, P.J., Yu, B., 2019. Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences 116, 4156–4165.
Sands and Gilchrist (Medium Post): Best of Both Worlds: An Applied Intro to ML For Causal Inference
Taylor, J., Tibshirani, R.J., 2015. Statistical learning and selective inference. Proceedings of the National Academy of Sciences 112, 7629–7634.

Wager, S., Du, W., Taylor, J., Tibshirani, R.J., 2016. High-dimensional regression adjustments in randomized experiments. PNAS 113, 12673–12678.

April 27: Applied ML - start to finish

Data => Features
Training and cross-validation
Evaluating performance
Extensions

Optional Readings:

Blumenstock et al (2015): Predicting Poverty with Mobile Phone Metadata
Aiken et al. (2020): Targeting Development Aid with Machine Learning and Mobile Phone Data: Evidence from an Anti-Poverty Intervention in Afghanistan

April 28 (Lab): No Lab

April 29: Summary

Recap / summary

Quiz #2
