Schedule
This course consists of one weekly lecture from 10:10 to 12:40 on Fridays.
Topics will cover material from several books, all of which are available online:
 R for Data Science (R4DS)
 Bit by Bit: Social Research in the Digital Age (BBB)
 Introduction to Statistical Learning (ISL)
 Advanced Data Analysis from an Elementary Point of View (ADA)
 Networks, Crowds, and Markets (NCM)
 Introduction to Statistical Thinking (IST)
Here is a tentative schedule of topics:
Date  Topics & tools  Readings  Materials
2019-01-25  Introduction / Overview  BBB Ch 1  Slides
2019-02-01  Introduction to Counting (bash, awk, grep, etc.)  BBB Ch 2; R4DS Ch 1, 4  Slides, Notes, Code
2019-02-08  Computational Complexity (tidyverse)  R4DS Ch 5, 12, 13
2019-02-15  Data Visualization (ggplot2)  R4DS Ch 3, 7 & 28
2019-02-22  Reproducibility and Replication (Rmarkdown)  R4DS Ch 27; Ioannidis (2005); Hand (2006); Simmons (2011)
2019-03-01  Regression I: Theory and Practice (lm, modelr)  R4DS Ch 23 & 24; ISL Ch 3; ADA Ch 1 & 2
2019-03-08  Regression II: Theory and Practice (lm, modelr, cont’d)  ISL Ch 2 & 5; ADA Ch 3
2019-03-15  Classification I: Naive Bayes  ISL Ch 4.1-4.2; Lewis (1998)
2019-03-22  Spring Break
2019-03-29  Classification II: Logistic Regression (glmnet)  ISL Ch 4.3; ADA Ch 12
2019-04-05  Networks I: Representations, characteristics (igraph, tidygraph)  NCM Ch 2 & 3
2019-04-12  Networks II: Counting on graphs (igraph, tidygraph, cont’d)  NCM Ch 18 & 20
2019-04-19  Causality and Experiments: I  BBB Ch 4; Varian (2016); ADA Ch 21
2019-04-26  Causality and Experiments: II  BBB Ch 6; IST Ch 12 & 13; Dunning (2009)
2019-05-03  Student Presentations
Grading
Your grade will be determined by:
 Four homeworks (60%)
 Final group project (30%)
 Scribed notes (5%)
 Class participation (5%)
Homework is to be submitted electronically and should include all code necessary to solve each problem along with a brief report of your results. All code should be contained in plain text files and should produce the exact results you provide in your writeup. Code should be written in bash / R and should not have complex dependencies on nonstandard libraries. Late submissions will be penalized 10 percentage points on the first day and 5 percentage points for each day thereafter.
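As an illustration of the late policy above, here is a minimal bash sketch of the arithmetic. The function name late_penalty is hypothetical, and the sketch assumes lateness is counted in whole days and that the penalty is not capped; neither detail is stated in the policy itself.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the course materials).
# Usage: late_penalty DAYS_LATE
# Prints the penalty in percentage points:
#   10 points for the first late day, plus 5 for each day after that.
# Assumptions: DAYS_LATE is a whole number of days; no maximum penalty.
late_penalty() {
  local days=$1
  if (( days <= 0 )); then
    echo 0
  else
    echo $(( 10 + 5 * (days - 1) ))
  fi
}

late_penalty 0   # prints 0  (on time)
late_penalty 1   # prints 10 (one day late)
late_penalty 3   # prints 20 (10 + 5 + 5)
```

Under these assumptions, a homework submitted three days late would lose 20 percentage points.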
The final project will be done in groups, and will involve replicating and extending a published research paper. Each group will present its results to the class at the end of the semester.
Each student will also be responsible for scribing notes for one lecture during the semester, which will be posted to a shared, public repository. Students are expected to attend and participate in all lectures.
Academic rules of conduct
Students are expected to adhere to the APAM Academic Honor Code. You are welcome to discuss course content with other students, but homework should be done individually unless noted otherwise. Students must write and submit their own, original code. Sharing code for individual assignments is prohibited, as is the use of any existing solutions found online or through other means. Violation of these rules will result in a penalty that may include zero credit for the assignment in question or a failing grade for the course.
Office hours
Office hours will be after class on Fridays or by appointment.