This course consists of one weekly lecture from 10:10 to 12:40 on Fridays in 310 Fayerweather Hall.

Topics will cover material from several books, including:

Here is a tentative schedule of topics:

| Date | Topics & tools | Readings | Materials |
| --- | --- | --- | --- |
| 2017-01-20 | Introduction / Overview | | Slides |
| 2017-01-27 | Introduction to Counting; Command line (bash, awk, grep, etc.) | | Slides, Notes, Code |
| 2017-02-03 | Computational Complexity; dplyr | R4DS Ch 1, 2, 5 | Slides, Notes, Code |
| 2017-02-10 | Counting at Scale: MapReduce; joins, tidyr, Hadoop | R4DS Ch 12 & 13; Dean & Ghemawat (2008) | Slides, Notes, Code |
| 2017-02-17 | Data Visualization; ggplot2 | R4DS Ch 3, 7 & 28 | Slides, Notes, Code |
| 2017-02-24 | Regression I: Theory and Practice; lm, modelr | R4DS Ch 23 & 24; ISL Ch 3; ADA Ch 1 & 2 | Slides, Notes, Code |
| 2017-03-03 | Regression II: Theory and Practice; lm (cont’d) | ISL Ch 2 & 5; ADA Ch 3 | Slides, Notes, Code |
| 2017-03-10 | Classification I: Naive Bayes | ISL Ch 4.1-4.2; Lewis (1998) | Slides, Notes, Code |
| 2017-03-17 | Spring Break | | |
| 2017-03-24 | Classification II: Logistic Regression | ISL Ch 4.3; ADA Ch 12 | Notes, Code |
| 2017-03-31 | Networks I: Representations, characteristics | NCM Ch 2 & 3 | Slides, Notes, Code |
| 2017-04-07 | Networks II: Counting on graphs | NCM Ch 18 & 20 | Slides, Notes, Code |
| 2017-04-14 | Causality and Experiments: I | | |
| 2017-04-21 | Causality and Experiments: II | | |
| 2017-04-28 | Student Presentations | | |

Your grade will be determined by: