Schedule
This course consists of one weekly lecture from 10:10 to 12:40 on Fridays.
Topics will cover material from several books, all of which are available online:
- R for Data Science (R4DS)
- Bit by Bit: Social Research in the Digital Age (BBB)
- Introduction to Statistical Learning (ISL)
- Advanced Data Analysis from an Elementary Point of View (ADA)
- Networks, Crowds, and Markets (NCM)
- Introduction to Statistical Thinking (IST)
Here is a tentative schedule of topics:
Date | Topics & tools | Readings | Materials |
---|---|---|---|
2019-01-25 | Introduction / Overview | BBB Ch 1 | Slides |
2019-02-01 | Introduction to Counting; bash, awk, grep, etc. | BBB Ch 2; R4DS Ch 1, 4 | Slides, Notes, Code |
2019-02-07 | Homework 1 | | |
2019-02-08 | Computational Complexity; tidyverse | R4DS Ch 5, 12, 13 | Notes, Code |
2019-02-15 | Data Visualization; ggplot2 | R4DS Ch 3, 7 & 28 | Slides 1 & 2, Notes, Code 1 & 2 |
2019-02-22 | Reproducibility and Replication I; Randomization inference | Gigerenzer 2018; Greenland 2016 | Slides, Notes, Code |
2019-02-28 | Homework 2 | | |
2019-03-01 | Reproducibility and Replication II; Rmarkdown, Makefiles | R4DS Ch 27; Ioannidis (2005); Hand (2006); Simmons (2011) | Slides, Notes, Code |
2019-03-08 | Regression I: Theory and Practice; lm | R4DS Ch 23 & 24; ISL Ch 3; ADA Ch 1 & 2 | Slides, Notes, Code |
2019-03-15 | Regression II: Theory and Practice; lm (cont’d), modelr | ISL Ch 2 & 5; ADA Ch 3 | Slides, Notes, Code |
2019-03-22 | Spring Break | | |
2019-03-28 | Homework 3 | | |
2019-03-29 | Classification: Naive Bayes, Logistic Regression; glmnet | ISL Ch 4.1 - 4.3; Lewis (1998); ADA Ch 12 | Slides, Notes, Code |
2019-04-05 | Networks I: Representations, characteristics; igraph | NCM Ch 2 & 3 | Slides, Notes, Code |
2019-04-10 | Homework 4 | | |
2019-04-12 | Networks II: Counting on graphs; igraph | NCM Ch 18 & 20 | Notes, Code |
2019-04-19 | Class canceled | BBB Ch 4; Varian (2016); ADA Ch 21 | |
2019-04-26 | Causality and Experiments | BBB Ch 6; IST Ch 12 & 13; Dunning (2009) | Slides, Notes, Code |
2019-05-03 | Student Presentations | | |
Grading
Your grade will be determined by:
- Four homeworks (60%)
- Final group project (30%)
- Scribed notes (5%)
- Class participation (5%)
Homework is to be submitted electronically and should include all code necessary to solve each problem along with a brief report of your results. All code should be contained in plain text files and should produce the exact results you provide in your writeup. Code should be written in bash / R and should not have complex dependencies on non-standard libraries. Late submissions will be penalized 10 percentage points on the first day and 5 percentage points for each day thereafter.
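As a rough illustration of the late policy above, the following bash sketch computes the penalty for a given number of days late and applies it to a homework score. The function names `late_penalty` and `adjusted_score` are hypothetical. The sketch also assumes that lateness is counted in whole days and that a score cannot drop below zero, neither of which the policy states explicitly.

```bash
#!/usr/bin/env bash

# late_penalty DAYS_LATE
# Prints the penalty in percentage points: 10 for the first late day
# plus 5 for each day after that. On-time submissions get 0.
late_penalty() {
  local days_late=$1
  if (( days_late <= 0 )); then
    echo 0
  else
    echo $(( 10 + 5 * (days_late - 1) ))
  fi
}

# adjusted_score RAW_SCORE DAYS_LATE
# Subtracts the late penalty from a raw score on a 0-100 scale.
# Flooring the result at 0 is an assumption, not part of the stated policy.
adjusted_score() {
  local raw=$1
  local days_late=$2
  local score=$(( raw - $(late_penalty "$days_late") ))
  if (( score < 0 )); then
    score=0
  fi
  echo "$score"
}

# A homework scored 90 and submitted 3 days late: 90 - (10 + 5 + 5) = 70
adjusted_score 90 3
```

Under these assumptions, a submission that is one day late loses 10 points, two days late loses 15, three days late loses 20, and so on.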
The final project will be done in groups, and will involve replicating and extending a published research paper. Each group will present its results to the class at the end of the semester.
Each student will also be responsible for scribing notes for one lecture during the semester, which will be posted to a shared, public repository. Students are expected to attend and participate in all lectures.
Academic rules of conduct
Students are expected to adhere to the APAM Academic Honor Code. You are welcome to discuss course content with other students, but homework should be done individually unless noted otherwise. Students must write and submit their own, original code. Sharing code for individual assignments is prohibited, as is the use of any existing solutions found online or through other means. Violation of these rules will result in a penalty that may include zero credit for the assignment in question or a failing grade for the course.
Office hours
Office hours will be after class on Fridays or by appointment.