Course Schedule
Note: This schedule is tentative and subject to change. Please check it regularly.
Last Updated: 01/13/2025
| Week | Starting | Topics | Reading | Notes |
|---|---|---|---|---|
| 1 | 1/13/25 | M: Lec1 - Course intro; Big Data | Week 1 | Mon: HW1 out |
| | | W: Lec2 - Hadoop Ecosystem | | |
| | | F: Lec3 - HDFS | | |
| 2 | 1/20/25 | M: MLK Day; No Classes | Week 2 | |
| | | W: Lec4 - MapReduce | | Tue: HW1 due |
| | | F: Dr. Li on travel; No Classes | | |
| 3 | 1/27/25 | M: PE1 - MapReduce Exercise | Week 3 | Mon: HW2 out |
| | | W: Lab 1 | | Team formation due |
| | | F: Lec5 - Apache Pig | | |
| 4 | 2/3/25 | M: PE2 - Pig Exercise | Week 4 | |
| | | W: Lab 2 | | |
| | | F: Lec6 - Apache Spark | | Finalize project topic and dataset |
| 5 | 2/10/25 | M: Lec7 - Spark deployment on GCP | Week 5 | Mon: HW2 due; HW3 out |
| | | W: Lec8 - Spark Low-level API | | |
| | | F: Lec9 - Spark DataFrame | | |
| 6 | 2/17/25 | M: PE3 - Spark DataFrame | Week 6 | |
| | | W: Lab 3 | | |
| | | F: Lec10 - Module 1 summary | | |
| 7 | 2/24/25 | M: Exam 1 | | Mon: HW3 due |
| | | W: Dr. Li on travel; No Classes | | |
| | | F: Dr. Li on travel; No Classes | | Fri: Proposal due |
| 8 | 3/3/25 | M: Spring Break; No Classes | | |
| | | W: Spring Break; No Classes | | |
| | | F: Spring Break; No Classes | | |
| 9 | 3/10/25 | M: Lec11 - Spark Machine Learning | Week 9 | Mon: HW4 out |
| | | W: Lec12 - Spark Linear Regression | | |
| | | F: Lec13 - Spark Logistic Regression | | |
| 10 | 3/17/25 | M: Lec14 - Spark Tree Classifiers | Week 10 | |
| | | W: PE4 - Spark ML | | |
| | | F: Lab 4 | | |
| 11 | 3/24/25 | M: Lec15 - Spark KMeans | Week 11 | Mon: Project milestone 1 due (40%) |
| | | W: Lec16 - NLP at Scale | | |
| | | F: Lec17 - NLP with Spark | | |
| 12 | 3/31/25 | M: PE5 - ML at Scale | | Mon: HW4 due |
| | | W: Lab 5 | | |
| | | F: Exam 2 | | |
| 13 | 4/7/25 | M: Paper Presentation: Kafka | Paper 1, 2 | Mon: HW5 out |
| | | W: Paper Presentation: Spark Streaming | | |
| | | F: PE6 - Data Streaming | | |
| 14 | 4/14/25 | M: Lab 6 | Paper 3, 4 | Mon: Project milestone 2 due (80%) |
| | | W: Paper Presentation: NoSQL Overview | | |
| | | F: Easter Break; No Classes | | |
| 15 | 4/21/25 | M: Paper Presentation: HBase | Paper 5, 6 | Mon: HW5 due |
| | | T: SCAD Day: Volunteer Presentation | | |
| | | W: Paper Presentation: DynamoDB | | |
| | | F: Paper Presentation: Cassandra | | |
| 16 | 4/28/25 | M: Project Day; No Classes | | |
| | | W: Project Presentation Group 1 | | Wed by noon: presentation and demo due |
| | | F: Project Presentation Group 2 | | |
| 17 | 5/5/25 | M: Project Group 3: 2:30 - 5:30 PM | | |
| | | W: Have a great summer break! | | Wed: Final report and revised code due |
| | | F: Have a great summer break! | | |
Reading List
Paper List
Paper # | Topic | URL |
---|---|---|
Paper 1 | Apache Kafka | https://notes.stephenholiday.com/Kafka.pdf |
Paper 2 | Spark Streaming | https://dl.acm.org/doi/10.1145/2517349.2522737 |
Paper 3 | NoSQL Databases Review | https://dl.acm.org/doi/10.1145/1978915.1978919 |
Paper 4 | Apache HBase | https://dl.acm.org/doi/10.1145/1365815.1365816 |
Paper 5 | DynamoDB | https://dl.acm.org/doi/10.1145/1323293.1294281 |
Paper 6 | Apache Cassandra | https://dl.acm.org/doi/10.1145/1773912.1773922 |