Course Schedule
Note: This schedule is tentative and subject to change. Please check it regularly.
Last Updated: 01/13/2025
| Week | Starting | Topics | Reading | Notes |
|---|---|---|---|---|
| 1 | 1/13/25 | M: Lec1 - Course intro; Big Data | Week 1 | Mon: HW1 out |
| | | W: Lec2 - Hadoop Ecosystem | | |
| | | F: Lec3 - HDFS | | |
| 2 | 1/20/25 | M: MLK Day; No Classes | Week 2 | |
| | | W: Lec4 - MapReduce | | Tue: HW1 due |
| | | F: Dr. Li on travel; No Classes | | |
| 3 | 1/27/25 | M: PE1 - MapReduce Exercise | Week 3 | Mon: HW2 out |
| | | W: Lab 1 | | Team formation due |
| | | F: Lec5 - Apache Pig | | |
| 4 | 2/3/25 | M: PE2 - Pig Exercise | Week 4 | |
| | | W: Lab 2 | | |
| | | F: Lec6 - Apache Spark | | Finalize project topic and dataset |
| 5 | 2/10/25 | M: Lec7 - Spark deployment on GCP | Week 5 | Mon: HW2 due; HW3 out |
| | | W: Lec8 - Spark Low-level API | | |
| | | F: Lec9 - Spark DataFrame | | |
| 6 | 2/17/25 | M: PE3 - Spark DataFrame | Week 6 | |
| | | W: Lab 3 | | |
| | | F: Lec10 - Module 1 summary | | |
| 7 | 2/24/25 | M: Exam 1 | | Mon: HW3 due |
| | | W: Dr. Li on travel; No Classes | | |
| | | F: Dr. Li on travel; No Classes | | Fri: Proposal due |
| 8 | 3/3/25 | M: Spring Break; No Classes | | |
| | | W: Spring Break; No Classes | | |
| | | F: Spring Break; No Classes | | |
| 9 | 3/10/25 | M: Lec11 - Spark Machine Learning | Week 9 | Mon: HW4 out |
| | | W: Lec12 - Spark Linear Regression | | |
| | | F: Lec13 - Spark Logistic Regression | | |
| 10 | 3/17/25 | M: Lec14 - Spark Tree Classifiers | Week 10 | |
| | | W: PE4 - Spark ML | | |
| | | F: Lab 4 | | |
| 11 | 3/24/25 | M: Lec15 - Spark KMeans | Week 11 | Mon: Project milestone 1 due (40%) |
| | | W: Lec16 - NLP at Scale | | |
| | | F: Lec17 - NLP with Spark | | |
| 12 | 3/31/25 | M: PE5 - ML at Scale | | Mon: HW4 due |
| | | W: Lab 5 | | |
| | | F: Exam 2 | | |
| 13 | 4/7/25 | M: Paper Presentation: Kafka | Paper 1, 2 | Mon: HW5 out |
| | | W: Paper Presentation: Spark Streaming | | |
| | | F: PE6 - Data Streaming | | |
| 14 | 4/14/25 | M: Lab 6 | Paper 3, 4 | Mon: Project milestone 2 due (80%) |
| | | W: Paper Presentation: NoSQL Overview | | |
| | | F: Easter Break; No Classes | | |
| 15 | 4/21/25 | M: Paper Presentation: HBase | Paper 5, 6 | Mon: HW5 due |
| | | T: SCAD Day: Volunteer Presentation | | |
| | | W: Paper Presentation: DynamoDB | | |
| | | F: Paper Presentation: Cassandra | | |
| 16 | 4/28/25 | M: Project Day; No Classes | | |
| | | W: Project Presentation Group 1 | | Wed by noon: presentation and demo due |
| | | F: Project Presentation Group 2 | | |
| 17 | 5/5/25 | M: Project Group 3: 2:30 - 5:30 PM | | |
| | | W: Have a great summer break! | | Wed: Final report and revised code due |
| | | F: Have a great summer break! | | |
Reading List
Paper List
| Paper # | Topic | URL |
|---|---|---|
| Paper 1 | Apache Kafka | https://notes.stephenholiday.com/Kafka.pdf |
| Paper 2 | Spark Streaming | https://dl.acm.org/doi/10.1145/2517349.2522737 |
| Paper 3 | NoSQL Databases Review | https://dl.acm.org/doi/10.1145/1978915.1978919 |
| Paper 4 | Apache HBase | https://dl.acm.org/doi/10.1145/1365815.1365816 |
| Paper 5 | DynamoDB | https://dl.acm.org/doi/10.1145/1323293.1294281 |
| Paper 6 | Apache Cassandra | https://dl.acm.org/doi/10.1145/1773912.1773922 |