DS 300 Data Mining

This project-based course covers the fundamental concepts and core techniques for discovering patterns in large-scale data sets. It consists of three main modules: (1) Data Mining Pipeline, which introduces the key steps of data understanding, data preprocessing, data warehousing, data modeling, and interpretation/evaluation; (2) Data Mining Methods, which covers core techniques for regression, classification, clustering, dimensionality reduction, and association analysis; and (3) Deep Learning, which discusses state-of-the-art deep learning techniques such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), with implementations in TensorFlow.

Instructor:

Dr. Peilong Li

Office:

Esbenshade 284B

Appointments:

By email

Number of Credits

4

Pre-requisites

  • DS 200 Introduction to Data Science
  • CS 309 Database Systems

Textbooks

  • (Required) Aurélien Géron. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. 2nd Edition. O’Reilly, 2019. ISBN-13: 978-1-492-03264-9