Course Schedule
Part 1: Resources
Week 1
Mon, Sep 2
Labor Day
Week 2
Mon, Sep 9
Deployment (Linux Pipelines)
Read: Designing Data-Intensive Applications, Kleppmann ("Batch Processing with Unix Tools" section of Chapter 10, "Batch Processing")
Watch: Lecture
Slides: PDF
Wed, Sep 11
Deployment (Docker)
Released: P1 (Docker)
Watch: Lecture
Slides: PDF
Worksheet: PDF
Anki Flashcards: Deck
Quiz: week 1
Fri, Sep 13
Network Resources (Overview)
Read: Designing Data-Intensive Applications, Kleppmann (Chapter 4, "Encoding and Evolution")
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Week 3
Wed, Sep 18
Memory Resources (Caching)
Read: Systems Performance, Gregg (6.2.2; "CPU Caches" and "Latency" subsections of 6.4.1)
Worksheet: PDF
Quiz: week 2 and before (cumulative)
Week 4
Wed, Sep 25
Compute Resources (Threads)
Read: Fluent Python, 2nd Edition ("What's New in This Chapter" through "A Bit of Jargon" in Chapter 19, "Concurrency Models in Python")
Worksheet: PDF
Quiz: week 3 and before (cumulative)
Fri, Sep 27
Compute Resources (Locks)
Read: Mastering Concurrency in Python ("Working With Threads In Python" chapter)
Worksheet: PDF
Week 5
Mon, Sep 30
Storage Resources (File Systems)
Wed, Oct 2
Storage Resources (Formats and DBs)
Read: Designing Data-Intensive Applications, Kleppmann ("Transaction Processing or Analytics?" and "Column-Oriented Storage" sections of Chapter 3, "Storage and Retrieval")
Due: P2
Released: P3 (Compute+Storage)
Quiz: week 4 and before (cumulative)
Fri, Oct 4
SQL Databases (MySQL)
Read: MySQL Crash Course, Silva (Chapters 3 and 5); Designing Data-Intensive Applications, Kleppmann ("The Meaning of ACID" section in Chapter 7, "Transactions")
Week 6
Mon, Oct 7
Review
Wed, Oct 9
Midterm (in class)
Quiz: week 5 and before (cumulative)
Fri, Oct 11
HDFS Overview
Part 2: Clusters
Week 7
Mon, Oct 14
HDFS Practice
Read: Mastering Hadoop 3, Singh et al. ("Deep Dive Into the Hadoop Distributed File System" chapter)
Released: P4 (HDFS, Loans)
Due: P3
Wed, Oct 16
MapReduce
Read: Learning Spark, 2nd edition by Damji et al. (sections "The Importance of an Optimal Storage Solution", "Databases", and "Data Lakes" of Chapter 9, "Building Reliable Data Lakes with Apache Spark")
Quiz: week 6 and before (cumulative)
Fri, Oct 18
Spark RDDs
Week 8
Mon, Oct 21
Spark DataFrames
Read: Learning Spark, 2nd edition by Damji et al. (Chapter 4, "Spark SQL and DataFrames: Introduction to Built-in Data Sources")
Wed, Oct 23
Spark SQL
Read: Designing Data-Intensive Applications, Kleppmann ("Reduce-Side Joins and Grouping" section of Chapter 10, "Batch Processing")
Fri, Oct 25
Spark Internals and Performance
Read: Learning Spark, 2nd edition by Damji et al. (Chapter 7, "Optimizing and Tuning Spark Applications")
Due: P4
Released: P5 (Spark, Loans)
Week 9
Mon, Oct 28
Spark Machine Learning
Read: Learning Spark, 2nd edition by Damji et al. (Chapter 10, "Machine Learning with MLlib")
Wed, Oct 30
Wide Tables: HBase and Cassandra
Read: Cassandra, The Definitive Guide, by Carpenter et al. (Chapter 4, "The Cassandra Query Language")
Quiz: week 8 and before (cumulative)
Fri, Nov 1
Cassandra Query Language (CQL)
Week 10
Mon, Nov 4
Cassandra Partitioning
Read: Cassandra, The Definitive Guide, by Carpenter et al. (sections "Data Centers and Racks" to "Hinted Handoff" of Chapter 6, "The Cassandra Architecture")
Worksheet: PDF
Wed, Nov 6
Cassandra Replication
Due: P5
Released: P6 (Cassandra, Weather)
Quiz: week 9 and before (cumulative)
Fri, Nov 8
Review
Week 11
Mon, Nov 11
Midterm (in class)
Wed, Nov 13
Streaming: Kafka Concepts
Read: Kafka, The Definitive Guide, 2nd edition by Shapira et al. ("Enter Kafka" section of Chapter 1, "Meet Kafka")
Quiz: week 10 and before (cumulative)
Fri, Nov 15
Streaming: Kafka Demos
Week 12
Mon, Nov 18
Streaming: Kafka Reliability
Read: Kafka, The Definitive Guide, 2nd edition by Shapira et al. (Chapter 7, "Reliable Data Delivery")
Wed, Nov 20
Streaming: Spark Programming
Read: Learning Spark, 2nd edition by Damji et al. (Chapter 8, "Structured Streaming")
Due: P6
Released: P7 (Kafka, Weather Stations)
Quiz: week 11 and before (cumulative)
Fri, Nov 22
Streaming: Spark Concepts
Part 3: Cloud
Week 13
Mon, Nov 25
The Cloud
Wed, Nov 27
BigQuery 1
Quiz: week 12 and before (cumulative)
Fri, Nov 29
Thanksgiving Break
Week 14
Mon, Dec 2
BigQuery 2
Read: Google BigQuery: The Definitive Guide, by Lakshmanan et al. ("BigQuery Geographic Information Systems" section of Chapter 8, "Advanced Queries")
Wed, Dec 4
BigQuery 3
Read: Google BigQuery: The Definitive Guide, by Lakshmanan et al. (Chapter 9, "Machine Learning in BigQuery")
Due: P7
Released: P8 (BigQuery, Loans)
Quiz: week 13 and before (cumulative)
Fri, Dec 6
BigQuery 4
Week 15
Mon, Dec 9
Cloud Deployment
Wed, Dec 11
Review
Due: P8