Nanodegree key: nd0277
Version: 2.2.5
Locale: en-us
Learn to design data models, build data warehouses, build data lakes and lakehouse architecture, create data pipelines, and work with large datasets on the Azure platform using Azure Synapse Analytics, Azure Databricks, and Azure Data Factory.
Content
Part 01 : Welcome to the Nanodegree Program!
Welcome to Udacity! We're excited to share more about your nanodegree and start this journey with you!
-
Module 01: Welcome to the Nanodegree Program!
-
Lesson 01: Welcome!
Welcome to Udacity. Take 5 minutes to get familiar with Udacity courses and gain some tips for succeeding in them.
-
Lesson 02: Getting Help
You are starting a challenging but rewarding journey! Take 5 minutes to read how to get help with projects and content.
-
Part 02 : Data Modeling
Learn to create relational and NoSQL data models to fit the diverse needs of data consumers. Use ETL to build databases in PostgreSQL and Apache Cassandra.
-
Module 01: Data Modeling
-
Lesson 01: Introduction to Data Modeling
In this lesson, students will learn the basic difference between relational and non-relational databases, and how each type of database fits the diverse needs of data consumers.
- Concept 01: Introduction to the Course
- Concept 02: What is Data Modeling?
- Concept 03: Why is Data Modeling Important?
- Concept 04: Who does this type of work?
- Concept 05: Intro to Relational Databases
- Concept 06: Relational Database Quiz
- Concept 07: When to use a relational database?
- Concept 08: ACID Transactions
- Concept 09: When Not to Use a Relational Database
- Concept 10: What is PostgreSQL?
- Concept 11: Demos: Creating a Postgres Table
- Concept 12: Jupyter Workspace - Overview
- Concept 13: Exercise 1: Creating a Table with Postgres
- Concept 14: Solution for Exercise 1: Create a Table with Postgres
- Concept 15: NoSQL Databases
- Concept 16: What is Apache Cassandra?
- Concept 17: When to Use a NoSql Database
- Concept 18: When Not to Use a NoSql Database
- Concept 19: Demo 2: Creating table with Cassandra
- Concept 20: Exercise 2: Create table with Cassandra
- Concept 21: Solution for Exercise 2: Create table with Cassandra
- Concept 22: Conclusion
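As a quick illustration of the kind of Postgres work this lesson's demos and exercises cover, here is a minimal sketch of creating a table and inserting a row, assuming a local Postgres instance and the psycopg2 driver; the connection details and the songs table are illustrative, not the lesson's exact exercise.

```python
import psycopg2

# Connect to a local Postgres instance (connection details are illustrative).
conn = psycopg2.connect(host="127.0.0.1", dbname="studentdb",
                        user="student", password="student")
conn.autocommit = True
cur = conn.cursor()

# Create a simple table for song data (schema is a made-up example).
cur.execute("""
    CREATE TABLE IF NOT EXISTS songs (
        song_id VARCHAR PRIMARY KEY,
        title   VARCHAR NOT NULL,
        artist  VARCHAR,
        year    INT
    );
""")

# Insert a row and read it back to confirm the table works.
cur.execute("INSERT INTO songs (song_id, title, artist, year) VALUES (%s, %s, %s, %s)",
            ("S1", "Example Song", "Example Artist", 2020))
cur.execute("SELECT * FROM songs;")
print(cur.fetchall())

cur.close()
conn.close()
```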
-
Lesson 02: Relational Data Models
In this lesson, students will learn the purpose of data modeling and the strengths and weaknesses of relational databases, and will create schemas and tables in Postgres.
- Concept 01: Lesson Overview
- Concept 02: Databases
- Concept 03: Importance of Relational Databases
- Concept 04: OLAP vs OLTP
- Concept 05: Quiz 1
- Concept 06: Structuring the Database: Normalization
- Concept 07: Objectives of Normal Form
- Concept 08: Normal Forms
- Concept 09: Demo 1: Creating Normalized Tables
- Concept 10: Exercise 1: Creating Normalized Tables
- Concept 11: Solution: Exercise 1: Creating Normalized Tables
- Concept 12: Denormalization
- Concept 13: Demo 2: Creating Denormalized Tables
- Concept 14: Denormalization Vs. Normalization
- Concept 15: Exercise 2: Creating Denormalized Tables
- Concept 16: Solution: Exercise 2: Creating Denormalized Tables
- Concept 17: Fact and Dimension Tables
- Concept 18: Star Schemas
- Concept 19: Benefits of Star Schemas
- Concept 20: Snowflake Schemas
- Concept 21: Demo 3: Creating Fact and Dimension Tables
- Concept 22: Exercise 3: Creating Fact and Dimension Tables
- Concept 23: Solution: Exercise 3: Creating Fact and Dimension Tables
- Concept 24: Data Definition and Constraints
- Concept 25: Upsert
- Concept 26: Conclusion
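A small sketch of the fact/dimension and upsert ideas this lesson covers, again assuming a local Postgres instance and psycopg2; the customer and sales tables are made-up examples rather than the lesson's exact schema.

```python
import psycopg2

# Connection details are illustrative, as in the earlier sketch.
conn = psycopg2.connect(host="127.0.0.1", dbname="studentdb",
                        user="student", password="student")
conn.autocommit = True
cur = conn.cursor()

# Dimension table: one row per customer.
cur.execute("""
    CREATE TABLE IF NOT EXISTS dim_customer (
        customer_id INT PRIMARY KEY,
        name        VARCHAR
    );
""")

# Fact table: one row per sale, referencing the dimension by key (a simple star schema).
cur.execute("""
    CREATE TABLE IF NOT EXISTS fact_sales (
        sale_id     SERIAL PRIMARY KEY,
        customer_id INT REFERENCES dim_customer (customer_id),
        amount      NUMERIC
    );
""")

# Upsert into the dimension: insert, or update the name if the key already exists.
cur.execute("""
    INSERT INTO dim_customer (customer_id, name)
    VALUES (%s, %s)
    ON CONFLICT (customer_id) DO UPDATE SET name = EXCLUDED.name;
""", (1, "Amanda"))

cur.close()
conn.close()
```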
-
Lesson 03: NoSQL Data Models
Students will learn when to use non-relational databases based on the business's data needs, understand their strengths and weaknesses, and learn how to create tables in Apache Cassandra.
- Concept 01: Lesson Overview
- Concept 02: Non-Relational Databases
- Concept 03: Distributed Databases
- Concept 04: CAP Theorem
- Concept 05: Quiz 1
- Concept 06: Denormalization in Apache Cassandra
- Concept 07: CQL
- Concept 08: Demo 1
- Concept 09: Exercise 1
- Concept 10: Exercise 1 Solution
- Concept 11: Primary Key
- Concept 12: Primary Key Quiz
- Concept 13: Demo 2
- Concept 14: Exercise 2
- Concept 15: Exercise 2: Solution
- Concept 16: Clustering Columns
- Concept 17: Demo 3
- Concept 18: Exercise 3
- Concept 19: Exercise 3: Solution
- Concept 20: WHERE Clause
- Concept 21: Demo 4
- Concept 22: Exercise 4
- Concept 23: Lesson Wrap Up
- Concept 24: Course Wrap Up
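A minimal sketch of the Cassandra concepts in this lesson: a keyspace, a table whose primary key combines a partition key and a clustering column, and a WHERE clause on the partition key. It assumes a local Cassandra node and the DataStax cassandra-driver package; the keyspace and table are illustrative.

```python
from cassandra.cluster import Cluster

# Connect to a local Cassandra node (address is illustrative).
cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS music
    WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.set_keyspace("music")

# The primary key is chosen for the query: partition by artist,
# cluster (sort) by year within each partition.
session.execute("""
    CREATE TABLE IF NOT EXISTS songs_by_artist (
        artist TEXT,
        year INT,
        title TEXT,
        PRIMARY KEY ((artist), year)
    )
""")

session.execute(
    "INSERT INTO songs_by_artist (artist, year, title) VALUES (%s, %s, %s)",
    ("Example Artist", 2020, "Example Song"),
)

# WHERE clauses in Cassandra should filter on the partition key.
rows = session.execute(
    "SELECT * FROM songs_by_artist WHERE artist = %s", ("Example Artist",)
)
for row in rows:
    print(row.artist, row.year, row.title)

cluster.shutdown()
```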
-
Lesson 04: Project: Data Modeling with Apache Cassandra
Students will model event data to create a non-relational database and ETL pipeline for a music streaming app. They will define queries and tables for a database built using Apache Cassandra.
-
Part 03 : Cloud Data Warehouses with Azure
In this course, you will learn to create cloud-based data warehouses, sharpen your data warehousing skills, deepen your knowledge of data infrastructure, and get an introduction to data engineering in the cloud using Azure.
-
Module 01:
-
Lesson 01: Introduction to Cloud Data Warehouses with Azure
In this lesson, you'll learn about the course, including the prerequisites, tools, environment, and course project.
- Concept 01: Welcome to Cloud Data Warehouses
- Concept 02: Course Outline
- Concept 03: Prerequisites
- Concept 04: Meet Your Instructor
- Concept 05: History of Data Warehouses
- Concept 06: Roles and Stakeholders
- Concept 07: Project Preview
- Concept 08: Tools and Environment
- Concept 09: Sign in to Azure and Monitor Costs
- Concept 10: Lesson Summary
-
Lesson 02: Introduction to Data Warehouses
In this lesson, you'll be introduced to data warehouses, ETL, and OLAP cubes.
- Concept 01: Lesson Introduction
- Concept 02: Data Warehouse: Business Perspective
- Concept 03: Operational vs. Analytical Processes
- Concept 04: Data Warehouse: Technical Perspective
- Concept 05: Dimensional Modeling
- Concept 06: ETL Demo: Step 1 & 2
- Concept 07: Exercise 1: Step 1 & 2
- Concept 08: ETL Demo: Step 3
- Concept 09: Exercise 1: Step 3
- Concept 10: ETL Demo: Step 4
- Concept 11: Exercise 1: Step 4
- Concept 12: ETL Demo: Step 5
- Concept 13: Exercise 1: Step 5
- Concept 14: ETL Demo: Step 6
- Concept 15: Exercise 1: Step 6
- Concept 16: Exercise Solution 1: 3NF to Star Schema
- Concept 17: DWH Architecture: Kimball's Bus Architecture
- Concept 18: DWH Architecture: Independent Data Marts
- Concept 19: DWH Architecture: CIF
- Concept 20: DWH Architecture: Hybrid Bus & CIF
- Concept 21: OLAP Cubes
- Concept 22: OLAP Cubes: Roll-Up and Drill Down
- Concept 23: OLAP Cubes: Slice and Dice
- Concept 24: OLAP Cubes: Query Optimization
- Concept 25: OLAP Cubes Demo: Slicing & Dicing
- Concept 26: Exercise 2 Part 1: Slicing & Dicing
- Concept 27: OLAP Cubes Demo: Roll-Up
- Concept 28: Exercise 2 Part 2: Roll-Up & Drill Down
- Concept 29: OLAP Cubes Demo: Grouping Sets
- Concept 30: Exercise 2 Part 3: Grouping Sets
- Concept 31: OLAP Cubes Demo: CUBE
- Concept 32: Exercise 2 Part 4: CUBE
- Concept 33: Exercise 2: Solution
- Concept 34: Data Warehouse Technologies
- Concept 35: Demo: Column format in ROLAP
- Concept 36: Exercise 3: Column format in ROLAP
- Concept 37: Solution: Column format in ROLAP
- Concept 38: Lesson Review
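A brief sketch of the slicing, roll-up, and grouping-set style queries the OLAP cube demos work through, run as plain SQL from Python with psycopg2; the connection details and the factsales table (branch, month, revenue) are illustrative.

```python
import psycopg2

# Connection details are illustrative; assumes a fact table factsales(branch, month, revenue).
conn = psycopg2.connect(host="127.0.0.1", dbname="studentdb",
                        user="student", password="student")
cur = conn.cursor()

# GROUPING SETS: revenue by branch, by month, and the grand total in one query.
cur.execute("""
    SELECT branch, month, SUM(revenue)
    FROM factsales
    GROUP BY GROUPING SETS ((branch), (month), ());
""")
print(cur.fetchall())

# CUBE: every combination of the listed dimensions, including the grand total.
cur.execute("""
    SELECT branch, month, SUM(revenue)
    FROM factsales
    GROUP BY CUBE (branch, month);
""")
print(cur.fetchall())

conn.close()
```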
-
Lesson 03: ELT and Data Warehouse Technology in the Cloud
In this lesson, you'll learn about ELT, the differences between ETL and ELT, and general cloud data warehouse technologies.
- Concept 01: Introduction to Cloud Data Warehouses
- Concept 02: Lesson Outline
- Concept 03: Expert Perspective: Cloud Data Warehouses
- Concept 04: From ETL to ELT
- Concept 05: ELT Quizzes
- Concept 06: Exercise: ELT
- Concept 07: Solution: ELT
- Concept 08: Cloud Managed SQL Storage
- Concept 09: Cloud Managed NoSQL Storage
- Concept 10: Cloud Data Storage Quizzes
- Concept 11: Cloud ETL Pipeline Services
- Concept 12: Cloud Pipeline Services Quizzes
- Concept 13: Cloud Data Warehouse Solutions
- Concept 14: Cloud Data Warehouse Solutions Quizzes
- Concept 15: Lesson Review
-
Lesson 04: Azure Data Warehouse Technologies
In this lesson, you will learn about specific data warehouse technologies and solutions in Azure.
- Concept 01: Introduction to Azure Data Warehouse Technologies
- Concept 02: Lesson Outline
- Concept 03: Expert Perspective: Data Warehouse on Azure
- Concept 04: Azure Data Warehouse Solutions
- Concept 05: Demo: Configuring Azure Synapse and Tour
- Concept 06: Azure Data Warehouse Solutions Quizzes
- Concept 07: Azure Data Warehouse Components
- Concept 08: Demo: Azure Storage Components
- Concept 09: Exercise: Azure Components
- Concept 10: Solution: Azure Components
- Concept 11: Azure Components Quizzes
- Concept 12: When To Use Azure Data Warehouse Components
- Concept 13: Lesson Review
-
Lesson 05: Implementing Data Warehouses in Azure
In this lesson, you will have the opportunity to implement a data warehouse in Azure using Synapse.
- Concept 01: Lesson Introduction
- Concept 02: Data Warehouses in Azure: a Closer Look
- Concept 03: Azure Data Warehouse Architectures
- Concept 04: Azure Data Warehouse Architectures Quiz
- Concept 05: Exercise: Azure Synapse Workspace
- Concept 06: Solution: Azure Synapse Workspace
- Concept 07: Ingesting Data at Scale into Azure Synapse
- Concept 08: Demo: Ingesting Data into Azure Synapse
- Concept 09: Ingesting Data at Scale in Azure Quiz
- Concept 10: SQL to SQL ELT in Azure
- Concept 11: Demo: Creating Staging Tables using Azure Synapse
- Concept 12: Exercise: SQL to SQL ELT in Azure
- Concept 13: Solution: SQL to SQL ELT in Azure
- Concept 14: Lesson Review
- Concept 15: Course Summary
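A rough sketch of the load-then-transform pattern this lesson covers in a Synapse dedicated SQL pool: COPY INTO to ingest files into a staging table, then CREATE TABLE AS SELECT for the ELT step. It assumes pyodbc and a dedicated SQL pool; the server, storage path, and table names are placeholders, not the course's exact resources.

```python
import pyodbc

# Connection string values are placeholders for a Synapse dedicated SQL pool.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<workspace>.sql.azuresynapse.net;DATABASE=<pool>;"
    "UID=<user>;PWD=<password>"
)
conn.autocommit = True
cur = conn.cursor()

# Bulk-load raw CSV files from blob storage into a staging table
# (authentication options omitted for brevity).
cur.execute("""
    COPY INTO dbo.staging_trip
    FROM 'https://<account>.blob.core.windows.net/<container>/trips/'
    WITH (FILE_TYPE = 'CSV', FIRSTROW = 2)
""")

# ELT step: transform inside the warehouse with CREATE TABLE AS SELECT.
cur.execute("""
    CREATE TABLE dbo.fact_trip
    WITH (DISTRIBUTION = ROUND_ROBIN)
    AS
    SELECT trip_id, rider_id, CAST(duration AS INT) AS duration_seconds
    FROM dbo.staging_trip
""")

conn.close()
```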
-
Lesson 06: Building an Azure Data Warehouse for Bike Share Data Analytics
In this project, you will develop a data warehouse solution using Azure Synapse Analytics to analyze bike share data.
Project Description - Project: Building an Azure Data Warehouse for Bike Share Data Analytics
Project Rubric - Project: Building an Azure Data Warehouse for Bike Share Data Analytics
-
Part 04 : Data Lakes and Lakehouses with Spark and Azure Databricks
Learn about the big data ecosystem and how to use Spark to work with massive datasets. Learners will also store big data in a data lake and develop Lakehouse architecture on the Azure Databricks platform.
-
Module 01:
-
Lesson 01: Course Introduction
In this lesson, you'll learn about the course, including the prerequisites, tools, environment, and course project.
-
Lesson 02: Big Data Ecosystem, Data Lakes, and Spark
In this lesson, you will learn about the problems that Apache Spark is designed to solve. You'll also learn about the greater Big Data ecosystem and how Spark fits into it.
- Concept 01: Introduction to Big Data Ecosystem, Spark, and Data Lakes
- Concept 02: Lesson Outline
- Concept 03: Insight: From Hadoop to Data Lakehouse
- Concept 04: The Hadoop Ecosystem
- Concept 05: MapReduce
- Concept 06: Hadoop MapReduce Exercise
- Concept 07: Why Spark?
- Concept 08: The Spark Cluster
- Concept 09: Spark Use Cases
- Concept 10: Data Lakes
- Concept 11: Data Lakes, Lakehouse, and Spark
- Concept 12: Data Lakehouse
- Concept 13: Summary
-
Lesson 03: Data Wrangling with Spark
In this lesson, we'll dive into how to use Spark for cleaning and aggregating data.
- Concept 01: Lesson Outline
- Concept 02: Functional Programming
- Concept 03: Why Use Functional Programming
- Concept 04: Procedural Example
- Concept 05: Procedural Programming Exercise
- Concept 06: Pure Functions in the Bread Factory
- Concept 07: The Spark DAGs: Recipe for Data
- Concept 08: Maps and Lambda Functions
- Concept 09: Maps and Lambda Functions Exercise
- Concept 10: Data Formats
- Concept 11: Distributed Data Stores
- Concept 12: SparkSession
- Concept 13: Reading and Writing Data into Spark Data Frames
- Concept 14: Read and Write Data into Spark Data Frames Exercise
- Concept 15: Imperative vs Declarative programming
- Concept 16: Data Wrangling with DataFrames
- Concept 17: Data Wrangling with DataFrames Extra Tips
- Concept 18: Data Wrangling with Spark Exercise
- Concept 19: Quiz - Data Wrangling with DataFrames
- Concept 20: Quiz - Data Wrangling with DataFrames Jupyter Notebook
- Concept 21: Quiz - Data Wrangling with DataFrames Solution Code
- Concept 22: Spark SQL
- Concept 23: Example Spark SQL
- Concept 24: Example Spark SQL Exercise
- Concept 25: Quiz - Data Wrangling with SparkSQL
- Concept 26: Quiz Data Wrangling with SparkSQL Solution
- Concept 27: RDDs
- Concept 28: Lesson Summary
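A compact sketch of the DataFrame and Spark SQL wrangling this lesson walks through, assuming PySpark is available locally; the log file path and columns are illustrative.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("wrangling-sketch").getOrCreate()

# Read a CSV log file into a DataFrame (path and columns are illustrative).
df = spark.read.csv("data/user_log.csv", header=True, inferSchema=True)

# Declarative wrangling: filter rows, then aggregate per user.
plays_per_user = (
    df.filter(F.col("page") == "NextSong")
      .groupBy("userId")
      .count()
      .orderBy(F.desc("count"))
)
plays_per_user.show(5)

# The same question asked with Spark SQL.
df.createOrReplaceTempView("user_log")
spark.sql("""
    SELECT userId, COUNT(*) AS plays
    FROM user_log
    WHERE page = 'NextSong'
    GROUP BY userId
    ORDER BY plays DESC
""").show(5)

spark.stop()
```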
-
Lesson 04: Spark Debugging and Optimization
In this lesson, you will learn best practices for debugging and optimizing your Spark applications.
- Concept 01: Lesson Outline
- Concept 02: Debugging is Hard
- Concept 03: Intro: Syntax Errors
- Concept 04: Code Errors
- Concept 05: Data Errors
- Concept 06: Debugging your Code
- Concept 07: How to Use Accumulators
- Concept 08: Spark Broadcast
- Concept 09: Spark Broadcast Exercise
- Concept 10: Different types of Spark Functions
- Concept 11: Introduction to Code Optimization
- Concept 12: Understanding Data Skewness
- Concept 13: Example of Data Skewness
- Concept 14: Optimizing for Data Skewness
- Concept 15: Other Issues and How to Address Them
- Concept 16: Lesson Summary
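A small sketch of two tools this lesson introduces: accumulators for counting events (such as bad records) across executors, and broadcast variables for shipping a small lookup table to every executor once. The data is made up.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("debug-sketch").getOrCreate()
sc = spark.sparkContext

# Accumulator: executors add to it, the driver reads it after an action.
bad_records = sc.accumulator(0)

def parse(line):
    try:
        return int(line)
    except ValueError:
        bad_records.add(1)
        return None

parsed = sc.parallelize(["1", "2", "oops", "4"]).map(parse).filter(lambda x: x is not None)
parsed.collect()
print("bad records:", bad_records.value)

# Broadcast: send a small lookup table to every executor once, not with every task.
lookup = sc.broadcast({1: "one", 2: "two", 4: "four"})
named = parsed.map(lambda x: lookup.value.get(x, "unknown"))
print(named.collect())

spark.stop()
```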
-
Lesson 05: Azure Databricks
In this lesson, you'll create Spark clusters and write Spark code on the Azure Databricks platform.
- Concept 01: Introduction to Azure Databricks
- Concept 02: Lesson Outline
- Concept 03: Expert Insight: Azure Databricks
- Concept 04: Azure Databricks
- Concept 05: Databricks Quizzes
- Concept 06: Using Spark on Azure
- Concept 07: Demo: Create Databricks Workspace
- Concept 08: Spark on Azure Quizzes
- Concept 09: Creating a Spark Cluster in Databricks
- Concept 10: Demo: Creating Spark Cluster
- Concept 11: Writing Spark Scripts in Databricks
- Concept 12: Demo: Reading and Writing Data
- Concept 13: Exercise: Reading and Writing Data in Databricks
- Concept 14: Writing Spark Scripts in Databricks Quizzes
- Concept 15: Lesson Review
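A minimal sketch of reading and writing data from a Databricks notebook, where the SparkSession is already available as `spark`; the ADLS Gen2 account, container, and paths are placeholders, and the cluster is assumed to already have access to the storage account.

```python
# In a Databricks notebook the SparkSession is already available as `spark`.
# The storage account, container, and paths below are placeholders.

# Read a CSV file from an ADLS Gen2 container the cluster can access.
df = spark.read.csv(
    "abfss://<container>@<account>.dfs.core.windows.net/raw/trips.csv",
    header=True, inferSchema=True,
)
df.printSchema()

# Write the result back out as Parquet for downstream steps.
df.write.mode("overwrite").parquet(
    "abfss://<container>@<account>.dfs.core.windows.net/processed/trips"
)
```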
-
Lesson 06: Data Lakes and Lakehouse with Azure Databricks
In this lesson, you'll create data lakes and a Lakehouse architecture on the Azure Databricks platform.
- Concept 01: Introduction to Data Lakes and Lakehouse with Azure Databricks
- Concept 02: Lesson Outline
- Concept 03: Expert Insight: Data Lake and Lakehouse on Azure
- Concept 04: Azure Data Lake Gen2
- Concept 05: Azure Data Lake Gen2 quizzes
- Concept 06: Delta Lake using Azure Databricks
- Concept 07: Demo: Uploading files to Delta using Databricks
- Concept 08: Demo: Ingesting Data
- Concept 09: Creating and Deleting Tables
- Concept 10: Demo: Creating and Deleting Tables
- Concept 11: Reading and Writing Data
- Concept 12: Exercise: Data Wrangling in Databricks
- Concept 13: Exercise: SQL Data Wrangling in Databricks
- Concept 14: Azure Delta Lake Quizzes
- Concept 15: Building a Data Lake using Databricks and Delta Lake
- Concept 16: Demo: Processing Delta Lake Table Stages
- Concept 17: Lesson Review
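A short sketch of the Delta Lake workflow this lesson covers in a Databricks notebook: load raw data, register a Delta table, transform it with SQL, and use Delta's update/delete support. The paths and table names are illustrative.

```python
# In a Databricks notebook; `spark` is pre-defined and Delta Lake is built in.
# The paths and table names are placeholders.

raw = spark.read.csv("/mnt/raw/riders.csv", header=True, inferSchema=True)

# Write a Delta table registered in the metastore (a staging / "bronze" stage).
raw.write.format("delta").mode("overwrite").saveAsTable("staging_riders")

# Transform with SQL and persist a refined ("gold") Delta table.
spark.sql("""
    CREATE OR REPLACE TABLE gold_riders AS
    SELECT rider_id, first_name, last_name
    FROM staging_riders
    WHERE rider_id IS NOT NULL
""")

# Delta tables support deletes and updates, unlike plain Parquet files.
spark.sql("DELETE FROM staging_riders WHERE rider_id IS NULL")
```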
-
Lesson 07: Building an Azure Data Lake for Bike Share Data Analytics
In this project, you'll implement Lakehouse architecture on the Azure Databricks platform.
-
Part 05 : Data Pipelines with Azure
In this course, you’ll learn to build, orchestrate, automate, and monitor data pipelines in Azure using Azure Data Factory and Azure Synapse Analytics pipelines. You’ll build, trigger, and monitor data pipelines on the Azure platform for analytical workloads, run data transformations, optimize data flows, and work with data pipelines in production.
-
Module 01: Data Pipelines with Azure
-
Lesson 01: Introduction to Data Pipelines
Welcome to Data Pipelines with Azure! In this lesson, you'll learn more about the course, including the prerequisites and tools you'll be using.
-
Lesson 02: Azure Data Pipeline Components
In this lesson, you'll learn about the components of data pipelines in Azure Data Factory and Synapse Analytics.
- Concept 01: Introduction
- Concept 02: Lesson Outline
- Concept 03: Expert Perspective: Data Pipelines
- Concept 04: Pipelines and Activities
- Concept 05: Data Pipeline Components Quiz
- Concept 06: Exercise: Create Azure Resources
- Concept 07: Solution: Creating Azure Resources
- Concept 08: Pipeline Component: Linked Services
- Concept 09: Linked Services Quizzes
- Concept 10: Exercise: Creating Linked Services
- Concept 11: Pipeline Components: Datasets
- Concept 12: Dataset Quizzes
- Concept 13: Exercise: Create Datasets
- Concept 14: Solution: Create Datasets
- Concept 15: Integration Runtimes
- Concept 16: Integration Runtime Quizzes
- Concept 17: Exercise: Create Integration Runtimes
- Concept 18: Lesson Review
-
Lesson 03: Transforming Data in Azure Data Pipelines
In this lesson, you'll learn about transforming data as it moves through activities in a pipeline.
- Concept 01: Introduction
- Concept 02: Lesson Outline
- Concept 03: What are Mapping Data Flows?
- Concept 04: Mapping Dataflows Quiz
- Concept 05: Expression Builder
- Concept 06: Expression Builder Quiz
- Concept 07: Exercise: Mapping Dataflows
- Concept 08: Solution: Mapping Dataflows
- Concept 09: Exercise: Transform and Aggregate Data using Data Flows
- Concept 10: Solution: Transform and Aggregate Data using Data Flows
- Concept 11: Exercise: Create Pipeline Activity
- Concept 12: Trigger and Debug Pipelines
- Concept 13: Exercise: Debug, Trigger and Monitor Pipelines
- Concept 14: ADF and Synapse Pipelines Quiz
- Concept 15: Transforming Data on External Computes
- Concept 16: Exercise: Notebooks using Synapse Pipelines
- Concept 17: What is Power Query?
- Concept 18: Power Query Quiz
- Concept 19: Lesson Review
-
Lesson 04: Azure Pipeline Data Quality
In this lesson, you’ll learn about optimizing Azure Data Factory and Synapse Analytics Pipelines for data quality and integrity.
-
Lesson 05: Azure Data Pipelines in Production
In this lesson, you'll learn about features of Azure Data Factory and Synapse Analytics pipelines that you’ll use in production.
- Concept 01: Introduction
- Concept 02: Lesson Outline
- Concept 03: Parameterizing Pipelines
- Concept 04: Parameterizing Pipelines Quiz
- Concept 05: Exercise: Parameterizing Pipelines
- Concept 06: Creating ADF Objects Programmatically
- Concept 07: Creating Objects Programmatically Quiz
- Concept 08: Exercise: Create ADF Objects with Azure CLI
- Concept 09: Automated Pipelines with DevOps
- Concept 10: Pipeline DevOps Quiz
- Concept 11: Exercise: Automated Pipelines with DevOps
- Concept 12: Lesson Review
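A rough sketch of creating ADF objects programmatically, shown here with the Python management SDK rather than the Azure CLI used in the exercise; it assumes the azure-identity and azure-mgmt-datafactory packages, and the subscription, resource group, and object names are placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import Factory, PipelineResource, WaitActivity

# Subscription, resource group, and names below are placeholders.
credential = DefaultAzureCredential()
adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

# Create (or update) a data factory.
adf_client.factories.create_or_update(
    "<resource-group>", "<factory-name>", Factory(location="eastus")
)

# Create a trivial pipeline with a single Wait activity.
pipeline = PipelineResource(
    activities=[WaitActivity(name="WaitABit", wait_time_in_seconds=10)]
)
adf_client.pipelines.create_or_update(
    "<resource-group>", "<factory-name>", "demo_pipeline", pipeline
)

# Trigger a run and capture its run ID for monitoring.
run = adf_client.pipelines.create_run("<resource-group>", "<factory-name>", "demo_pipeline")
print(run.run_id)
```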
-
Lesson 06: Data Integration Pipelines for NYC Payroll Data Analytics
In this project, you'll build data pipelines for an analytics platform for the City of New York.
Project Description - Data Integration Pipelines for NYC Payroll Data Analytics
Project Rubric - Data Integration Pipelines for NYC Payroll Data Analytics
- Concept 01: Project Overview
- Concept 02: Environments
- Concept 03: Sign in to Azure and Monitor Costs
- Concept 04: Instructions: Create and Configure Resources
- Concept 05: Instructions: Create Linked Services
- Concept 06: Instructions: Create Datasets
- Concept 07: Instructions: Create Data Flows and Pipelines
- Concept 08: Instructions: Aggregate Data Flow
- Concept 09: Instructions: Connect Project to Github and Submit
-
Part 06 : Congratulations!
Congratulations on finishing your program!
-
Module 01: Congratulations!
-
Lesson 01: Congratulations!
Congratulations on your graduation from this program! Please join us in celebrating your accomplishments.
-
Part 07 (Career): Career Services
The Careers team at Udacity is here to help you move forward in your career, whether that means finding a new job, exploring a new career path, or applying new skills to your current job.
-
Module 01:
-
Lesson 01: Take 30 Min to Improve your LinkedIn
Find your next job or connect with industry peers on LinkedIn. Ensure your profile attracts relevant leads that will grow your professional network.
- Concept 01: Get Opportunities with LinkedIn
- Concept 02: Use Your Story to Stand Out
- Concept 03: Why Use an Elevator Pitch
- Concept 04: Create Your Elevator Pitch
- Concept 05: Use Your Elevator Pitch on LinkedIn
- Concept 06: Create Your Profile With SEO In Mind
- Concept 07: Profile Essentials
- Concept 08: Work Experiences & Accomplishments
- Concept 09: Build and Strengthen Your Network
- Concept 10: Reaching Out on LinkedIn
- Concept 11: Boost Your Visibility
- Concept 12: Up Next
-
Lesson 02: Optimize Your GitHub Profile
Other professionals are collaborating on GitHub and growing their networks. Submit your profile to ensure it's on par with leaders in your field.
- Concept 01: Prove Your Skills With GitHub
- Concept 02: Introduction
- Concept 03: GitHub profile important items
- Concept 04: Good GitHub repository
- Concept 05: Interview with Art - Part 1
- Concept 06: Identify fixes for example “bad” profile
- Concept 07: Quick Fixes #1
- Concept 08: Quick Fixes #2
- Concept 09: Writing READMEs with Walter
- Concept 10: Interview with Art - Part 2
- Concept 11: Commit messages best practices
- Concept 12: Reflect on your commit messages
- Concept 13: Participating in open source projects
- Concept 14: Interview with Art - Part 3
- Concept 15: Participating in open source projects 2
- Concept 16: Starring interesting repositories
- Concept 17: Next Steps
-