Course Overview
Azure Databricks is a fully managed, cloud-based data analytics platform that simplifies building enterprise-grade data and AI applications. Developed jointly by Microsoft and the team that started Apache Spark, it gives data science, engineering, and analytics teams a single platform for big data processing and machine learning. In this course, you’ll learn how to use Azure Databricks to train and deploy machine learning models.
Who Should Attend
This course is designed for aspiring data scientists and AI engineers who need to train and manage machine learning models by using Azure Databricks.
Prerequisites
This course assumes that you have experience using Python to explore data and train machine learning models with common open-source frameworks such as scikit-learn, PyTorch, and TensorFlow. Consider completing the "Create machine learning models" learning path before starting this course.
Course Content
Explore Azure Databricks
- Introduction
- Get started with Azure Databricks
- Identify Azure Databricks workloads
- Understand key concepts
- Data governance using Unity Catalog and Microsoft Purview
- Exercise - Explore Azure Databricks
- Module assessment
- Summary
Use Apache Spark in Azure Databricks
- Introduction
- Get to know Spark
- Create a Spark cluster
- Use Spark in notebooks
- Use Spark to work with data files
- Visualize data
- Exercise - Use Spark in Azure Databricks
- Module assessment
- Summary
Train a machine learning model in Azure Databricks
- Introduction
- Understand principles of machine learning
- Machine learning in Azure Databricks
- Prepare data for machine learning
- Train a machine learning model
- Evaluate a machine learning model
- Exercise - Train a machine learning model in Azure Databricks
- Module assessment
- Summary
Use MLflow in Azure Databricks
- Introduction
- Capabilities of MLflow
- Run experiments with MLflow
- Register and serve models with MLflow
- Exercise - Use MLflow in Azure Databricks
- Module assessment
- Summary
Tune hyperparameters in Azure Databricks
- Introduction
- Optimize hyperparameters with Optuna
- Review trials
- Scale hyperparameter optimization
- Exercise - Optimize hyperparameters for machine learning in Azure Databricks
- Module assessment
- Summary
Use AutoML in Azure Databricks
- Introduction
- What is AutoML?
- Use AutoML in the Azure Databricks user interface
- Use code to run an AutoML experiment
- Exercise - Use AutoML in Azure Databricks
- Module assessment
- Summary
Train deep learning models in Azure Databricks
- Introduction
- Understand deep learning concepts
- Train models with PyTorch
- Distribute PyTorch training with TorchDistributor
- Exercise - Train deep learning models on Azure Databricks
- Module assessment
- Summary
Manage machine learning in production with Azure Databricks
- Introduction
- Automate your data transformations
- Explore model development
- Explore model deployment strategies
- Explore model versioning and lifecycle management
- Exercise - Manage a machine learning model
- Module assessment
- Summary