Course Overview

Learn how to build neural network agents that reason across multiple data types using advanced fusion techniques, OCR, and NVIDIA AI Blueprints for real-world applications like robotics and healthcare.

Prerequisites

A basic understanding of deep learning concepts.
Familiarity with a deep learning framework such as TensorFlow, PyTorch, or Keras. This course uses PyTorch.

Course Objectives

In this course, you will learn about:

Different data types and how to make them neural network ready
Model fusion, and the differences between early, late, and intermediate fusion
PDF extraction using OCR
The difference between modality and agent orchestration
Customization of NVIDIA AI Blueprints with Video Search and Summarization (VSS)

Course Content

We'll begin with a robotics use case to show how different data types shape an effective neural network architecture. The mathematical concepts from the robotics use case can then be applied to Large Language Models (LLMs) to modify these powerful models so that they accept non-language data as input. We'll end with orchestration, where multiple models work together to answer user queries.

Prices & Delivery methods

Online Training

Duration
8 hours

Price

£ 420.—

Classroom Training

Duration
8 hours

Price

United Kingdom: £ 420.—

Currently there are no training dates scheduled for this course.

Price (excl. VAT)

Building AI Agents with Multimodal Models (BAAMM)
