Azure Storage for AI workloads | OD870

Saurabh Sensharma, Vishnu Charan TJ, and Saloni Sonpal walk through how Azure Storage can be used to improve performance and cost efficiency for AI inference workloads, including caching patterns, faster model distribution, and integrations across the AI stack.

Overview

The session covers how Azure Storage powers AI inference at scale.

Topics and chapters

Introduction to Azure Storage for AI workloads

Storage for AI and AI for Storage

Azure Storage integration across the AI stack and infrastructure

Azure Storage clients and tools for AI workloads

Paths to run AI workloads with storage

The presenters outline common execution environments where Azure Storage is used.

Storage requirements for agentic inference

Inference optimization through prompt caching

Explicit caching with Azure Blob and NIXL (demo)

Fast model loading and distribution

Bringing enterprise data to AI via Azure integrations

Storage Center and recap