Distributed systems to AI platforms with Mark Russinovich & Ion Stoica | BRK227
Mark Russinovich and Ion Stoica discuss how AI platforms need to evolve for agentic, multimodal, globally distributed workloads, covering infrastructure fundamentals, training and real-time serving architectures, and why open source, security, and governance are becoming core platform requirements.
Overview
Azure CTO Mark Russinovich and UC Berkeley professor Ion Stoica explore what it takes to build next-generation AI platforms as systems become more agentic, multimodal, and globally distributed.
Key themes include:
- How distributed-systems fundamentals translate to modern AI infrastructure
- The evolution of datacenters into large-scale AI supercomputing regions
- The role of serverless computing in AI workloads
- Architectural implications of agentic AI systems
- Cross-layer optimization challenges across algorithms, hardware, and system architecture
- The importance of open source infrastructure stacks
- Security, governance, and confidential computing as core platform concerns
Session metadata
- Event: Microsoft Build 2026
- Session: BRK227 (Cloud platform & data)
- Level: Intermediate
- Resource link: https://aka.ms/build26/BRK227
Chapters (from the video description)
- 00:00:00 - Introduction and session overview at the Build conference
- 00:00:36 - Speaker backgrounds and history in AI systems such as Spark, Ray, and vLLM
- 00:04:07 - Discussion on fundamentals of distributed systems applied to AI infrastructure
- 00:10:19 - Evolution of datacenters and rise of large-scale AI supercomputing regions
- 00:13:33 - Modern serverless computing and its role in AI workloads
- 00:14:25 - Emergence of agentic AI systems and architectural implications
- 00:19:31 - Optimization layers in AI infrastructure: algorithms, hardware, and architecture
- 00:22:03 - Open source AI infrastructure stack and challenges of cross-layer optimization
- 00:29:30 - Security, confidential computing, and protecting sensitive AI data
- 00:34:48 - Developer experience, code verification challenges, and the limits of future automation