Token Billing, Interoperable Agents, and Practical Cloud Controls

This week’s roundup is about turning agentic tooling into something teams can run, budget, and govern. GitHub Copilot’s shift to token-based billing and AI Credits makes cost a first-class part of rollout checklists, especially as agent-style IDE and PR workflows expand and code review begins consuming both AI Credits and GitHub Actions minutes. On the platform side, GPT-5.5 in Microsoft Foundry, Microsoft Agent Framework 1.0, and A2A/MCP interoperability point toward more standardized agent runtimes, while Azure and Fabric updates reinforce the same operational theme: tighter identity, clearer observability, and more precise controls in both connected and constrained environments.

This Week's Overview

GitHub Copilot

GitHub Copilot news this week was split between two practical realities teams have to plan for: Copilot is getting more “agentic” inside IDEs and GitHub itself, and it is about to get a lot more measurable (and therefore manageable) through token-based, usage-driven billing starting June 1, 2026. That measurement thread builds directly on last week's focus on governance catching up with autonomy (data residency, admin policies, and more explicit controls across IDE, CLI, and github.com). The difference now is that cost becomes part of the same rollout checklist as permissions and compliance.

Copilot pricing shifts to tokens, AI Credits, and tighter org controls

GitHub confirmed a billing shift that will change how many teams think about Copilot day-to-day: on June 1, 2026, Copilot moves to usage-based billing built around token consumption. Premium Request Units (PRUs) go away in favor of GitHub AI Credits, which are consumed based on how many tokens your Copilot interactions use. For organizations, the key operational change is visibility and control: GitHub is adding better billing reporting, plus budget controls so admins can set limits and track where AI Credits are going before surprise bills show up at month end.

This pricing model also pulls Copilot closer to how LLM costs work everywhere else: longer chats, bigger context windows, and more “agent” activity generally translate into more tokens consumed. That lands right on top of last week's shift toward more capable agents in more places (PR buttons and @copilot mentions on github.com, remote-controlled CLI sessions, and IDE workflows that encourage larger task scopes). If you have internal guidance like “use Copilot Chat to paste logs” or “let the agent refactor whole modules,” this is the week to start pairing that guidance with budget policy, reporting expectations, and a shared understanding of what drives token use. A companion explainer video walks through what token-based billing means in practice and how to reason about cost when Copilot usage varies by developer, repo, and workflow.
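
If you want rough numbers behind that shared understanding, a quick back-of-envelope model is enough to show which habits dominate spend. A minimal sketch in Python, with entirely made-up rates and volumes (the actual token-to-credit conversion is not covered in this roundup):

```python
# Back-of-envelope token estimator. Every number below is a placeholder:
# GitHub's real token-to-credit rates are not covered here, so treat this
# purely as a way to reason about what drives spend.

def monthly_tokens(devs: int, chats_per_day: int, context_tokens: int,
                   output_tokens: int, workdays: int = 21) -> int:
    """Rough monthly token volume for one usage profile."""
    return devs * chats_per_day * (context_tokens + output_tokens) * workdays

# "Paste logs into chat" guidance can easily 10x the context per request:
lean = monthly_tokens(devs=50, chats_per_day=12,
                      context_tokens=2_000, output_tokens=800)
log_heavy = monthly_tokens(devs=50, chats_per_day=12,
                           context_tokens=20_000, output_tokens=800)
print(f"lean: {lean:,} tokens/month")
print(f"log-heavy: {log_heavy:,} tokens/month ({log_heavy / lean:.1f}x)")
```

Even with invented numbers, the shape of the result is the point: context size, not chat count, is usually the lever that moves the bill.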

Copilot code review will cost AI Credits and GitHub Actions minutes (private repos)

Alongside the broad billing switch, GitHub called out a very specific new meter to watch: starting June 1, 2026, GitHub Copilot code review will be billed via AI Credits and will also consume GitHub Actions minutes when you run reviews on private repositories. That matters because many teams already budget Actions minutes tightly, and Copilot code review adds a new kind of Actions workload that can run more often than CI if developers trigger it repeatedly during review cycles.

This is a clean continuation of last week's “agent work stays inside the normal audit trail” theme. Copilot review and PR-focused agents shorten the loop from findings to fixes, but they do it by running more automated work around PRs, which now shows up both in spend (AI Credits) and in pipeline capacity (Actions minutes). The guidance focuses on the knobs teams can actually turn: monitor usage, set budgets, and choose the right runner strategy. GitHub-hosted runners are convenient but draw down your Actions minutes, while self-hosted runners can shift compute cost and capacity planning back to you (and may be the better fit if you expect heavy agent-style review usage). If Copilot review becomes a standard step in your PR workflow, you will want to treat it like any other CI job: decide when it runs, who can trigger it, and what “good enough” looks like so it does not get rerun endlessly.

Visual Studio 2026 brings cloud agent sessions, custom agents, and a debugger agent workflow

On the tooling side, Visual Studio 2026 got an April update that pushes Copilot further into agent workflows directly inside the IDE. The headline is cloud agent integration: you can launch GitHub Copilot cloud agent sessions from Visual Studio, which matters for developers who want to kick off larger tasks (issue-driven fixes, refactors, multi-file changes) without leaving their editor. This connects directly to last week's push to move agent work onto github.com (merge conflict fixes, PR @copilot workflows) and to treat the cloud agent as a first-class runtime, not a side feature. Visual Studio joining that path is a step toward a more continuous “issue or PR → agent work → review → merge” loop across surfaces.

This update also expands custom agents by allowing user-level definitions, making it easier for individuals (and eventually teams) to standardize agent behavior and reuse agent “skills” without requiring machine-wide setup. That lines up with last week's “skills ecosystem grows up” story (Custom Skills, gh skill, and governed catalogs): more of Copilot's customization is becoming portable and repeatable, even if the IDE feature and the repo-resident skill format are still evolving in parallel.

C++ developers get more attention here too, with C++ agent tools moving to GA and improvements aimed at navigation and editing. There is also a Debugger Agent workflow, designed to make Copilot more useful when the work is not “write new code” but “figure out why this is failing.” The related GitHub changelog entry adds more detail on how the debugger agent flow can start from GitHub or Azure DevOps issues, then carry context into the IDE, plus quality-of-life improvements like chat history, shortcut customization, completion ergonomics, and auto-decoding in the Text Visualizer. Put together, the trend is clear: Copilot is being treated less like a chat box and more like a set of task-oriented tools embedded in the daily debugging and issue-to-fix loop, echoing last week's emphasis on agent workflows that still respect policy and review boundaries.

Model availability changes: upcoming GPT-5.2 deprecation and Student picker adjustments

Admins and educators got a reminder that “which model are we using” is now an operational concern, not trivia. GitHub announced that GPT-5.2 and GPT-5.2-Codex will be deprecated across Copilot experiences on June 1, 2026, with GPT-5.5 and GPT-5.3-Codex positioned as replacements. For Copilot Enterprise, this can require actual admin action: model policies may need updates so the replacement models appear as selectable options in Copilot Chat on github.com and in VS Code. If your org standardizes on a specific model for consistency, testing, or compliance, you should validate those policies before June.

This is the same management lesson as last week, just from the opposite direction: last week's Opus 4.7 rollout (and replacement of older Opus pickers) showed how quickly model menus change even when capability is “GA.” Now deprecations make that churn explicit, and the upcoming token billing makes it more consequential because model choice and context size can translate directly into spend. It also pairs with last week's data residency point that some models are unavailable in certain regions, so “standardize on model X” may need a fallback story per tenant and geography.

Separately, GitHub adjusted the Copilot Student model picker by removing GPT-5.3-Codex as a manual selection option (it remains available via auto model selection). GitHub framed this as a temporary reliability and performance measure ahead of the broader usage-based billing transition, which is a good signal that model availability may continue to shift as GitHub tunes capacity and cost controls. The practical takeaway is to avoid teaching workflows or internal docs that depend on students (or any users) always seeing a specific model in a picker, which matches last week's advice to document intent (speed/cost/reasoning) rather than pinning to exact versions.

Copilot agents and protocols: faster cloud startup and ACP for CLI-driven workflows

GitHub continued tightening the mechanics behind agent workflows, starting with a performance win: Copilot cloud agent now starts more than 20% faster by using optimized runner environments built with GitHub Actions custom images. In practice, that reduces the “waiting for environment” overhead when an agent begins work from an issue, a PR, or the Agents tab, which is the kind of latency that determines whether teams keep using agent workflows or abandon them after the novelty wears off. It also follows naturally from last week's expansion of cloud-agent work inside github.com (merge conflict fixes and PR maintenance via @copilot): as more workflows depend on the cloud agent, startup time becomes a reliability feature, not a nice-to-have.

On the extensibility side, a community deep dive explained how GitHub Copilot CLI can run as an Agent Client Protocol (ACP) server. ACP is presented as a standard way for IDEs, CI/CD pipelines, and custom tools to connect to an agent over a streaming interface (NDJSON over stdin/stdout), with session handling and permissions as first-class concerns. This is a close cousin to last week's theme of turning Copilot into a configurable runtime with portable integrations (remote-controlled CLI sessions, MCP tooling, skills and plugin catalogs). For teams experimenting with “Copilot in CI” or building internal developer tools, the interesting part is the protocol shape: it suggests a path to swapping clients in and out while keeping the agent integration consistent, rather than building one-off glue for every environment.
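
To make that protocol shape concrete, here is a minimal sketch of what an NDJSON-over-stdio client could look like in Python. The launch flag and message fields are illustrative assumptions, not the actual ACP schema; consult the protocol documentation for the real method names:

```python
import json
import subprocess

# Minimal NDJSON-over-stdio client. The "--acp" flag and the message
# fields below are illustrative, not the real ACP schema.

proc = subprocess.Popen(
    ["copilot", "--acp"],          # hypothetical: start the CLI as an ACP server
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

def send(msg: dict) -> None:
    proc.stdin.write(json.dumps(msg) + "\n")  # one JSON object per line
    proc.stdin.flush()

send({"type": "session/new"})
send({"type": "prompt", "text": "Summarize the failing tests in this repo"})

for line in proc.stdout:           # each stdout line is a standalone JSON event
    event = json.loads(line)
    print(event)
    if event.get("type") == "done":  # illustrative terminal event
        break
```

The appeal is exactly what the deep dive describes: the client above could be an IDE, a CI step, or an internal tool, and the agent side stays the same.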

Other GitHub Copilot News

Copilot's role in education got another careful look, focusing less on whether students “should” use AI assistants and more on how educators can design assignments and guidelines that reduce over-reliance and integrity problems while still letting Copilot lower friction for experimentation, debugging, and iteration. This fits with last week's training and adoption coverage that focused on safe, repeatable usage (instructions, skills, and workflow structure) rather than only “better prompts,” and it becomes more relevant as token billing makes usage patterns (and incentives) more visible.

Artificial Intelligence

This week in AI was mostly about turning “agents” into something you can actually run in production: newer frontier models landed in Azure AI Foundry, agent runtimes and protocols firmed up around interoperability, and the surrounding tooling (retrieval, UI streaming, observability, and even image workflows) filled in the gaps teams hit when they move from demos to governed, monitored systems.

Azure AI Foundry and Microsoft Foundry: GPT-5.5 lands, and the platform story gets clearer

Microsoft put GPT-5.5 into general availability inside Microsoft Foundry, positioning it as a more enterprise-ready way to consume frontier OpenAI models with the controls teams expect. The update calls out practical model-level gains like stronger agentic coding behavior, improved long-context reasoning, and better token efficiency, which matters when you are building multi-step agent flows that keep state, retrieve sources, and iterate. The more important shift is the surrounding platform: Foundry Agent Service is framed as the operational layer for hosted agents, with isolation boundaries, Microsoft Entra identity integration, and governance so teams can manage access and reduce the “who can make the agent do what” risk that shows up quickly in enterprise deployments.

On the developer experience side, Foundry Toolkit added an in-editor image generation loop in VS Code. The flow is designed to be end-to-end: browse the model catalog, deploy GPT-Image-2 into an Azure AI Foundry project, generate images in an Image Playground, then export code snippets you can drop into an app. It is a small update, but it reflects a pattern Foundry is pushing this month: keep model selection, deployment, experimentation, and integration in one place so teams do not treat “prompting” and “shipping” as two separate projects.

Microsoft Agent Framework 1.0 and A2A v1: interoperability becomes a first-class feature

Agent work this week was less about “how to build an agent” and more about “how agents talk to other agents and tools without custom glue.” Microsoft Agent Framework 1.0 reached GA with a clearer split between agent architecture and traditional workflow orchestration, plus a set of runtime components meant to standardize how you host, route, and invoke tools across deployments. Two interoperability pieces stand out. First is A2A (agent-to-agent), which is positioned as a common way for agents to communicate across platforms. Second is MCP (Model Context Protocol), which is used as the tool invocation and connectivity layer so an agent can call capabilities without every team inventing its own tool schema.

For .NET teams, A2A Protocol v1.0 support landed in Microsoft Agent Framework for .NET with updated client and hosting packages and some concrete hosting details that matter in production. Discovery is handled via well-known agent cards (so clients can find capabilities without hardcoding), and streaming is supported over Server-Sent Events (SSE), which is a practical fit for incremental agent output and UI updates. The post also calls out migration notes from v0.3, which helps teams who started experimenting earlier move to the now-stable protocol and hosting APIs in ASP.NET Core.
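
A rough client-side sketch of those two hosting details, in Python. The agent card path and JSON fields vary by A2A version, and the streaming route here is hypothetical:

```python
import json
import requests

# Client-side sketch of discovery + streaming. The card path and routes
# are assumptions (they vary by A2A version); fields are illustrative.

base = "https://agents.example.com"

# Discovery: read the well-known agent card instead of hardcoding capabilities.
card = requests.get(f"{base}/.well-known/agent-card.json", timeout=10).json()
print(card.get("name"), card.get("capabilities"))

# Streaming: incremental agent output arrives as Server-Sent Events.
with requests.get(f"{base}/a2a/stream", stream=True, timeout=60) as resp:
    for raw in resp.iter_lines(decode_unicode=True):
        if raw and raw.startswith("data:"):
            print(json.loads(raw[len("data:"):].strip()))
```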

Foundry IQ and agentic retrieval on Azure AI Search: building knowledge copilots with fewer shortcuts

Enterprise RAG (retrieval-augmented generation) keeps running into the same hard problems: permissions, citations, query quality, and latency once you add multi-step reasoning. Two posts this month focused on Foundry IQ as the knowledge layer for that problem, backed by Azure AI Search. One walkthrough shows how to build an “enterprise knowledge copilot” using Foundry IQ knowledge bases with an agentic retrieval loop (plan, search, rank, reflect, synthesize). The point is not that agents can search, but that retrieval becomes iterative and self-correcting, which is often what it takes to get stable answers across messy internal content. It also leans on MCP-based connectivity so the agent can connect to tools and retrieval services in a standard way, while being candid about the real trade-offs teams need to measure (preview maturity, cost, latency, and security constraints).

A companion guide tries to reduce confusion about what Foundry IQ is and is not, when it fits, and how to use it without going through a full “copilot” layer. It highlights querying Foundry IQ knowledge bases directly with the Azure AI Search Python SDK (azure-search-documents), including returning citations and applying permission trimming so the retrieval layer respects user access (often enforced through Microsoft Entra ID patterns in enterprise environments). Together, the posts land on a pragmatic message: treat the knowledge base as a governed, queryable asset first, then decide how much agentic orchestration you actually need.
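
As a sketch of the “governed, queryable asset” idea, here is what a direct query with azure-search-documents can look like. The index and field names (content, source_url, group_ids) are illustrative assumptions; Foundry IQ knowledge bases define their own schema and preview APIs:

```python
from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchClient

# Direct retrieval-layer query. Index and field names are illustrative;
# the knowledge-base preview APIs may differ from plain index search.

client = SearchClient(
    endpoint="https://<search-service>.search.windows.net",
    index_name="knowledge-base-index",
    credential=DefaultAzureCredential(),   # Entra ID rather than API keys
)

user_groups = "hr-readers,eng-all"  # resolved from the caller's Entra identity

results = client.search(
    search_text="What is our parental leave policy?",
    # Permission trimming: only return chunks tagged with the caller's groups.
    filter=f"group_ids/any(g: search.in(g, '{user_groups}'))",
    select=["content", "source_url"],
    top=5,
)

for doc in results:
    print(doc["source_url"], "->", doc["content"][:80])  # source_url = citation
```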

Agent-driven UI and operations: streaming interfaces and real observability for agent workloads

As agents become long-running and multi-step, the UI and ops story matters as much as the prompt. AG-UI was introduced as a protocol for how agents communicate with frontends using streaming events, declarative UI proposals, and shared state updates, with explicit support for human-in-the-loop interrupts so a user can approve, correct, or stop actions mid-flight. The design lines up with the same streaming patterns showing up elsewhere (notably SSE), and the security notes point teams back to common Azure controls like Azure AD (Microsoft Entra) for securing endpoints and Key Vault for handling secrets.

On the operations side, another post focused on observability for agent workloads in Azure AI Foundry, specifically what it takes to get traces you can actually use in incident response and performance work. It compares tracing depth and setup effort across Microsoft Agent Framework, Semantic Kernel, LangChain/LangGraph, and the OpenAI Agent SDK, using OpenTelemetry pipelines into Azure Monitor and Application Insights. The practical takeaway is that “agent observability” is not a single feature toggle; it is a combination of consistent spans/attributes across tool calls and model invocations, plus an instrumentation story that does not collapse once you add multiple agents, external tools, and streaming responses.
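
The baseline those comparisons assume looks roughly like this in Python: one OpenTelemetry pipeline into Application Insights, with consistent span names and attributes across model and tool calls. The attribute names here are illustrative conventions, not a fixed schema:

```python
from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry import trace

# One OTel pipeline into Application Insights; attribute names below are
# conventions chosen for illustration, not a standard.

configure_azure_monitor()  # reads APPLICATIONINSIGHTS_CONNECTION_STRING
tracer = trace.get_tracer("agent-app")

with tracer.start_as_current_span("agent.run") as run:
    run.set_attribute("agent.name", "support-triage")

    with tracer.start_as_current_span("model.invoke") as llm:
        llm.set_attribute("llm.model", "gpt-5.5")
        llm.set_attribute("llm.tokens.prompt", 1432)     # taken from the response
        llm.set_attribute("llm.tokens.completion", 210)

    with tracer.start_as_current_span("tool.call") as tool:
        tool.set_attribute("tool.name", "search_tickets")
# Consistent span/attribute naming is what keeps multi-agent traces usable.
```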

Other Artificial Intelligence News

Microsoft highlighted a real multi-agent workflow in science with Microsoft Discovery, describing an automated computational chemistry screening pipeline that runs Gaussian 16 DFT simulations in parallel (using an MPI-style master-worker pattern), then feeds results into ML-based redox potential prediction with structured JSON outputs so downstream systems can consume the results reliably.

A .NET-focused tutorial showed how to extend an agent built with the .NET Agent Framework by adding a class-based skill, using a locally hosted LLM through Ollama, which is a useful pattern when you want the “skills” abstraction without requiring a hosted model for every experiment.

Machine Learning

This week, the Machine Learning story was mostly about getting data into shape for ML and analytics at scale: Microsoft Fabric leaned further into OneLake as the common data layer, tightened up real-time streaming so features and signals can arrive with fewer surprises, and nudged SQL developers toward a more modern, Git-friendly workflow in VS Code. Alongside those platform updates, Microsoft also shared an early look at how unconventional hardware (and its digital twins) might run real lending models in the future.

Microsoft Fabric and OneLake: broader access to governed data and metadata

Fabric expanded the practical ways teams can discover and reuse data without copying it around. A new preview feature, “Mirrored Dremio catalog”, mirrors Dremio-managed Apache Iceberg catalog metadata into OneLake using shortcuts, which keeps access effectively zero-copy while still making those tables show up across Fabric workloads. The key idea is that Fabric can “see” the Iceberg tables through the catalog mirror rather than forcing another ingestion path, which is useful if Dremio already owns table layout, optimization, and governance.

On the discovery side, Fabric also introduced a preview OneLake Catalog Search REST API for finding items across workspaces by metadata, with the same capability wired into the Fabric core MCP server and exposed in Fabric CLI as fab find. For teams trying to scale ML across multiple domains and workspaces, the value is less time spent hunting for the right lakehouse, warehouse, or semantic model, plus a consistent way to script discovery into tooling (including agentic workflows) using API calls and CLI filtering (including JMESPath).
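
A hedged sketch of what scripting that discovery could look like; the REST route and payload below are assumptions based on the announcement's description, so verify them against the API reference before depending on them:

```python
import jmespath
import requests

# The route and payload are assumed from the announcement's description
# of the preview API -- check the REST reference for the real shape.

token = "<entra-access-token>"  # scoped for the Fabric API

resp = requests.post(
    "https://api.fabric.microsoft.com/v1/onelake/catalog/search",  # assumed route
    headers={"Authorization": f"Bearer {token}"},
    json={"query": "customer churn", "types": ["Lakehouse", "SemanticModel"]},
    timeout=30,
)
items = resp.json().get("value", [])

# The CLI analogue is `fab find`; both pair naturally with JMESPath filtering.
hits = jmespath.search(
    "[?type=='Lakehouse'].{name: displayName, workspace: workspaceId}", items)
print(hits)
```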

Real-time pipelines in Fabric: SQL-based streaming plus better observability

Fabric's real-time tooling got a clearer “build it, test it, run it, monitor it” arc this week. The Eventstreams SQL operator reached general availability, positioning SQL as a first-class way to express streaming transforms while adding production-ready capabilities like multi-destination fan-out, built-in testing, and event-time processing. Event-time processing matters when late or out-of-order events are normal (common in IoT, clickstreams, and operational logs) because it lets you reason about “when it happened” instead of “when it arrived”, which can stabilize aggregations and windowed calculations that downstream ML features depend on.
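
A toy Python illustration of why event time matters: with out-of-order arrivals, windowing on when the event happened keeps the aggregate stable, whereas arrival-time windows would shift values between buckets:

```python
from collections import defaultdict

# Events arrive out of order (a common IoT reality); windowing on
# event_time still assigns each reading to the minute it happened.

events = [
    {"device": "a", "event_time": 60,  "value": 10},  # happened at minute 1
    {"device": "a", "event_time": 125, "value": 30},  # happened at minute 2
    {"device": "a", "event_time": 70,  "value": 20},  # late arrival, minute 1
]

WINDOW = 60  # 1-minute tumbling windows
totals = defaultdict(int)
for e in events:
    window_start = (e["event_time"] // WINDOW) * WINDOW
    totals[window_start] += e["value"]

print(dict(totals))  # {60: 30, 120: 30} -- the late event lands in the right window
```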

At the same time, Eventstreams gained workspace monitoring in preview. Enabling it provisions a managed monitoring Eventhouse and populates KQL tables that track node status, per-minute throughput, and per-minute error metrics. The guidance to republish existing Eventstreams to pick up monitoring is a practical detail for teams already running production pipelines: instrumentation is becoming part of the product surface, not a separate DIY logging project.

Fabric April 2026 update: MLflow logging, notebooks, and warehouse features that affect ML workflows

The April 2026 Fabric feature summary tied together several changes that land directly in day-to-day ML and analytics work. Fabric added VS Code-based workspace and environment management, which fits the broader theme of moving operational tasks closer to developer tooling. Notebooks picked up retry policies, a small-sounding change that can make scheduled training and feature engineering runs more resilient when transient failures happen.

On the ML lifecycle side, Fabric now supports MLflow cross-workspace logging (including OAP workspaces), which is useful when teams separate experimentation, shared model registries, and production workspaces for governance. Semantic Link (SemPy) advanced to 0.14.0 with admin APIs, which matters for teams automating semantic model management and connecting Power BI semantics to Python-driven analysis and ML feature work.
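
The logging pattern that cross-workspace support enables is standard MLflow; the sketch below uses a placeholder tracking target, since Fabric notebooks preconfigure the tracking store and cross-workspace addressing follows the feature's own documentation:

```python
import mlflow

# Standard MLflow calls; the tracking target is a placeholder.

mlflow.set_tracking_uri("<shared-workspace-tracking-target>")  # placeholder
mlflow.set_experiment("churn-model")

with mlflow.start_run(run_name="weekly-retrain"):
    mlflow.log_param("algo", "lightgbm")
    mlflow.log_metric("auc", 0.913)
# Runs executed in a dev workspace land in the shared experiment above,
# which is the governance split the feature is aimed at.
```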

Data Warehouse improvements like transactional ALTER TABLE and COPY INTO support for JSONL also feed into ML pipelines, especially when teams stage semi-structured data and want predictable schema evolution and repeatable loads. Real-Time Intelligence updates (including Eventstream observability and an Eventhouse remote MCP) reinforce the push to make streaming systems easier to operate and easier to connect into automated workflows.

SQL development for Fabric: Azure Data Studio retirement and the move to VS Code

Fabric SQL developers got a clear direction: Azure Data Studio is retired, and the recommended path is VS Code with SQL Database Projects and the MSSQL extension. The emphasis is on adopting software-engineering workflows for database changes: Git-based source control, pull request reviews, schema compare, and publish script previews so teams can see what a deployment will do before it runs. For ML teams that manage feature-store-like tables or training data schemas in Fabric warehouses, this shift reduces “drift by manual edits” and makes schema changes auditable and reviewable.

The VS Code MSSQL extension's support for GitHub Copilot is positioned as a productivity boost inside the editor, and Microsoft also called out an ADS migration toolkit to help teams move existing setups rather than starting from scratch.

Other Machine Learning News

Fabric pipelines continued to shift from classic ETL toward broader workflow orchestration, with a preview Approval activity that enables human-in-the-loop steps (useful for governance gates like model sign-off, data access approval, or controlled production promotion), plus more focus on observability for long-running workflows.

A longer-horizon case study looked at a real fintech lending decisioning workload (weighted ensembles with explainability and auditability requirements) evaluated using Microsoft Analog Optical Computer (AOC) digital twins on Azure, offering an early signal of how alternative compute approaches might be tested against regulated ML scenarios before hardware is broadly available.

DevOps

This week in DevOps was about making the delivery pipeline more reliable end-to-end: GitHub shared what it is changing after recent availability incidents, while Microsoft and the community published practical guidance for scaling CI runners, modernizing infrastructure as code (IaC), and tightening up the tooling and documentation that keeps teams shipping.

GitHub platform reliability and operating at larger scale

GitHub detailed what it learned from two recent incidents and how it is reshaping the platform to reduce blast radius when things go wrong. A big theme was scale: GitHub plans to grow capacity by 10x to 30x, which forces hard decisions about where state lives, how services fail, and how quickly teams can recover. To keep outages from cascading, GitHub is focusing on stronger service isolation, explicitly calling out separation for Git itself and for GitHub Actions, so failures in one area do not automatically degrade another. It also described infrastructure work that includes continued migration onto Azure paired with a longer-term move toward multi-cloud, aiming to diversify dependencies and improve resilience options during incidents.

On the transparency side, GitHub said it will improve how it communicates on the GitHub status page, with clearer, more timely updates so engineering teams can make faster decisions during disruptions (for example, whether to pause deploys, reroute builds, or switch to contingency workflows). That is a direct continuation of last week's focus on status interpretation and incident vocabulary (including “Degraded Performance” and per-service reporting), but here the emphasis shifts from how to read the status page to what GitHub is changing behind it so delivery systems are less likely to need fallbacks in the first place.

GitHub Actions: elastic, self-hosted runners on Azure Container Apps with KEDA

For teams that need self-hosted GitHub Actions runners (custom toolchains, network access, or compliance constraints) without paying the always-on cost, a new walkthrough showed how to run ephemeral runners on Azure Container Apps Jobs and scale them from zero using KEDA. The guide uses KEDA's GitHub runner scaler to watch queued workflows and spin up runner jobs only when demand exists, then scale back down when the queue clears, which can help reduce idle capacity while keeping throughput during bursts (like PR rushes or nightly builds).

It walks through building and publishing a runner container image to Azure Container Registry (ACR), configuring the Container Apps Job with the right scaling rules, and securing the setup with Azure Key Vault plus Managed Identity so runner registration tokens and other secrets are not hardcoded into pipeline configs. If you have been managing VM-based runner fleets or static Kubernetes runners, this pattern is a useful reference for a lighter-weight, event-driven runner pool that still stays inside your Azure boundary. It also pairs naturally with last week's GitHub-side reliability and outage-readiness thread: if your contingency plan includes shifting critical builds to self-hosted capacity during GitHub-hosted runner disruption, an on-demand runner pool is a practical way to keep that option viable without running a large idle fleet.
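
For orientation, the heart of the setup is a single event-driven scale rule on the Container Apps Job. Rendered here as a Python dict for consistency with the other sketches in this roundup (in practice it lives in your Bicep/ARM definition or CLI flags); the key names follow KEDA's github-runner scaler and should be verified against current docs:

```python
# Key names follow KEDA's github-runner scaler -- verify before use.

scale_rule = {
    "name": "github-runner-scaler",
    "type": "github-runner",
    "metadata": {
        "owner": "my-org",
        "repos": "my-repo",
        "runnerScope": "repo",
        "targetWorkflowQueueLength": "1",  # one runner job per queued workflow
    },
    # Credentials come from a secret reference, not inline config -- the
    # walkthrough pairs this with Key Vault and Managed Identity.
    "auth": [{"secretRef": "personal-access-token",
              "triggerParameter": "personalAccessToken"}],
}
```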

Configuration management and IaC modernization: DSC v3.2.0 and AVM refactoring workflows

Microsoft Desired State Configuration (DSC) v3.2.0 reached GA with changes that target repeatability and safer rollouts. The release adds new built-in Windows resources, expanded WhatIf support so you can preview changes before applying them, and version pinning to reduce drift across environments. It also includes expression language enhancements and improvements to adapters/extensions (including PowerShell adapters), which matters if you are integrating DSC into heterogeneous automation stacks. One of the more forward-looking additions is experimental Bicep orchestration over gRPC, which hints at DSC being used as a coordination layer that can talk to IaC tools more directly instead of only running as local scripts.

In parallel, a separate guide tackled the common brownfield problem: refactoring legacy Terraform toward Azure Verified Modules (AVM) without creating a high-risk, big-bang rewrite. The proposed workflow uses AI to speed up audits, scaffold module migrations, and summarize Terraform plan diffs, but keeps humans responsible for validation and uses Terraform plan review plus policy gates as the safety rails. The emphasis is on making modernization repeatable: generate candidate changes, compare plan output, enforce checks (Azure Policy, tools like Checkov), and only then merge, which is the kind of process that scales across teams and repos when you are gradually standardizing on AVM. This connects cleanly to last week's Azure SRE Agent drift walkthrough: both treat “desired vs actual” as an operational loop, where previewing change (plans/WhatIf), correlating context, and leaving an auditable trail matter more than raw automation speed.
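
The “compare plan output” step is easy to script. A small sketch that summarizes a terraform show -json dump so reviewers see the blast radius of an AVM refactor at a glance; the JSON fields used (resource_changes, change.actions) are part of Terraform's documented plan format:

```python
import json
import sys
from collections import Counter

# Summarize `terraform show -json tfplan` output: counts per action set,
# plus an explicit list of deletes/replaces, which deserve the closest
# human review during module migrations.

with open(sys.argv[1]) as f:
    plan = json.load(f)

summary: Counter = Counter()
needs_review = []
for rc in plan.get("resource_changes", []):
    actions = tuple(rc["change"]["actions"])
    summary[actions] += 1
    if "delete" in actions:        # covers replaces: ("delete", "create") etc.
        needs_review.append(rc["address"])

for actions, count in summary.items():
    print(f"{'+'.join(actions):<20} {count}")
print("\nneeds human review (delete/replace):")
print("\n".join(needs_review) or "  none")
```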

Maintainers, docs, and AI-in-the-loop workflows: making changes reviewable

As more contributions come from AI-assisted workflows (and sometimes from agents), GitHub used the kickoff of Maintainer Month 2026 to focus on what maintainers need to keep reviews efficient and safe. The talk highlighted repository-embedded instructions as a way to guide AI-generated contributions toward project conventions, and it called out a new reality for reviewers: when authorship is uncertain (human, copilot, or agent), reviewers often change how they assess risk, request tests, or demand explanation in the PR description. The practical takeaway is that maintainers may need to be more explicit about contribution expectations and automation requirements so reviews do not degrade as contribution volume increases. That continues last week's governance and review thread (Stacked PRs for smaller diffs, ruleset insights for enforcement and bypass visibility): as PR volume increases, projects are investing both in workflow mechanics and in clearer “how to contribute” signals so reviewers can keep throughput without lowering standards.

That theme connects directly to the “small things” that make collaboration work: Markdown. GitHub published beginner-friendly guidance on Markdown basics for READMEs, issues, PRs, and comments, and it also reiterated why Markdown is still worth learning for day-to-day engineering communication. If you are trying to improve review quality in an AI-assisted world, better structured PR descriptions, clear checklists, and readable docs are still some of the cheapest wins. It is the same practical documentation angle we touched last week with GitHub Pages onboarding and architecture-as-code in PRs: more of the delivery surface (docs, diagrams, ADRs) now lives inside repo workflows, so the baseline quality of Markdown and review context matters.

Other DevOps News

VS Code 1.119 (Insiders) continued to flesh out chat and agent workflows in ways that show up in daily DevOps tasks, like getting better context from codebases stored in virtual file systems, attaching browser tabs as context, and adding a permission flow so agents do not silently read from open tabs. It also adds Copilot CLI plan mode support, which is relevant if you use CLI-driven automation and want AI help that fits into a “plan then execute” workflow instead of directly changing state. This builds on last week's focus on agent containment and controls (worktree/Git isolation, persisted permission modes, and tighter terminal execution): the direction remains consistent, with more explicit permissions and more predictable agent behavior as these tools get used for real operational tasks.

Production pipelines that generate or transform documentation got a useful case study from Co-op Translator, which hardened its AI translation flow after community bug reports. The fixes focus on keeping Markdown structure intact (code fences, list-aware chunking), normalizing internal anchors, and handling CJK emphasis correctly, shipping in v0.18.1. If your CI publishes docs across languages, the lesson is clear: treat Markdown fidelity as a first-class quality signal and build structure-aware guards instead of relying on post-hoc manual edits. It is also a practical follow-on to last week's “docs as deliverables” theme (Pages + architecture-as-code): once docs are in CI, formatting regressions become production incidents in their own right.

KubeCon EU 2026 hallway themes showed where platform engineering attention is going: Gateway API migrations, confidential computing (including Confidential Containers), container signing (Notary Project), observability tooling, and interest in deeper Azure/AKS alignment and better support for AI workloads. If your roadmap includes tightening supply chain security or modernizing ingress, this recap is a good snapshot of what teams are actively asking vendors and maintainers to prioritize. The supply chain angle lands especially well next to last week's deployment-safety write-up (eBPF-based controls to avoid circular dependencies during outages): reliability work is increasingly tied to proving what runs (signing/attestation) and constraining how it behaves under failure.

Azure

Building on last week's “day-two readiness” thread (standard workflows, controlled transitions, and evidence-based troubleshooting), Azure’s story this week was about tightening control as Azure expands into more constrained environments. On one end, Azure Local and landing zone guidance leaned into disconnected and sovereign operations, while core platform services like Blob Storage, Azure Monitor, and AKS picked up practical updates that help teams scale securely, observe more precisely, and ship faster.

Azure Local and landing zones for sovereign and disconnected environments

Azure Local took a clear step toward larger, more flexible sovereign deployments, with Azure Local 2604 reaching GA as the first feature update of CY 2026. The headline is disaggregated deployments: instead of tightly coupling compute and storage, you can now attach SAN storage, including Fibre Channel support, which matters for customers standardizing on established storage stacks and needing to scale or refresh compute and storage independently. Microsoft also pushed hard on identity for regulated and disconnected scenarios by introducing GA Local Identity backed by Azure Key Vault, so Azure Local can be provisioned without a dependency on Microsoft Active Directory. That combination (SAN-based disaggregation plus Key Vault-backed local identity) is directly aimed at sites that cannot depend on continuous connectivity or centralized directory services, and it mirrors last week's recurring goal: remove brittle dependencies (directory, secrets, manual runbooks) that tend to surface during incidents and cutovers.

In parallel, governance guidance caught up. Azure Landing Zones (ALZ) added a new “Local” management group, positioning it as a clean place to organize Azure Local resources and to support disconnected-operations exit planning (called out as Azure Local disconnected operations, ALDO). For teams using Sovereign Landing Zone (SLZ), built-in policy initiatives were refreshed and mapped to L1/L2/L3 control tiers, with emphasis on residency and encryption requirements (including Customer-Managed Keys (CMK) and Confidential Computing-related controls). Put together, the platform changes and the governance updates form a more coherent path: deploy Azure Local at sovereign scale, then apply management group structure and policy guardrails that reflect how regulated environments actually operate, extending last week's “policy remediation over tickets” theme into the sovereign footprint.

Microsoft’s broader sovereign private cloud announcement rounded out the picture by highlighting that Azure Local deployments can now scale to thousands of servers within a single sovereign environment. The message here is less about a single feature and more about the operational envelope: mission-critical resiliency, disconnected operations, and the ability to run GPU-backed AI inference and analytics on customer-controlled infrastructure while still using Azure-style management and RBAC patterns. For architects, the practical takeaway is that Azure Local is being positioned not just for edge clusters, but for very large, isolated regional footprints with modern workload requirements, which sets context for this week's AKS AI guidance as “production patterns, but under stricter constraints.”

Azure Blob Storage security and access: SFTP host keys and prefix-scoped SAS

Blob Storage had two updates that both land squarely in the day-to-day reality of secure access. First, the SFTP endpoint host key change means teams that pin SSH trusted host keys need to update clients (or automation) to avoid sudden connection failures. The guidance focused on responding systematically: update known_hosts/trusted host key stores, then use Azure Resource Graph to discover which storage accounts have SFTP enabled, and Log Analytics queries (KQL) to identify SSH key-based clients so you can prioritize which integrations will break first. That inventory-and-evidence approach lines up with last week's incident-response framing (collect signals first, then change safely), and it matches the broader theme that “identity and access wiring” is often what turns a routine platform change into an outage.
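
A sketch of the inventory step using the azure-mgmt-resourcegraph package; isSftpEnabled is the storage account property the query keys on, and the result shape can vary by package version:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resourcegraph import ResourceGraphClient
from azure.mgmt.resourcegraph.models import QueryRequest

# Inventory sketch: which storage accounts even have SFTP enabled?

client = ResourceGraphClient(DefaultAzureCredential())

result = client.resources(QueryRequest(
    subscriptions=["<subscription-id>"],
    query="""
        resources
        | where type =~ 'microsoft.storage/storageaccounts'
        | where properties.isSftpEnabled == true
        | project name, resourceGroup, subscriptionId
    """,
))
for row in result.data:   # result shape can vary by package version
    print(row)
```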

Second, Azure Blob Storage made prefix-scoped access for User Delegation SAS generally available in all regions. Instead of issuing a SAS that covers an entire container, you can scope the token to a virtual directory (prefix) within that container, which is a straightforward least-privilege win for multi-tenant layouts and “one container, many teams” patterns. This echoes last week's direction toward tighter, auditable scopes (managed identities per connector, wildcard roles for constrained patterns, and policy-driven governance) by giving storage teams a practical middle ground between “one container per tenant” sprawl and overly broad tokens. The announcement reinforced the recommended access model (Microsoft Entra ID plus RBAC/ABAC) and showed how to express the directory scope through REST and .NET SDK parameters (including fields like sr=d and sdd). For developers building upload portals, data exchange drops, or per-customer paths, this reduces the need to mint separate containers just to keep SAS scopes tight.
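
The announcement demonstrates the scope through REST and .NET parameters. As an illustration of the token shape from Python, the ADLS Gen2 SDK has long produced directory-scoped tokens carrying the same sr=d and sdd fields on hierarchical-namespace accounts; whether azure-storage-blob exposes the new prefix scope directly is not covered here, so check the SDK changelog:

```python
from datetime import datetime, timedelta, timezone

from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import (
    DataLakeServiceClient, DirectorySasPermissions, generate_directory_sas,
)

# ADLS Gen2 (HNS) analogue of a directory-scoped user delegation SAS;
# the emitted token carries sr=d and sdd. Names are placeholders.

svc = DataLakeServiceClient("https://<account>.dfs.core.windows.net",
                            credential=DefaultAzureCredential())

now = datetime.now(timezone.utc)
udk = svc.get_user_delegation_key(now, now + timedelta(hours=1))

sas = generate_directory_sas(
    account_name="<account>",
    file_system_name="shared-container",
    directory_name="tenants/contoso",     # the prefix the token is scoped to
    credential=udk,                       # user delegation key, not account key
    permission=DirectorySasPermissions(read=True, list=True),
    expiry=now + timedelta(hours=1),
)
# sas includes sr=d and sdd=2 (the signed directory depth of tenants/contoso)
```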

AKS for AI workloads and network observability: reference architectures and GA filtering

AKS guidance this week leaned into two production pain points: running GPU-heavy AI systems reliably and keeping network telemetry useful (and affordable) at scale. This builds directly on last week's AKS operations arc (Gateway API migration planning, one-command backups, and evidence-driven network investigations) by shifting from “how to operate the cluster” to “how to run demanding workloads on it without losing control of ingress, identity, and observability.”

A new diffusion-model reference architecture laid out how to structure a cluster for mixed compute needs by separating CPU and GPU lanes, then choosing a dispatch pattern based on your workload shape. For simpler flows, Kubernetes-native dispatch can work, while queue-based patterns (Azure Service Bus plus KEDA) provide better control when you need buffering, back-pressure, or burst handling. The architecture also emphasized production plumbing that often gets skipped in AI demos: secure ingress, durable storage for generated outputs and model caches, and identity patterns like Microsoft Entra Workload ID paired with Azure Key Vault for secret and credential management. For observability, it called out combined application and GPU telemetry using tools like Application Insights and Azure Managed Prometheus so you can correlate request-level behavior with accelerator saturation and scheduling effects, reinforcing last week's point that “deployed” is not the same as “ready for cutover” when real traffic and dependencies show up.
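
The queue-based lane reduces to a familiar worker shape: pull a job, run inference, settle the message only on success so failures redeliver (and eventually dead-letter). A minimal Python sketch with illustrative names; the inference call is a stub:

```python
from azure.identity import DefaultAzureCredential
from azure.servicebus import ServiceBusClient

# Worker loop for the queue-based dispatch lane. KEDA would scale
# replicas of this worker on queue depth; names are illustrative.

def run_diffusion_job(payload: str) -> None:
    ...  # hypothetical call into the GPU inference path

client = ServiceBusClient("<namespace>.servicebus.windows.net",
                          credential=DefaultAzureCredential())

with client, client.get_queue_receiver("diffusion-jobs",
                                       max_wait_time=30) as receiver:
    for msg in receiver:
        try:
            run_diffusion_job(str(msg))
            receiver.complete_message(msg)   # settle only after success
        except Exception:
            receiver.abandon_message(msg)    # redeliver; repeated failures dead-letter
```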

On the networking side, Advanced Container Networking Services (ACNS) observability features for AKS moved to GA with capabilities designed to reduce noise while preserving detail where it matters. Last week introduced the Container Network Insights Agent as an advisory, read-only way to pull together CoreDNS, policies, Cilium/Hubble flows, and host signals into an auditable report. This week complements that “investigate precisely” story with “collect sanely”: on-node container network metrics filtering is now available, along with container network log filtering and 30-second flow log aggregation. That gives platform teams a lever to control telemetry volume without fully turning off high-cardinality signals. Logs land in Log Analytics under the ContainerNetworkLogs table, and the design supports exporting to external tools like Splunk or Datadog when Log Analytics is not the final destination.

Under the hood, the announcement referenced a Cilium/Hubble-based model and surfaced Kubernetes custom resources (CRDs) such as ContainerNetworkMetric and ContainerNetworkLog, which is useful because it frames network observability as declarative cluster configuration rather than a one-off agent tweak.
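
Once the filtered logs land in Log Analytics, consuming them is ordinary workspace querying. A sketch with azure-monitor-query; the table name comes from the announcement, while the columns referenced in the KQL are assumptions for illustration:

```python
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

# Query the GA flow logs out of Log Analytics; column names are assumed.

client = LogsQueryClient(DefaultAzureCredential())

query = """
ContainerNetworkLogs
| where TimeGenerated > ago(1h)
| summarize flows = count() by bin(TimeGenerated, 5m)
"""

response = client.query_workspace("<workspace-id>", query,
                                  timespan=timedelta(hours=1))
for table in response.tables:
    for row in table.rows:
        print(row)
```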

Together, these updates show Azure’s direction for AKS: provide opinionated patterns for AI productionization, then back them with more tunable, Kubernetes-native observability controls so teams can run larger fleets without drowning in logs, while staying aligned with the migration-and-deprecation clocks called out last week (ingress controllers and log ingestion).

Reliability engineering in Azure Monitor: SLIs and SLOs in public preview

Azure Monitor introduced public preview support for Service Level Indicators (SLIs) and Service Level Objectives (SLOs), pulling more of the SRE workflow into native Azure tooling. Last week, Azure SRE Agent expanded into first-party Log Analytics and Application Insights connectors so investigations can run KQL directly through MCP-backed tools, keeping identity scopes tight and actions read-only. This week moves the workflow one step earlier in the lifecycle: define what “good” looks like (SLIs/SLOs), then let error budgets and burn rates drive when you page and what you investigate.

The preview focuses on practical mechanics: author SLIs directly, establish baselines, track error budgets, and alert using burn-rate logic so teams get notified when they are spending their budget too quickly rather than reacting only after an outage is obvious. The emphasis on “Service Group” level reporting is important for teams that operate systems composed of multiple services and want a combined reliability view instead of piecemeal per-resource alerts.
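
The burn-rate arithmetic itself is simple enough to sanity-check by hand. A worked example with illustrative traffic numbers (the 14.4x fast-burn threshold is a common SRE convention, not something specific to the Azure Monitor preview):

```python
# With a 99.9% SLO over 30 days, the error budget is 0.1% of requests.
# Burn rate = observed error rate / budgeted error rate; a sustained
# burn rate of 14.4 spends ~2% of a 30-day budget per hour.

slo = 0.999
budget = 1 - slo                      # 0.001 -> 0.1% of requests may fail

window_requests = 120_000             # last hour (illustrative)
window_errors = 1_728
error_rate = window_errors / window_requests

burn_rate = error_rate / budget
print(f"error rate {error_rate:.4%}, burn rate {burn_rate:.1f}x")

hours_in_30d = 30 * 24
budget_spent_this_hour = burn_rate / hours_in_30d
print(f"~{budget_spent_this_hour:.1%} of the 30-day budget spent in 1h")
```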

The implementation detail to note is that this builds on Azure Monitor metrics stored in an Azure Monitor Workspace, which ties into how teams already centralize metrics for scenarios like Managed Prometheus and OpenTelemetry pipelines. For developers and operators, the near-term value is less about new charts and more about turning reliability targets into first-class configuration, then letting error-budget math drive alerting and escalation, which fits the broader “reduce toil through standard workflows” storyline from last week.

Azure Functions and Service Bus: deeper troubleshooting for trigger reliability

A detailed troubleshooting guide for Azure Functions Service Bus triggers focused on the real failure modes teams see in production, especially when using PeekLock processing. This connects cleanly to last week's Service Bus scaling pattern (avoiding hidden ceilings like session lock affinity) by zooming in on the other side of the same reliability problem: once you choose a messaging pattern, you still need deterministic trigger behavior under load, retries, and transient auth/network issues.

The write-up walked through diagnosing connection and authentication failures (including Managed Identity and Azure RBAC considerations), lock loss during message handling, dead-letter queue (DLQ) behavior, and the kinds of issues that create duplicate processing. It also covered scaling dynamics (including target-based scaling), sessions, and lower-level AMQP or network problems that can look like intermittent trigger flakiness.

What makes this useful is the emphasis on how to connect configuration and diagnostics. It points developers to tune and validate behavior through host.json, then verify hypotheses using Azure diagnostics and Application Insights, rather than guessing based on symptoms. If you run Functions as part of an event-driven system, the practical outcome is faster root cause isolation: you can distinguish “we are not receiving messages” from “we are receiving but failing to settle locks” from “we are processing twice due to retries and timeouts”, and then choose fixes that match the underlying cause.
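
Most of those knobs live in host.json. Mirrored below as a Python dict for consistency with the other sketches in this roundup; the setting names follow the Service Bus extension 5.x reference (verify against the current docs), and the values are starting points to tune against your lock duration and workload, not recommendations:

```python
# The real settings live in host.json; this dict mirrors its structure.

host_json = {
    "version": "2.0",
    "extensions": {
        "serviceBus": {
            "prefetchCount": 0,                       # >0 trades lock headroom for throughput
            "autoCompleteMessages": True,             # host settles the lock on success
            "maxConcurrentCalls": 16,                 # per instance; interacts with scale-out
            "maxAutoLockRenewalDuration": "00:05:00", # renew PeekLock for slow handlers
        }
    },
}
```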

Kubernetes-native database platforms: Crossplane with Azure Database for PostgreSQL

A Kubernetes-first pattern for building an internal DBaaS on Azure showed how Crossplane can provision and manage Azure Database for PostgreSQL Flexible Server while keeping developer workflows inside Kubernetes. It lands on the same operational pressure point called out last week in the Azure networking section: DNS and Private Link wiring often decides reliability. Here, the design leans into that reality instead of treating it as an afterthought, using private networking via Azure Private Endpoint, service discovery using Azure Private DNS, and DNS-based read/write endpoints so applications can connect without embedding failover logic everywhere.

For HA/DR, the design described a multi-region active-passive setup using replicas with manual promotion, which is a common choice when teams want clear operational control during regional incidents rather than automatic cross-region failover surprises. It also highlighted using Azure Traffic Manager in the overall topology to route clients appropriately. For platform teams, the main implication is that Crossplane can act as the control plane for database lifecycle (provisioning, configuration, and standardization) while Azure PostgreSQL remains the managed data plane, giving you a consistent Kubernetes API surface without taking on the burden of running PostgreSQL clusters yourself. This also pairs naturally with last week's PostgreSQL “run it well today vs what's next” split by showing a concrete platform approach you can apply to Flexible Server now, even as HorizonDB messaging develops.

Other Azure News

Azure Developer CLI (azd) shipped five releases in April 2026, with notable improvements for teams standardizing deployments through azure.yaml. This is a continuation of last week's azd thread (a single azd update regardless of install method, plus stable vs daily channels) by reinforcing the same operational goal: make developer tooling upgrades predictable so environment drift does not become another hidden reliability tax. Multi-language hooks now cover Python, JavaScript/TypeScript, and .NET, the extension framework was enhanced, and Copilot-assisted troubleshooting was improved. The release notes also called out security and reliability fixes, including MSI code-signing verification, plus ongoing enhancements around Bicep, updates, and Key Vault secret resolution.

John Savill’s May 1, 2026 Azure update rounded up a broad set of platform changes, including AKS networking enhancements (notably WireGuard in-transit encryption), Azure Front Door WAF HTTP DDoS protections, Azure Elastic SAN updates, and PostgreSQL cascading read replicas. On the AI platform side, it flagged Microsoft Agent Framework 1.0 reaching GA and the retirement of Prompt flow, which is worth tracking if you have agent workflows built on Azure’s current tooling.

.NET

This week in .NET was a mix of platform plumbing and practical building blocks: Microsoft pushed forward on modernizing the toolchain (especially inside Visual Studio), while several posts showed how .NET 10+ apps are increasingly composed from focused libraries for AI, caching, and API surface management. Coming right after last week's split between “install the preview” (.NET 11 Preview 3) and “patch production now” (April 2026 servicing), the throughline is familiar: the platform keeps tightening defaults (dependencies, provenance, project systems), and teams need to validate those shifts early to avoid surprises later. At the same time, a couple of changes signaled where the ecosystem is heading next, including a notable test platform dependency shift that could surface as a breaking change in CI.

Visual Studio and the .NET toolchain are getting more “SDK-style” (with a few sharp edges)

Visual Studio 18.5 added official SDK-style project support for VSSDK-based VSIX extension projects, which is a meaningful quality-of-life improvement if you maintain extensions that have historically lived in older project system land. The headline is that SDK-style enables better incremental builds, including the Fast Up To Date Check, so inner-loop iteration on extensions should feel closer to modern .NET projects. Microsoft also shipped updated templates and provided a short migration checklist to help existing extensions move over without guesswork, which matters because extension projects tend to accrete custom MSBuild logic over time. Seen next to last week's SDK/CLI “inner loop” focus in .NET 11 Preview 3 (dotnet watch resiliency, CLI edits for solution filters, and other iteration tweaks), this is another step in the same direction: more of the ecosystem moving to the newer project system conventions so tooling can be faster and more predictable.

In the same “tooling modernization” vein, VSTest announced it is removing its Newtonsoft.Json dependency starting in .NET 11 Preview 4 and Visual Studio 18.8. The new default is System.Text.Json, while .NET Framework scenarios will use JSONite. This is partly motivated by keeping dependencies tighter and responding to ecosystem pressure around NuGet vulnerabilities, but it has real compatibility implications: adapters, collectors, and any test infrastructure that assumed Newtonsoft.Json was present transitively may fail at runtime. The guidance is straightforward but important for extension authors and test tooling maintainers: explicitly reference Newtonsoft.Json if you still need it, and check any serialization behaviors that change when moving to System.Text.Json. It is also a clear continuation of last week's “keep your inputs trustworthy” theme (signed official .NET container images, servicing patches, and deadlines that force modernization): the platform is steadily removing implicit assumptions and nudging teams toward explicit dependencies and maintained defaults.

SkiaSharp 4.0 Preview 1 modernizes graphics primitives and expands targets

SkiaSharp 4.0 Preview 1 landed with a jump to Skia milestone 147 and a focus on newer typography and drawing APIs. Variable fonts (OpenType variable fonts) and color font palettes are the kind of features you only notice once you try to render modern design systems across platforms, and having them supported at the library level reduces the need for platform-specific workarounds. The preview also introduces SKPathBuilder, which should make constructing complex paths more ergonomic and potentially less error-prone than manual path mutation patterns.

On the platform side, the release adds new native targets and ships an interactive gallery implemented with Blazor WebAssembly, which is a practical way to let developers verify rendering behavior quickly without pulling down a full sample solution. The post also calls out Uno Platform co-maintenance, continuing the theme that SkiaSharp sits at the intersection of Microsoft and cross-platform UI communities. With last week's .NET 11 Preview 3 spending time on browser/WASM packaging and debugging (including WebCIL) and on UI ergonomics like Blazor Virtualize improvements, SkiaSharp's Blazor WASM gallery fits the broader pattern: .NET-in-the-browser tooling is becoming a more normal part of how libraries demonstrate and validate cross-platform behavior.

Composable AI in .NET: a reference app that stitches together the stack

ConferencePulse is a good example of how Microsoft wants teams to build AI features in .NET: not via one monolithic framework, but by composing smaller pieces. The tutorial walks through using Microsoft.Extensions.AI as the application-facing abstraction, then layering in DataIngestion and VectorData for retrieval-augmented generation (RAG)-style Q&A. It also pulls in MCP (Model Context Protocol) and the Microsoft Agent Framework to drive tool-based workflows like poll generation, plus multi-agent session summaries and real-time insights.

What makes the walkthrough useful is that it is not only about prompting. It treats ingestion, vector storage, and tool execution as first-class parts of the app, and it runs under .NET Aspire with Azure OpenAI, so you can see how the pieces hang together in a real deployment shape (including OpenTelemetry for observing the system). If you're deciding whether to adopt these new “Extensions.*” AI packages, this is one of the clearer end-to-end examples of how they are intended to be wired. It also mirrors last week's preview theme of “small primitives that reduce custom glue” (for example, more built-in knobs in System.Text.Json and other BCL areas): the composable approach is showing up across the stack, not just in AI.

Performance and interop patterns: tiered caching and Native AOT DLL exports

On the performance side, Microsoft demonstrated tiered caching in a .NET 10 app using HybridCache with Azure Database for PostgreSQL as the distributed cache (via Microsoft.Extensions.Caching.Postgres). The core idea is to combine an in-memory layer for fast hits with a shared Postgres-backed layer for cross-instance reuse, then measure the difference between cold and warm paths. The tutorial stays grounded in implementation details you actually need in a real service: configuration, secrets handling with dotnet user-secrets, and a simple benchmark setup that highlights lower latency on cache hits once the pipeline is warmed. Last week's .NET 11 Preview 3 leaned heavily into runtime/JIT optimizations and hot-path wins without code changes, and this caching walkthrough complements that story from the app architecture side: even as the runtime trims overhead, teams still get big, measurable wins by designing for warm paths and shared state across instances.
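
The tier interaction is easier to see stripped of the .NET APIs. A language-agnostic Python sketch of the same pattern, with a dict standing in for the Postgres-backed layer (this illustrates the concept, not the HybridCache API):

```python
import time

# Tiered cache: a tiny in-process layer in front of a shared store.
# `shared_store` stands in for the Postgres-backed distributed cache.

shared_store: dict[str, tuple[str, float]] = {}

class TieredCache:
    def __init__(self, local_ttl: float = 30.0):
        self.local: dict[str, tuple[str, float]] = {}
        self.local_ttl = local_ttl

    def get_or_create(self, key: str, factory) -> str:
        hit = self.local.get(key)
        if hit and hit[1] > time.monotonic():
            return hit[0]                        # L1: fast in-memory hit
        if key in shared_store:
            value = shared_store[key][0]         # L2: shared cross-instance hit
        else:
            value = factory()                    # miss: compute once, publish
            shared_store[key] = (value, time.monotonic())
        self.local[key] = (value, time.monotonic() + self.local_ttl)
        return value

cache = TieredCache()
print(cache.get_or_create("report:42", lambda: "expensive result"))
print(cache.get_or_create("report:42", lambda: "never recomputed"))
```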

For interop-heavy teams, Rick Strahl showed how .NET Native AOT can publish a Windows-native WinAPI-style DLL from a .NET 10+ class library, including exporting functions via UnmanagedCallersOnly and publishing with dotnet publish. The guide is candid about constraints: when you want to be callable from non-.NET code (like older tools such as FoxPro), you have to think in terms of stable exported entry points, StdCall conventions, and limited marshalling scenarios. The payoff is a deployment artifact that looks like a traditional native DLL while still being authored in C#. In a week where VSTest is explicitly changing transitive dependencies (and last week emphasized keeping containers and servicing baselines current), Native AOT is another example of the same practical pressure: be explicit about what you ship, what you depend on, and what contracts you expose, because “it was there implicitly” is becoming less reliable over time.
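
The consumer side of that contract is any runtime that can load a native DLL. A short sketch from Python via ctypes, standing in for “non-.NET code”; the export name and signature are hypothetical and must match what the library actually exposes:

```python
import ctypes

# Load the AOT-published DLL like any other native library. WinDLL uses
# the stdcall convention the article calls out. The export below is
# hypothetical -- it must match the [UnmanagedCallersOnly] entry point.

lib = ctypes.WinDLL("MyDotNetLibrary.dll")

# Suppose the C# side exports: int AddNumbers(int a, int b)
lib.AddNumbers.argtypes = [ctypes.c_int, ctypes.c_int]
lib.AddNumbers.restype = ctypes.c_int

print(lib.AddNumbers(2, 3))  # -> 5, executed inside the AOT-compiled DLL
```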

API surface and reliability: versioned OpenAPI and the outbox pattern, revisited

API versioning continues to be a practical concern for teams shipping public endpoints, and the .NET 10-focused guide on combining Asp.Versioning v10 with OpenAPI shows a clean pattern: generate one OpenAPI document per version and expose them as versioned endpoints like /openapi/v*.json. The post covers both controllers and Minimal APIs, then shows how to wire SwaggerUI or Scalar so consumers can browse the right contract without guessing which version a given schema belongs to. For teams trying to keep backward compatibility without freezing their API design, “one OpenAPI per version” tends to be easier to maintain than a single blended spec that grows conditional logic. It also ties back to last week's reminder that frameworks and timelines move on (like the ASP.NET Core on .NET Framework end-of-support date): clear, versioned contracts make it easier to evolve APIs while you modernize runtimes and hosting models underneath.

Reliability in distributed systems showed up as well via On .NET Live, where Joao Antunes revisited the transactional outbox pattern using his OutboxKit toolkit. The discussion centers on the core trade-off the pattern is meant to address (atomicity between database state changes and message publishing), what tends to go wrong in real implementations, and alternative approaches when the operational cost of an outbox is too high for a given system.

Other .NET News

Visual Studio 2026 18.6 Insiders 3 now enables the TypeScript 7 Beta native preview by default, prioritizing compiler and language service performance while calling out feature gaps and known issues you will want to validate against your solution (especially if you rely on specific editor behaviors). The update also intersects with how teams manage project-local TypeScript via npm, since the “native language service preview” changes the default experience inside the IDE. It lands in the same broader “defaults are changing” bucket as the VSTest JSON dependency shift: IDE and tooling updates can quietly alter day-to-day behavior, so pinning versions and validating upgrades in CI matters.

Rick Strahl also revisited C# scripting and templating via Westwind.Scripting, explaining how the ScriptParser template engine works with a Handlebars-style syntax, Roslyn-based runtime compilation, caching, and newer layout/section support for file-based templates. If you maintain apps that generate text (emails, code, reports) and want a lightweight embedded templating approach without standing up a separate service, this is a useful tour of the architecture and trade-offs.
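
Westwind.Scripting's actual API is its own, but the underlying mechanism (template text rewritten into C#, compiled once through Roslyn, then cached) can be approximated with the Microsoft.CodeAnalysis.CSharp.Scripting package; the naive {{...}} handling below is purely illustrative:

```csharp
using System.Collections.Concurrent;
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;

public class TemplateGlobals
{
    public string Name { get; set; } = "";
}

public static class TinyTemplates
{
    // Compilation is the expensive step, so compiled scripts are cached by
    // template text and repeated renders stay cheap.
    private static readonly ConcurrentDictionary<string, Script<string>> Cache = new();

    // Rewrites "Hello {{Name}}" into the C# expression $"Hello {Name}" and
    // evaluates it against the globals object, Handlebars-style.
    public static async Task<string> RenderAsync(string template, TemplateGlobals model)
    {
        var script = Cache.GetOrAdd(template, t =>
        {
            var code = "$\"" + t.Replace("\"", "\\\"")
                                .Replace("{{", "{")
                                .Replace("}}", "}") + "\"";
            return CSharpScript.Create<string>(
                code, ScriptOptions.Default, globalsType: typeof(TemplateGlobals));
        });

        return (await script.RunAsync(model)).ReturnValue;
    }
}

// Usage: await TinyTemplates.RenderAsync("Hello {{Name}}!",
//     new TemplateGlobals { Name = "Rick" });
```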

Security

Security news this week focused on two parallel pressures teams are feeling right now: urgent patch-and-harden work for high-impact vulnerabilities in core dev and runtime infrastructure, and the fast-moving reality that AI agents are becoming part of the attack surface. Across Microsoft and GitHub updates, the practical theme was governance (who can call what, when, and with what audit trail) paired with stronger identity and data protections that reduce blast radius when something does go wrong. That threads cleanly into last week's direction: reduce ambient privilege, remove long-lived secrets, and make secure defaults workable at scale, because when an incident starts from “normal” workflows, your margin often comes from consistent guardrails and fast containment.

GitHub and Linux: High-impact CVEs that hit the developer pipeline and cloud workloads

GitHub disclosed and remediated a critical remote code execution issue in the git push path (CVE-2026-3854), rooted in unsanitized push options being written into internal metadata. The key takeaway for teams running GitHub Enterprise Server is that this is not just a theoretical edge case in Git plumbing. Push options can be supplied by clients during normal workflows, so administrators need to treat the update as an urgent pipeline security fix: patch quickly, then follow GitHub's operational guidance to review logs for suspicious activity and validate that the mitigations are in place across all nodes. In practice, it is the same “trusted workflow abuse” lesson we saw last week in the Teams/Quick Assist intrusion research: the entry point can look like normal user or developer behavior, so hardening and detection around everyday paths matters as much as perimeter controls.

On the runtime side, Microsoft detailed a high-severity Linux kernel local privilege escalation dubbed “Copy Fail” (CVE-2026-31431) that can enable root escalation across cloud environments, including Kubernetes-heavy deployments. The write-up highlights how the vulnerability can matter even when perimeter controls look good, because an attacker who lands code execution in a container or workload may be able to escalate locally via the kernel, then pivot further. Microsoft pairs the disclosure with mitigation guidance and Microsoft Defender XDR detections so security teams can hunt for exploitation signals while patching rolls out (especially important for fleets spanning multiple distros and managed Kubernetes nodes). This complements last week's identity-first and “tighten trust boundaries” theme: Workload Identity and OIDC reduce credential theft risk, but kernel and platform patching still determines how far an attacker can go once they are inside.

Governing AI agents and Copilot extensibility: shifting controls left, and dealing with the gaps

Several posts this week converged on the same uncomfortable point: once you let agentic tools call external “tools” (shell commands, cloud APIs, ticketing systems, MCP servers), you have effectively created a new integration surface that needs the same rigor as any production API. The Agent Governance Toolkit (AGT) for .NET tackles this directly for Model Context Protocol (MCP) tool calls by adding policy-based controls, scanning tool definitions, sanitizing responses, and producing audit trails and telemetry aligned to the OWASP MCP Top 10. For .NET teams building or embedding agents, the value is that governance can live in the same place you already instrument services, including OpenTelemetry pipelines, rather than being an afterthought bolted onto the agent runtime. This is a direct continuation of last week's agent security thread (for example, the Secure Code Game's “ProdBot” scenarios): the threat model is not abstract prompt injection anymore, it is tool access, memory, and cross-system side effects.
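
To make that concrete, here is a hypothetical sketch of the kind of gate such a toolkit formalizes; none of these types come from AGT itself, and a real policy engine does far more (definition scanning, schema checks, telemetry export). The structural point is that every tool invocation passes through policy, sanitization, and audit:

```csharp
public sealed record ToolCall(string ToolName, IReadOnlyDictionary<string, string> Arguments);
public sealed record AuditEntry(DateTimeOffset At, string Tool, bool Allowed, string? Reason);

public sealed class ToolCallGate
{
    private readonly HashSet<string> _allowedTools;
    private readonly List<AuditEntry> _audit = new();

    public ToolCallGate(IEnumerable<string> allowedTools)
        => _allowedTools = new HashSet<string>(allowedTools, StringComparer.OrdinalIgnoreCase);

    public async Task<string?> InvokeAsync(ToolCall call, Func<ToolCall, Task<string>> execute)
    {
        // Policy check before the tool ever runs.
        if (!_allowedTools.Contains(call.ToolName))
        {
            _audit.Add(new AuditEntry(DateTimeOffset.UtcNow, call.ToolName, false, "not allowlisted"));
            return null;
        }

        var result = await execute(call);

        // Sanitize tool output before it re-enters the model's context
        // (a common MCP-era injection path).
        var sanitized = result.ReplaceLineEndings(" ").Trim();

        _audit.Add(new AuditEntry(DateTimeOffset.UtcNow, call.ToolName, true, null));
        return sanitized;
    }

    public IReadOnlyList<AuditEntry> Audit => _audit;
}
```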

The companion “shift-left” governance guidance makes AGT feel less like a runtime gate and more like an SDLC pattern: enforce pre-commit hooks and pull request gates, verify governance in CI, and carry controls through release with artifacts like SBOMs, signing, and provenance attestations. That approach addresses a recurring failure mode with agent integrations: teams discover risky tool wiring only after an incident or after an agent starts acting on production systems. It also mirrors last week's broader supply-chain operations push (SBOM export reliability, org-wide scanning baselines): governance only works when it is repeatable, automated, and visible across the estate.

At the same time, an independent analysis of GitHub Copilot extensibility surfaces lays out where enterprise controls can fall short in practice. It walks through five extension points (Copilot CLI plugins, Microsoft APM, gh skill, MCP servers, and VS Code extensions) and shows how governance can be bypassed depending on how plugin sources are pinned, where policy files are enforced, and what is (or is not) audited. The most actionable parts are the mitigation patterns: pin and lock plugin sources, apply strict policy configurations where supported, and treat “extension registries” as a supply chain surface that needs explicit allowlists and monitoring. That is the same operational reality last week highlighted in a different form: “secure defaults” and centralized controls help, but you still need to find the escape hatches (forks, plugins, unmanaged tools) and close them with policy plus monitoring.

Finally, Microsoft's 1ES team described how they are using agentic AI internally to reduce time-to-remediate for CVEs and compliance work while keeping humans accountable for review and deployment. Their approach is intentionally operational: GitHub Copilot CLI plus reusable Markdown “skills” to standardize work, and “agent signals” to help track what the agent did and why. It is a useful model for organizations trying to scale remediation without turning agents into unreviewed automation, and it connects directly to the governance story: the more you delegate, the more you need consistent guardrails and traceability. It also pairs with last week's AI incident response guidance: if you cannot reconstruct actions and decisions from telemetry, you cannot contain fast or learn reliably.

Microsoft security platform and cloud protections: identity-first pipelines, multicloud visibility, and data-in-use safeguards

On the identity and infrastructure side, guidance for Terraform pipelines on Azure pushed a clear best practice: stop relying on long-lived client secrets in CI/CD, and move to OIDC Workload Identity Federation through Microsoft Entra ID for both GitHub Actions and Azure DevOps. The practical implementation details matter here, including using a user-assigned managed identity (UAMI), creating federated identity credentials tied to your CI provider, and tightening RBAC to least privilege. The post also calls out state hardening (where compromise often starts) and migration steps so teams can move incrementally without breaking delivery. This is the same storyline as last week's GitHub and Azure updates around OIDC (for Dependabot/code scanning registries and AKS Workload Identity): tokenless-by-default pipeline access is becoming the baseline pattern, and the operational work shifts to scoping identities well and monitoring token/session behavior.

In detection and response, Microsoft Sentinel UEBA updates show how teams defending AWS can reduce complexity by enriching AWS CloudTrail with behavioral signals (via the BehaviorAnalytics and Anomalies tables) so detections can be written in simpler KQL without reconstructing every edge case from raw logs. The examples focus on common attacker paths in AWS environments - federated identity abuse, suspicious IAM changes, secrets access, and S3 exfiltration - and the operational win is faster triage when the system can surface ML-driven anomalies instead of forcing analysts to handcraft brittle thresholds. This follows naturally from last week's emphasis on “operate it well”: better baselines and higher-quality signals reduce the time spent chasing noise, especially when attacker behavior blends into legitimate admin activity.

For data protection, Azure Event Hubs Dedicated added confidential computing support that protects streaming data while it is being processed (data in use) using trusted execution environments (TEEs), and it does so without requiring application changes. The announcement pairs this with practical defense-in-depth steps teams can layer on top: Entra ID for access control, customer-managed keys (CMK) backed by Azure Key Vault Managed HSM, private networking, and Azure Policy to enforce configuration standards across clusters. It also connects back to last week's cryptography inventory guidance: TEEs and CMK help, but teams still need a clear picture of where keys live, which services use them, and how policy and monitoring prevent drift over time.

Other Security News

Email threats in Q1 2026 continued to lean on interaction traps rather than malware-heavy payloads, with growth in QR code phishing and CAPTCHA-gated phishing that tries to evade automated scanning. Microsoft's analysis also covers the impact of Tycoon2FA disruptions on adversary-in-the-middle (AiTM) activity, then maps mitigations and detections across Microsoft Defender for Office 365 and Microsoft Defender XDR, including how Microsoft Security Copilot can support investigation workflows. That fits with last week's threat research theme: attackers keep winning initial access by abusing trusted UX (collaboration invites, remote help, “prove you're human” CAPTCHA flows), so the practical response is improved detection plus rehearsed containment and session revocation when users inevitably click through.

Microsoft also shipped a broader set of platform updates, including preview protections for AI agents through the Agent 365 tooling gateway, the now generally available integration between Defender for Cloud and GitHub Advanced Security, and a new Microsoft Purview Data Security Investigations demo aimed at helping teams validate investigation flows end to end. Agent 365 itself reached general availability, with an emphasis on discovering and governing “shadow AI” agents across endpoints, SaaS, and multicloud, and deeper integrations into Microsoft Defender, Intune, and Entra network controls. This is the platform-level continuation of last week's agent governance and AI incident response guidance: inventory, policy, and audit trails are moving into the same management planes teams already rely on for identity and endpoint control, which is where governance becomes enforceable rather than advisory.