Modelling Postgres Performance Degradation on Burstable Cloud Instances | POSETTE 2026
Chun Lin Goh presents a POSETTE 2026 talk on how PostgreSQL performance can degrade on burstable cloud instances (for example, Azure B-series and AWS T-series) due to CPU credit depletion, and how to model the saturation point using a simple simulation approach.
Overview
What problem this talk addresses
Many teams run PostgreSQL on burstable instances to reduce cost. These instances use a CPU credit model:
- When credits are available, the VM can burst above its baseline CPU performance.
- When credits are depleted, the cloud provider throttles the VM back to its baseline CPU performance.
The key risk described is not a hard crash, but throughput exhaustion:
- PostgreSQL is not aware of the external CPU throttling.
- It continues accepting connections and work at a rate that no longer matches the throttled CPU capacity.
- This can trigger a cascading failure pattern:
  - connection pools saturate
  - p99 latencies spike
  - application requests time out
  - the database appears “online” but is effectively unavailable
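One way to see why the database can look online yet be unusable is Little's law (connections in use = throughput × time each request holds a connection). The sketch below is illustrative only: the function names, the 100-connection pool, the 80 queries/s throttled throughput and the 1 s application timeout are assumptions, not figures from the talk.

```python
def busy_connections(arrival_qps: float, latency_s: float) -> float:
    """Little's law: average connections in use = arrival rate x time per request."""
    return arrival_qps * latency_s


def saturated_hold_time_s(pool_size: int, throughput_qps: float) -> float:
    """With every pooled connection busy, the average time a request holds one."""
    return pool_size / throughput_qps


# Hypothetical workload while bursting: 200 queries/s at 20 ms each.
print(busy_connections(200, 0.020), "connections in use while bursting")

# Hypothetical throttled state: completions capped at 80 queries/s, pool of 100 full.
hold = saturated_hold_time_s(100, 80)
print(hold, "s per request; exceeds a 1 s timeout:", hold > 1.0)
```

Under these assumed numbers the pool is barely used while credits last, but once throughput is capped the same pool keeps every request waiting longer than the application is willing to wait, even though PostgreSQL itself is still accepting work.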
Core concepts covered
- How CPU credits and bursting work on burstable instance families
- A simple explanation of the token bucket model (as an intuition for credit-based bursting)
- The “hidden risk” of CPU credit depletion and why it creates non-linear performance behavior
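The credit mechanism can be pictured as a token bucket that fills at a steady rate and drains with CPU use. The following is a minimal sketch under simplifying assumptions: one credit is treated as one vCPU running at 100% for one minute, the class and function names are made up for illustration, and the numbers (30 credits/hour, 60 banked credits, 2 vCPUs) are hypothetical rather than any provider's published figures.

```python
from dataclasses import dataclass


@dataclass
class CreditBucket:
    earn_per_hour: float   # credits added per hour (this sets the baseline)
    max_credits: float     # bucket capacity
    credits: float         # current balance
    vcpus: int = 2         # hard ceiling while bursting

    def step(self, demand_vcpu: float, minutes: float = 1.0) -> float:
        """Advance the clock and return the average vCPU actually granted."""
        available = self.credits + self.earn_per_hour / 60.0 * minutes
        wanted = min(demand_vcpu, self.vcpus) * minutes   # credits the load would spend
        used = min(wanted, available)                     # cannot spend more than is held
        self.credits = min(self.max_credits, available - used)
        return used / minutes


def minutes_of_burst(bucket: CreditBucket, demand_vcpu: float, limit: int = 100_000):
    """Count whole minutes served in full before the first throttled minute."""
    for minute in range(limit):
        if bucket.step(demand_vcpu) < demand_vcpu:
            return minute
    return None  # never throttled within the limit


bucket = CreditBucket(earn_per_hour=30, max_credits=720, credits=60)
print(minutes_of_burst(bucket, demand_vcpu=1.5))  # minutes until the cliff
print(bucket.step(1.5))                           # vCPU granted once credits are gone
```

No explicit throttling rule is needed in this sketch: once the balance reaches zero, the load can only spend what is earned each minute, which is the baseline. The non-linearity comes from the fact that nothing changes for the workload until the minute the bucket empties.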
Approaches compared
The talk contrasts different ways to avoid getting surprised by throttling:
- Overprovisioning (buying more baseline capacity than needed)
- Load testing (often expensive to run realistically)
- Simulation as a fast decision filter to estimate when the system will saturate
Simulation approach and what it’s used for
Goh introduces simulation as a way to estimate a system’s Base Performance Ceiling (the sustainable throughput once throttled), without requiring large-scale load-testing infrastructure.
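As a first-order illustration of that ceiling, a CPU-bound workload's sustainable throughput can be approximated as baseline CPU capacity divided by CPU time per query. This back-of-the-envelope formula is an assumption for illustration, not the talk's simulation method, and the parameter names and values are hypothetical; it ignores I/O waits, lock contention and background work such as autovacuum.

```python
def base_performance_ceiling_qps(vcpus: int, baseline_fraction: float,
                                 cpu_ms_per_query: float) -> float:
    """Approximate sustainable queries/s once throttled to baseline."""
    baseline_cpu_ms_per_s = vcpus * baseline_fraction * 1000.0
    return baseline_cpu_ms_per_s / cpu_ms_per_query


def headroom(ceiling_qps: float, peak_qps: float) -> float:
    """A ratio above 1.0 means the throttled instance can still absorb the peak."""
    return ceiling_qps / peak_qps


# Hypothetical instance: 2 vCPUs, 20% baseline per vCPU, 5 ms of CPU per query.
ceiling = base_performance_ceiling_qps(vcpus=2, baseline_fraction=0.20, cpu_ms_per_query=5.0)
print(f"ceiling ~ {ceiling:.0f} qps, headroom at a 200 qps peak: {headroom(ceiling, 200):.2f}")
```

A headroom below 1.0 in such an estimate suggests the instance only keeps up while credits last, which is the situation a fuller simulation would then examine over time.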
Topics called out in the session outline include:
- Discrete-event simulation fundamentals
- Building a simulation library in .NET
- A demo focused on:
  - predicting CPU credit drain and failure conditions
  - identifying a safer connection pool size for the throttled state
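The talk's library is built in .NET and is not reproduced here. As a rough stand-in, the following Python sketch shows what a discrete-event model of credit drain can look like: a heap of timestamped arrival and departure events, a single CPU-bound FIFO server, and a credit balance that decides whether each query runs at burst or baseline speed. Every name, parameter and number is hypothetical, arrivals are evenly spaced for reproducibility, and rejecting arrivals beyond `pool_size` is a crude stand-in for a bounded connection pool.

```python
import heapq
from itertools import count


def simulate(arrival_qps, cpu_s_per_query, burst_vcpu, baseline_vcpu,
             initial_credit_s, horizon_s, pool_size=None):
    """Single CPU-bound FIFO server; credits are measured in vCPU-seconds."""
    events = [(0.0, 0, "arrival")]        # heap of (time, tie-breaker, kind)
    seq = count(1)
    credits, last_t = initial_credit_s, 0.0
    in_system = peak = completed = rejected = 0
    depleted_at = None

    while events:
        now, _, kind = heapq.heappop(events)
        if now > horizon_s:
            break
        # Credits accrue at the baseline rate, capped at the starting balance.
        credits = min(initial_credit_s, credits + baseline_vcpu * (now - last_t))
        last_t = now

        if kind == "arrival":
            heapq.heappush(events, (now + 1.0 / arrival_qps, next(seq), "arrival"))
            if pool_size is not None and in_system >= pool_size:
                rejected += 1             # no free connection: request is turned away
                continue
            in_system += 1
            peak = max(peak, in_system)
            start_next = in_system == 1   # server was idle
        else:                             # departure
            in_system -= 1
            completed += 1
            start_next = in_system > 0    # more work is waiting

        if start_next:
            if credits >= cpu_s_per_query:
                speed = burst_vcpu
            else:
                speed = baseline_vcpu     # throttled
                if depleted_at is None:
                    depleted_at = now
            # Pre-charge the query's CPU cost; the balance may dip below zero
            # briefly and is repaid by accrual while the query runs.
            credits -= cpu_s_per_query
            heapq.heappush(events, (now + cpu_s_per_query / speed, next(seq), "departure"))

    return {"depleted_at": depleted_at, "peak_in_system": peak,
            "completed": completed, "rejected": rejected}


common = dict(arrival_qps=200, cpu_s_per_query=0.005, burst_vcpu=2.0,
              baseline_vcpu=0.4, horizon_s=120)
print(simulate(initial_credit_s=60, **common))                # unbounded backlog
print(simulate(initial_credit_s=60, pool_size=50, **common))  # bounded pool
```

With these inputs, credits run out after roughly 100 simulated seconds; after that the unbounded run accumulates a backlog while the pooled run sheds load instead. Sweeping `pool_size` and comparing backlog against rejections is one way to look for a pool size that stays tolerable in the throttled state.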
Cloud differences and operational visibility
- A comparison of Azure vs AWS burstable instance differences (at a conceptual level)
- The role of observability and real-time insight, with Grafana referenced for monitoring/visualization
Takeaways highlighted
- Burstable instances can be cost-effective, but the CPU credit model can create sharp performance cliffs.
- Modelling the throttled steady-state helps teams right-size instances and tune connection pools before production incidents.
- Simulation offers a faster, more data-driven filter than overprovisioning or expensive load tests alone.