⚠️ The Hidden Danger of Connection Pooling: Why Your FastAPI Crashes Under Load
Published: April 9, 2026 · 8 min read · By PyVerse (Anupam Dutta)
1. 🚨 I learned this the hard way
“More connections = faster API” — FALSE.
In development, with low traffic, everything was smooth. Then real production traffic hit, and within minutes:
- 💥 Random “too many connections” errors
- 💥 API endpoints started timing out
- 💥 Latency increased by 10x instead of improving
The root cause: total connection demand across all workers exceeded PostgreSQL's limit. The safe-sizing formula:
workers × (pool_size + max_overflow) ≤ max_connections - reserved_connections
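As a rough illustration (a sketch, not code from this post), here is how that budget might be computed; the numbers are assumptions, and the 70/30 split between persistent and overflow connections follows the rule of thumb discussed later:

```python
# Hedged sketch: derive pool settings from the formula above.
# workers, max_connections and the reserve fraction are illustrative assumptions.

def size_pool(max_connections: int, workers: int, reserved_fraction: float = 0.15):
    """Split each worker's connection budget ~70/30 between pool_size and max_overflow."""
    usable = int(max_connections * (1 - reserved_fraction))  # keep headroom for admin tasks
    per_worker = usable // workers                            # each worker's total budget
    pool_size = max(1, int(per_worker * 0.7))                 # persistent connections
    max_overflow = max(0, per_worker - pool_size)             # temporary connections
    return pool_size, max_overflow

# Example: PostgreSQL max_connections=100, 4 Uvicorn workers
pool_size, max_overflow = size_pool(max_connections=100, workers=4)
print(pool_size, max_overflow)  # 14, 7 → 4 × (14 + 7) = 84 ≤ 85 usable
```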
2. ⏱️ The true cost of a database connection
Without a warm pool, every session.execute() that needs a fresh connection establishes a new TCP connection, passing through five distinct phases that each add latency (a timing sketch follows the breakdown below).
📡 Detailed latency breakdown
- TCP 3-way handshake – 1-3ms (local) / 20-100ms (cross-region)
- TLS/SSL handshake – 5-30ms
- PostgreSQL auth (SCRAM-SHA-256) – 2-5ms
- Backend session init (process fork, ~10MB) – ~2ms
- asyncpg driver setup – 1-3ms
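To see these costs yourself, a minimal timing sketch (the DSN is a placeholder; assumes asyncpg is installed) comparing a cold connect with a query on an already-open connection:

```python
import asyncio
import time

import asyncpg

DSN = "postgresql://user:pass@localhost/db"  # placeholder DSN

async def main():
    # Cold connection: pays handshake + auth + backend startup
    t0 = time.perf_counter()
    conn = await asyncpg.connect(DSN)
    print(f"cold connect: {(time.perf_counter() - t0) * 1000:.1f} ms")

    # Warm connection: only the query itself
    t0 = time.perf_counter()
    await conn.fetchval("SELECT 1")
    print(f"query on warm connection: {(time.perf_counter() - t0) * 1000:.1f} ms")

    await conn.close()

asyncio.run(main())
```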
🧠 PostgreSQL per-connection cost
- Each connection spawns a new OS process
- ~10MB memory + work_mem
- Context switching overhead
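You can watch those backend processes accumulate directly from Python; a small sketch (pg_stat_activity is standard PostgreSQL, the DSN and ~10MB estimate are from the figures above):

```python
import asyncio

import asyncpg

async def show_backends(dsn: str):
    conn = await asyncpg.connect(dsn)
    try:
        # Each row in pg_stat_activity corresponds to one backend OS process
        backends = await conn.fetchval("SELECT count(*) FROM pg_stat_activity")
        print(f"active backends: {backends} (~{backends * 10} MB of base process memory)")
    finally:
        await conn.close()

asyncio.run(show_backends("postgresql://user:pass@localhost/db"))
```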
3. 🔌 SQLAlchemy connection pooling deep dive
SQLAlchemy’s QueuePool maintains persistent connections. Misconfigured pool_size and max_overflow cause connection storms.
```python
from sqlalchemy.ext.asyncio import create_async_engine

engine = create_async_engine(
    "postgresql+asyncpg://user:pass@localhost/db",
    pool_size=5,          # persistent connections kept open
    max_overflow=10,      # extra temporary connections under burst load
    pool_timeout=30,      # seconds to wait for a free connection before erroring
    pool_pre_ping=True,   # validate connections before use (detects stale ones)
    pool_recycle=1800,    # recycle connections older than 30 minutes
)
```
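For context, this is roughly how the pooled engine gets used from application code (a minimal sketch; the session factory name and the users table are illustrative assumptions):

```python
from sqlalchemy import text
from sqlalchemy.ext.asyncio import async_sessionmaker

# Reuse the engine defined above; each session checks a connection out of the pool
SessionLocal = async_sessionmaker(engine, expire_on_commit=False)

async def get_user_count() -> int:
    async with SessionLocal() as session:  # checkout from QueuePool
        result = await session.execute(text("SELECT count(*) FROM users"))
        return result.scalar_one()         # connection returned to the pool on exit
```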
With defaults like these, a few workers can easily demand more connections than the server allows: against max_connections=80, the result is a crash. A practical rule of thumb: pool_size = (max_connections / workers) * 0.7, max_overflow = (max_connections / workers) * 0.3.
4. ⚡ PgBouncer + asyncpg: the edge (and pitfalls)
For many pods (K8s), a centralized pooler like PgBouncer is essential. Transaction pooling mode requires special care.
- Disable SQLAlchemy's internal pooling → use NullPool
- Set prepared_statement_cache_size=0 (asyncpg)
- Turn off JIT for OLTP queries → jit=off
```python
from sqlalchemy.ext.asyncio import create_async_engine
from sqlalchemy.pool import NullPool

engine = create_async_engine(
    "postgresql+asyncpg://user@pgbouncer:6432/db",
    poolclass=NullPool,  # PgBouncer owns the pooling; don't pool in SQLAlchemy
    connect_args={
        "timeout": 10,                        # connect timeout (seconds)
        "command_timeout": 5,                 # per-statement timeout (seconds)
        "server_settings": {"jit": "off"},    # disable JIT for OLTP workloads
        "prepared_statement_cache_size": 0,   # required with transaction pooling
    },
)
```
Why prepared_statement_cache_size=0? It prevents “prepared statement does not exist” errors when PgBouncer's transaction pooling routes consecutive queries to different backend connections.
5. 🛡️ Circuit breaking + retry patterns
A database restart or overload makes every request queue up and wait, which snowballs into cascading failure. A circuit breaker fails fast instead, while retries with backoff absorb transient errors.
```python
from sqlalchemy.exc import OperationalError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

@retry(
    wait=wait_exponential(multiplier=1, min=1, max=10),  # 1s, 2s, 4s... capped at 10s
    stop=stop_after_attempt(3),                          # give up after 3 attempts
    retry=retry_if_exception_type((OperationalError, ConnectionRefusedError)),
)
async def execute_with_retry(session, query):
    return await session.execute(query)
```
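The retry decorator above covers transient failures; for the fail-fast half, here is a minimal hand-rolled circuit breaker sketch (not from the original post, thresholds are illustrative):

```python
import time

class CircuitBreaker:
    """Open the circuit after max_failures consecutive errors; try again after reset_timeout."""

    def __init__(self, max_failures: int = 5, reset_timeout: float = 30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None

    def _is_open(self) -> bool:
        if self.opened_at is None:
            return False
        if time.monotonic() - self.opened_at >= self.reset_timeout:
            # Half-open: allow one trial call through
            self.opened_at = None
            self.failures = 0
            return False
        return True

    async def call(self, coro_func, *args, **kwargs):
        if self._is_open():
            raise RuntimeError("Circuit open: database marked unavailable, failing fast")
        try:
            result = await coro_func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result

db_breaker = CircuitBreaker()

# Usage: wrap the retrying executor so repeated failures trip the breaker
# result = await db_breaker.call(execute_with_retry, session, query)
```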
Also enable pool_pre_ping=True to detect stale connections before they are handed out (~1ms overhead per checkout).
6. 📊 Architecture decision matrix & final checklist
| Scenario | Pool class | prepared_statement_cache_size | JIT | PgBouncer mode |
|---|---|---|---|---|
| Direct PostgreSQL (few workers) | QueuePool | default (1024) | on (or off for OLTP) | N/A |
| High concurrency / many pods | NullPool ✅ | 0 ✅ | off ✅ | transaction pooling |
| Serverless / Lambda | NullPool | 0 | off | transaction pooling |
Analogy: max_connections = your total employees. Each worker branch needs permanent + temporary staff, and total demand must stay ≤ capacity.
- Know your database's max_connections (ask your DBA)
- Reserve 10-20% for admin tasks
- Calculate: max_connections - reserved ≥ workers × (pool_size + max_overflow), as in the sketch after this checklist
- With PgBouncer: use NullPool, prepared_statement_cache_size=0, jit=off
- Enable pool_pre_ping=True and timeouts
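To automate that calculation, a hedged startup check (worker count, pool settings, and the DSN are illustrative assumptions):

```python
import asyncio

import asyncpg

WORKERS = 4            # e.g. Uvicorn/Gunicorn workers (assumption)
POOL_SIZE = 5
MAX_OVERFLOW = 10
RESERVED = 10          # connections kept free for admin/superuser tasks

async def check_connection_budget(dsn: str):
    conn = await asyncpg.connect(dsn)
    try:
        max_connections = int(await conn.fetchval("SHOW max_connections"))
    finally:
        await conn.close()

    demand = WORKERS * (POOL_SIZE + MAX_OVERFLOW)
    budget = max_connections - RESERVED
    assert demand <= budget, (
        f"Connection demand {demand} exceeds budget {budget} "
        f"(max_connections={max_connections}, reserved={RESERVED})"
    )

asyncio.run(check_connection_budget("postgresql://user:pass@localhost/db"))
```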
📖 Quick glossary
💡 Feedback welcome: If anything is incorrect, please let me know — I’m here to learn and improve.
🔗 Full deep dive (PDF + diagrams): Google Drive link