⚠️ The Hidden Danger of Connection Pooling: Why Your FastAPI Crashes Under Load
Published: April 9, 2026 · 8 min read · By PyVerse (Anupam Dutta)
1. 🚨 I learned this the hard way
“More connections = faster API” — FALSE.
In development, with low traffic, everything was smooth. Then real production traffic hit, and within minutes:
- 💥 Random “too many connections” errors
- 💥 API endpoints started timing out
- 💥 Latency increased by 10x instead of improving
The root cause: total connection demand across all workers exceeded PostgreSQL's limit. The safe-sizing formula:
workers × (pool_size + max_overflow) ≤ max_connections - reserved_connections
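As a rough illustration (a sketch, not code from this post), here is how that budget might be computed; the numbers are assumptions, and the 70/30 split between persistent and overflow connections follows the rule of thumb discussed later:

```python
# Hedged sketch: derive pool settings from the formula above.
# workers, max_connections and the reserve fraction are illustrative assumptions.

def size_pool(max_connections: int, workers: int, reserved_fraction: float = 0.15):
    """Split each worker's connection budget ~70/30 between pool_size and max_overflow."""
    usable = int(max_connections * (1 - reserved_fraction))  # keep headroom for admin tasks
    per_worker = usable // workers                            # each worker's total budget
    pool_size = max(1, int(per_worker * 0.7))                 # persistent connections
    max_overflow = max(0, per_worker - pool_size)             # temporary connections
    return pool_size, max_overflow

# Example: PostgreSQL max_connections=100, 4 Uvicorn workers
pool_size, max_overflow = size_pool(max_connections=100, workers=4)
print(pool_size, max_overflow)  # 14, 7 → 4 × (14 + 7) = 84 ≤ 85 usable
```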
2. ⏱️ The true cost of a database connection
Without a warm pool, every session.execute() that needs a fresh connection establishes a new TCP connection, passing through five distinct phases that each add latency (a timing sketch follows the breakdown below).
📡 Detailed latency breakdown
- TCP 3-way handshake – 1-3ms (local) / 20-100ms (cross-region)
- TLS/SSL handshake – 5-30ms
- PostgreSQL auth (SCRAM-SHA-256) – 2-5ms
- Backend session init (process fork, ~10MB) – ~2ms
- asyncpg driver setup – 1-3ms
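To see these costs yourself, a minimal timing sketch (the DSN is a placeholder; assumes asyncpg is installed) comparing a cold connect with a query on an already-open connection:

```python
import asyncio
import time

import asyncpg

DSN = "postgresql://user:pass@localhost/db"  # placeholder DSN

async def main():
    # Cold connection: pays handshake + auth + backend startup
    t0 = time.perf_counter()
    conn = await asyncpg.connect(DSN)
    print(f"cold connect: {(time.perf_counter() - t0) * 1000:.1f} ms")

    # Warm connection: only the query itself
    t0 = time.perf_counter()
    await conn.fetchval("SELECT 1")
    print(f"query on warm connection: {(time.perf_counter() - t0) * 1000:.1f} ms")

    await conn.close()

asyncio.run(main())
```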
🧠 PostgreSQL per-connection cost
- Each connection spawns a new OS process
- ~10MB memory + work_mem
- Context switching overhead
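You can watch those backend processes accumulate directly from Python; a small sketch (pg_stat_activity is standard PostgreSQL, the DSN and ~10MB estimate are from the figures above):

```python
import asyncio

import asyncpg

async def show_backends(dsn: str):
    conn = await asyncpg.connect(dsn)
    try:
        # Each row in pg_stat_activity corresponds to one backend OS process
        backends = await conn.fetchval("SELECT count(*) FROM pg_stat_activity")
        print(f"active backends: {backends} (~{backends * 10} MB of base process memory)")
    finally:
        await conn.close()

asyncio.run(show_backends("postgresql://user:pass@localhost/db"))
```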
3. 🔌 SQLAlchemy connection pooling deep dive
SQLAlchemy’s QueuePool maintains persistent connections. Misconfigured pool_size and max_overflow cause connection storms.
```python
from sqlalchemy.ext.asyncio import create_async_engine

engine = create_async_engine(
    "postgresql+asyncpg://user:pass@localhost/db",
    pool_size=5,          # persistent connections kept open
    max_overflow=10,      # extra temporary connections under burst load
    pool_timeout=30,      # seconds to wait for a free connection before erroring
    pool_pre_ping=True,   # validate connections before use (detects stale ones)
    pool_recycle=1800,    # recycle connections older than 30 minutes
)
```
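For context, this is roughly how the pooled engine gets used from application code (a minimal sketch; the session factory name and the users table are illustrative assumptions):

```python
from sqlalchemy import text
from sqlalchemy.ext.asyncio import async_sessionmaker

# Reuse the engine defined above; each session checks a connection out of the pool
SessionLocal = async_sessionmaker(engine, expire_on_commit=False)

async def get_user_count() -> int:
    async with SessionLocal() as session:  # checkout from QueuePool
        result = await session.execute(text("SELECT count(*) FROM users"))
        return result.scalar_one()         # connection returned to the pool on exit
```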
With defaults like these, a few workers can easily demand more connections than the server allows: against max_connections=80, the result is a crash. A practical rule of thumb: pool_size = (max_connections / workers) * 0.7, max_overflow = (max_connections / workers) * 0.3.
4. ⚡ PgBouncer + asyncpg: the edge (and pitfalls)
For many pods (K8s), a centralized pooler like PgBouncer is essential. Transaction pooling mode requires special care.
- Disable SQLAlchemy's internal pooling → use NullPool
- Set prepared_statement_cache_size=0 (asyncpg)
- Turn off JIT for OLTP queries → jit=off
```python
from sqlalchemy.ext.asyncio import create_async_engine
from sqlalchemy.pool import NullPool

engine = create_async_engine(
    "postgresql+asyncpg://user@pgbouncer:6432/db",
    poolclass=NullPool,  # PgBouncer owns the pooling; don't pool in SQLAlchemy
    connect_args={
        "timeout": 10,                        # connect timeout (seconds)
        "command_timeout": 5,                 # per-statement timeout (seconds)
        "server_settings": {"jit": "off"},    # disable JIT for OLTP workloads
        "prepared_statement_cache_size": 0,   # required with transaction pooling
    },
)
```
Why prepared_statement_cache_size=0? It prevents “prepared statement does not exist” errors when PgBouncer's transaction pooling routes consecutive queries to different backend connections.
5. 🛡️ Circuit breaking + retry patterns
A database restart or overload makes every request queue up and wait, which snowballs into cascading failure. A circuit breaker fails fast instead, while retries with backoff absorb transient errors.
```python
from sqlalchemy.exc import OperationalError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

@retry(
    wait=wait_exponential(multiplier=1, min=1, max=10),  # 1s, 2s, 4s... capped at 10s
    stop=stop_after_attempt(3),                          # give up after 3 attempts
    retry=retry_if_exception_type((OperationalError, ConnectionRefusedError)),
)
async def execute_with_retry(session, query):
    return await session.execute(query)
```
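The retry decorator above covers transient failures; for the fail-fast half, here is a minimal hand-rolled circuit breaker sketch (not from the original post, thresholds are illustrative):

```python
import time

class CircuitBreaker:
    """Open the circuit after max_failures consecutive errors; try again after reset_timeout."""

    def __init__(self, max_failures: int = 5, reset_timeout: float = 30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None

    def _is_open(self) -> bool:
        if self.opened_at is None:
            return False
        if time.monotonic() - self.opened_at >= self.reset_timeout:
            # Half-open: allow one trial call through
            self.opened_at = None
            self.failures = 0
            return False
        return True

    async def call(self, coro_func, *args, **kwargs):
        if self._is_open():
            raise RuntimeError("Circuit open: database marked unavailable, failing fast")
        try:
            result = await coro_func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result

db_breaker = CircuitBreaker()

# Usage: wrap the retrying executor so repeated failures trip the breaker
# result = await db_breaker.call(execute_with_retry, session, query)
```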
Also enable pool_pre_ping=True to detect stale connections before they are handed out (~1ms overhead per checkout).
6. 📊 Architecture decision matrix & final checklist
| Scenario | Pool class | prepared_statement_cache_size | JIT | PgBouncer mode |
|---|---|---|---|---|
| Direct PostgreSQL (few workers) | QueuePool | default (1024) | on (or off for OLTP) | N/A |
| High concurrency / many pods | NullPool ✅ | 0 ✅ | off ✅ | transaction pooling |
| Serverless / Lambda | NullPool | 0 | off | transaction pooling |
Analogy: max_connections = your total employees. Each worker branch needs permanent + temporary staff, and total demand must stay ≤ capacity.
- Know your database's max_connections (ask your DBA)
- Reserve 10-20% for admin tasks
- Calculate: max_connections - reserved ≥ workers × (pool_size + max_overflow), as in the sketch after this checklist
- With PgBouncer: use NullPool, prepared_statement_cache_size=0, jit=off
- Enable pool_pre_ping=True and timeouts
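To automate that calculation, a hedged startup check (worker count, pool settings, and the DSN are illustrative assumptions):

```python
import asyncio

import asyncpg

WORKERS = 4            # e.g. Uvicorn/Gunicorn workers (assumption)
POOL_SIZE = 5
MAX_OVERFLOW = 10
RESERVED = 10          # connections kept free for admin/superuser tasks

async def check_connection_budget(dsn: str):
    conn = await asyncpg.connect(dsn)
    try:
        max_connections = int(await conn.fetchval("SHOW max_connections"))
    finally:
        await conn.close()

    demand = WORKERS * (POOL_SIZE + MAX_OVERFLOW)
    budget = max_connections - RESERVED
    assert demand <= budget, (
        f"Connection demand {demand} exceeds budget {budget} "
        f"(max_connections={max_connections}, reserved={RESERVED})"
    )

asyncio.run(check_connection_budget("postgresql://user:pass@localhost/db"))
```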
📖 Quick glossary
💡 Feedback welcome: If anything is incorrect, please let me know — I’m here to learn and improve.
🔗 Full deep dive (PDF + diagrams): Google Drive link