Scaling & Multi-Instance¶
A single 4-vCPU / 8 GB Doable host comfortably serves 100–500 active users. Beyond that, scale horizontally.
What scales horizontally¶
| Service | Replicas | Notes |
|---|---|---|
| web (Next.js) | unlimited | Stateless. Put a load balancer in front. |
| api (Hono) | 2+ recommended | Stateless once Redis is enabled. Each replica still spawns its own Vite dev servers – see "Project affinity" below. |
| ws (WebSockets) | 1 (typical), N with sticky sessions | Yjs rooms are in-memory. Sticky sessions or a Y-redis adapter are needed to scale a single hot room across replicas. |
What does NOT scale horizontally (without work)¶
- Per-project Vite dev servers – they live on whatever API replica first touched the project. Use project → replica affinity at the load balancer (consistent hashing on `projectId`), as shown under "Project affinity for api" below.
- `PROJECTS_ROOT/` – must be shared across API replicas. Use NFS, EFS, or a shared block volume.
- PostgreSQL – scale vertically or use a managed service (Neon, RDS, Supabase).
Topology for 1000+ users¶
```
          ┌───────────────┐
          │ Load balancer │   (nginx/Cloudflare/ALB)
          └───┬───┬───┬───┘
              │   │   │
    ┌─────────┘   │   └─────────┐
    ▼             ▼             ▼
┌───────┐     ┌───────┐     ┌───────┐
│  web  │  …  │  web  │     │  web  │   stateless, scale freely
└───┬───┘     └───┬───┘     └───┬───┘
    └──────┬──────┴──────┬──────┘
           ▼             ▼
       ┌───────┐     ┌───────┐
       │  api  │  …  │  api  │   project-affine via consistent hash,
       └───┬───┘     └───┬───┘   shared PROJECTS_ROOT (NFS/EFS)
           │             │
           ▼             ▼
       ┌───────┐     ┌───────┐
       │  ws   │  …  │  ws   │   sticky sessions per project
       └───┬───┘     └───┬───┘
           └──────┬──────┘
                  ▼
          ┌───────────────┐
          │     Redis     │   shared rate-limit, sessions, presence
          └───────┬───────┘
                  │
                  ▼
          ┌───────────────┐
          │  PostgreSQL   │   single primary + read replica
          │   (managed)   │
          └───────────────┘
```
Enabling Redis¶
Set REDIS_URL in every replica's environment:
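A minimal sketch, assuming a shared `.env` consumed by every replica (the hostname and port are placeholders for your own Redis endpoint):

```bash
# Same value on every web, api, and ws replica
REDIS_URL=redis://redis.internal:6379
```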
What flips on:
- Rate limiting is now shared (`@doable/shared/kv-store` switches from memory to Redis).
- OAuth state survives across replicas.
- Auth nonces and short-lived tokens become cluster-wide.
Sticky sessions for ws¶
Use the load balancer's session-affinity feature on the cookie that y-websocket attaches, or on the URL path. Examples:
- nginx-ingress (k8s): cookie-based session affinity via annotations – see the sketch after this list.
- Cloudflare Load Balancer: enable session affinity by cookie.
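A sketch of the nginx-ingress annotations referenced above; the Ingress name, host, service name, port, and cookie name are all placeholders:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: doable-ws                                                   # placeholder
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/affinity-mode: "persistent"
    nginx.ingress.kubernetes.io/session-cookie-name: "ws-affinity"  # placeholder
    nginx.ingress.kubernetes.io/session-cookie-max-age: "3600"
spec:
  rules:
    - host: ws.example.com          # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ws            # placeholder service
                port:
                  number: 8080      # placeholder port
```

The `persistent` affinity mode keeps a client pinned to the same replica even as the replica set changes, which is what in-memory Yjs rooms need.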
For very large rooms, swap the in-memory Y.Doc store in services/ws for a Y-redis adapter.
Project affinity for api¶
The cleanest way is consistent hashing on the `projectId` URL segment. nginx example (replica addresses and ports are placeholders):

```nginx
# Map each request to its project UUID; non-project requests fall back to the client IP.
map $request_uri $upstream_target {
    default $remote_addr;
    ~^/(?:projects|preview|chat)/([0-9a-f-]{36}) $1;
}

upstream api_pool {
    hash $upstream_target consistent;
    server api-1.internal:3001;   # placeholder replica addresses
    server api-2.internal:3001;
}

server {
    location / { proxy_pass http://api_pool; }
}
```
This routes every request for the same project to the same API replica, so the project's Vite dev server is reused.
Shared filesystem¶
Mount the same directory at PROJECTS_ROOT on every API replica:
- EFS / FSx for Lustre (AWS)
- Filestore / Cloud NFS (GCP)
- NFSv4 (self-hosted, simple)
- OpenEBS / Longhorn (k8s, block storage with multi-attach)
Avoid object storage (S3) for project files – Vite needs a real POSIX filesystem.
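For the self-hosted NFSv4 option, a minimal sketch of the mount on each API replica (server, export, and mount point are placeholders):

```bash
# /etc/fstab on every API replica — all of them mount the same export
nfs.internal:/export/doable-projects  /srv/doable/projects  nfs4  rw,hard,_netdev  0 0
```

Then set `PROJECTS_ROOT=/srv/doable/projects` on each replica so they all resolve project files to the same directory.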
PostgreSQL¶
- Vertical: a 4 vCPU / 16 GB Postgres comfortably handles tens of thousands of registered users.
- Horizontal: a managed read replica plus a `DATABASE_REPLICA_URL` (you can wire one in by extending `packages/db/src/index.ts`) for analytics queries.
- Connection pool: keep `DATABASE_POOL_SIZE` × replicas ≤ Postgres `max_connections` (see the worked example below).
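The worked example mentioned above, with illustrative numbers rather than defaults from this repo:

```bash
# 4 api replicas × DATABASE_POOL_SIZE=20 → 80 connections,
# which stays below a Postgres max_connections of 100.
DATABASE_POOL_SIZE=20
```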
AI workers¶
If a single API replica's DoCorePool becomes the bottleneck:
- Run a dedicated Copilot CLI server (`copilot --server`) on its own host and point all API replicas at it via `COPILOT_CLI_URL` (see the sketch after this list).
- Or run multiple `DoCoreServer` instances (`packages/docore/src/docore-server.ts`) and load-balance among them.
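A sketch of the first option; the `--server` flag and `COPILOT_CLI_URL` variable come from this doc, while the host and port are placeholders:

```bash
# On the dedicated AI host
copilot --server

# In every API replica's environment
COPILOT_CLI_URL=http://ai-worker.internal:4000
```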
Observability¶
Wire the Tracer from @doable/docore to your OTel collector:
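The exact hook-up depends on the `@doable/docore` Tracer API, which isn't shown here; as a generic starting point, a standard OpenTelemetry Node SDK bootstrap that exports to an OTLP collector looks roughly like this (collector URL and service name are placeholders):

```ts
import { NodeSDK } from "@opentelemetry/sdk-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";

// Start the SDK before the api/ws services so their spans reach the collector.
const sdk = new NodeSDK({
  serviceName: "doable-api",
  traceExporter: new OTLPTraceExporter({
    url: "http://otel-collector.internal:4318/v1/traces",
  }),
});
sdk.start();
```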
Hono ships with timing middleware enabled – request times are reported in the `Server-Timing` response header.
When NOT to scale¶
If your active-user count is below 100 and your peak concurrent AI sessions stay under 20, do not add Redis, replicas, or shared filesystems. The single-VPS deployment is dramatically simpler to operate.