# PRD: Infrastructure & Deployment
**Scope:** Docker Compose, NATS, MinIO, PostgreSQL, nginx, service topology
**Status:** v1 | **Last updated:** 2026-02-17
---
## 1. Overview
Infrastructure configuration for dev, test, and production environments. All environments use the same backing services (PostgreSQL, NATS, MinIO) — only configuration differs.
## 2. Service Ports
### 2.1 Application Services (Dev)
| Service | Port | Protocol |
|---------|------|----------|
| Auth Gateway | 3000 | HTTP |
| API | 3001 | HTTP |
| Web SM | 3002 | HTTP |
| TUI SM | 3003 | HTTP |
### 2.2 Infrastructure Services (Dev)
| Service | Port(s) | Protocol |
|---------|---------|----------|
| PostgreSQL | 5432 | TCP |
| NATS | 4222 (client), 8222 (monitoring) | TCP, HTTP |
| MinIO | 9000 (API), 9001 (console) | HTTP |
### 2.3 Production
| Service | Port | Exposed |
|---------|------|---------|
| nginx | 80, 443 | External |
| Auth Gateway | 3000 | Internal only |
| API | 3001 | Internal only |
| Web SM | 3002 | Internal only |
| TUI SM | 3003 | Internal only |
| PostgreSQL | 5432 | Internal only |
| NATS | 4222 | Internal only |
| MinIO | 9000 | Internal only |
---
## 3. Docker Compose — Development
**File:** `docker-compose.dev.yml`
Runs infrastructure services only. Clojure services run locally via REPL.
```yaml
# docker-compose.dev.yml
services:
  postgres:
    image: postgres:16-alpine
    ports:
      - "5432:5432"
    environment:
      POSTGRES_DB: ajet_chat
      POSTGRES_USER: ajet
      POSTGRES_PASSWORD: ajet_dev
    volumes:
      - pgdata_dev:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ajet -d ajet_chat"]
      interval: 5s
      timeout: 3s
      retries: 5

  nats:
    image: nats:2.10-alpine
    ports:
      - "4222:4222"
      - "8222:8222"
    command: ["--js", "--sd", "/data", "-m", "8222"]
    volumes:
      - natsdata_dev:/data
    healthcheck:
      # Probe the HTTP monitoring endpoint (requires -m 8222)
      test: ["CMD", "wget", "-q", "--spider", "http://127.0.0.1:8222/healthz"]
      interval: 5s
      timeout: 3s
      retries: 5

  minio:
    image: minio/minio:latest
    ports:
      - "9000:9000"
      - "9001:9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    command: server /data --console-address ":9001"
    volumes:
      - miniodata_dev:/data
    healthcheck:
      test: ["CMD", "mc", "ready", "local"]
      interval: 5s
      timeout: 3s
      retries: 5

  minio-init:
    image: minio/mc:latest
    depends_on:
      minio:
        condition: service_healthy
    entrypoint: >
      /bin/sh -c "
      mc alias set local http://minio:9000 minioadmin minioadmin;
      mc mb --ignore-existing local/ajet-chat;
      "

volumes:
  pgdata_dev:
  natsdata_dev:
  miniodata_dev:
```
**Usage:**
```bash
docker compose -f docker-compose.dev.yml up -d
# Then start Clojure services via REPL
clj -A:dev:api:web-sm:tui-sm:auth-gw
```
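Starting the REPL before the backing services are ready produces confusing connection errors. The sketch below polls the Compose health status until a service reports `healthy`. `wait_healthy` and `health_of` are hypothetical helpers, not part of the repo, and the script assumes the health checks defined in the compose file above.

```bash
#!/bin/sh
# Sketch: block until a Compose service's container reports "healthy".
# wait_healthy / health_of are hypothetical helper names.

COMPOSE_FILE="${COMPOSE_FILE:-docker-compose.dev.yml}"

# health_of SERVICE: print the health status of the service's container.
health_of() {
  cid=$(docker compose -f "$COMPOSE_FILE" ps -q "$1")
  docker inspect --format '{{.State.Health.Status}}' "$cid"
}

# wait_healthy SERVICE [TRIES]: poll once per second, fail after TRIES attempts.
wait_healthy() {
  svc=$1
  tries=${2:-30}
  while [ "$tries" -gt 0 ]; do
    [ "$(health_of "$svc")" = "healthy" ] && return 0
    tries=$((tries - 1))
    sleep 1
  done
  echo "timed out waiting for $svc" >&2
  return 1
}
```

A typical call would be `for s in postgres nats minio; do wait_healthy "$s" || exit 1; done`. Recent Compose releases also accept `docker compose up -d --wait`, which may make a helper like this unnecessary.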
---
## 4. Docker Compose — Test
**File:** `docker-compose.test.yml`
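Each run starts with a fresh database: PostgreSQL and MinIO keep their data on `tmpfs`, so nothing survives a container restart. Host ports are offset from dev (5433, 4223, 9002, 3100) so both stacks can run side by side.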
```yaml
# docker-compose.test.yml
services:
  postgres-test:
    image: postgres:16-alpine
    ports:
      - "5433:5432"
    environment:
      POSTGRES_DB: ajet_chat_test
      POSTGRES_USER: ajet
      POSTGRES_PASSWORD: ajet_test
    tmpfs:
      - /var/lib/postgresql/data  # ephemeral: fresh DB each run
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ajet -d ajet_chat_test"]
      interval: 3s
      timeout: 2s
      retries: 10

  nats-test:
    image: nats:2.10-alpine
    ports:
      - "4223:4222"
    # JetStream enabled, no persistent storage; monitoring on 8222 for the health check
    command: ["--js", "-m", "8222"]
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://127.0.0.1:8222/healthz"]
      interval: 3s
      timeout: 2s
      retries: 10

  minio-test:
    image: minio/minio:latest
    ports:
      - "9002:9000"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    command: server /data
    tmpfs:
      - /data  # ephemeral: no persistent files
    healthcheck:
      test: ["CMD", "mc", "ready", "local"]
      interval: 3s
      timeout: 2s
      retries: 10

  minio-test-init:
    image: minio/mc:latest
    depends_on:
      minio-test:
        condition: service_healthy
    entrypoint: >
      /bin/sh -c "
      mc alias set local http://minio-test:9000 minioadmin minioadmin;
      mc mb --ignore-existing local/ajet-chat;
      "

  # E2E profile: adds application service containers
  auth-gw:
    profiles: ["e2e"]
    build:
      context: .
      dockerfile: auth-gw/Dockerfile
    ports:
      - "3100:3000"
    environment:
      AJET_DB_HOST: postgres-test
      AJET_DB_PORT: 5432
      AJET_DB_DBNAME: ajet_chat_test
      AJET_DB_USER: ajet
      AJET_DB_PASSWORD: ajet_test
      AJET_SERVICES_API_HOST: api
      AJET_SERVICES_WEB_SM_HOST: web-sm
      AJET_SERVICES_TUI_SM_HOST: tui-sm
    depends_on:
      postgres-test:
        condition: service_healthy

  api:
    profiles: ["e2e"]
    build:
      context: .
      dockerfile: api/Dockerfile
    environment:
      AJET_DB_HOST: postgres-test
      AJET_DB_PORT: 5432
      AJET_DB_DBNAME: ajet_chat_test
      AJET_DB_USER: ajet
      AJET_DB_PASSWORD: ajet_test
      AJET_NATS_URL: nats://nats-test:4222
      AJET_MINIO_ENDPOINT: http://minio-test:9000
    depends_on:
      postgres-test:
        condition: service_healthy
      nats-test:
        condition: service_healthy
      minio-test-init:
        condition: service_completed_successfully

  web-sm:
    profiles: ["e2e"]
    build:
      context: .
      dockerfile: web-sm/Dockerfile
    environment:
      AJET_API_BASE_URL: http://api:3001
      AJET_NATS_URL: nats://nats-test:4222
    depends_on:
      nats-test:
        condition: service_healthy

  tui-sm:
    profiles: ["e2e"]
    build:
      context: .
      dockerfile: tui-sm/Dockerfile
    environment:
      AJET_API_BASE_URL: http://api:3001
      AJET_NATS_URL: nats://nats-test:4222
    depends_on:
      nats-test:
        condition: service_healthy
```
**Usage:**
```bash
# Unit + integration tests
docker compose -f docker-compose.test.yml up -d
clj -M:test/unit
clj -M:test/integration
# E2E tests (full stack)
docker compose -f docker-compose.test.yml --profile e2e up -d --build
clj -M:test/e2e
# Teardown
docker compose -f docker-compose.test.yml --profile e2e down -v
```
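When the E2E commands are run by hand, a failing test run tends to leave the stack up. A minimal sketch of a wrapper that always tears down and still returns the test runner's exit status follows. `run_e2e` and `compose_e2e` are hypothetical names; the commands themselves are the ones listed above.

```bash
#!/bin/sh
# Sketch: run the E2E suite and always tear the stack down afterwards.
# run_e2e / compose_e2e are hypothetical helper names.

# compose_e2e ARGS...: docker compose against the test file with the e2e profile.
compose_e2e() {
  docker compose -f docker-compose.test.yml --profile e2e "$@"
}

# run_e2e: returns the first failing status (stack start or tests), else 0.
run_e2e() {
  status=0
  compose_e2e up -d --build && clj -M:test/e2e || status=$?
  compose_e2e down -v
  return "$status"
}
```

In CI this would be called as `run_e2e || exit 1`, so a red test run still frees the ports and tmpfs mounts.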
---
## 5. Docker Compose — Production
**File:** `docker-compose.yml`
Full stack with nginx TLS termination.
```yaml
# docker-compose.yml
services:
  nginx:
    image: nginx:1.27-alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/certs:/etc/nginx/certs:ro
    depends_on:
      - auth-gw
    restart: unless-stopped

  auth-gw:
    build:
      context: .
      dockerfile: auth-gw/Dockerfile
    environment:
      AJET_DB_HOST: postgres
      AJET_DB_PASSWORD: ${AJET_DB_PASSWORD}
      AJET_OAUTH_GITHUB_CLIENT_ID: ${GITHUB_CLIENT_ID}
      AJET_OAUTH_GITHUB_CLIENT_SECRET: ${GITHUB_CLIENT_SECRET}
      AJET_SERVICES_API_HOST: api
      AJET_SERVICES_WEB_SM_HOST: web-sm
      AJET_SERVICES_TUI_SM_HOST: tui-sm
    depends_on:
      postgres:
        condition: service_healthy
    restart: unless-stopped

  api:
    build:
      context: .
      dockerfile: api/Dockerfile
    environment:
      AJET_DB_HOST: postgres
      AJET_DB_PASSWORD: ${AJET_DB_PASSWORD}
      AJET_NATS_URL: nats://nats:4222
      AJET_MINIO_ENDPOINT: http://minio:9000
      AJET_MINIO_ACCESS_KEY: ${MINIO_ACCESS_KEY}
      AJET_MINIO_SECRET_KEY: ${MINIO_SECRET_KEY}
    depends_on:
      postgres:
        condition: service_healthy
      nats:
        condition: service_healthy
      minio:
        condition: service_healthy
    restart: unless-stopped

  web-sm:
    build:
      context: .
      dockerfile: web-sm/Dockerfile
    environment:
      AJET_API_BASE_URL: http://api:3001
      AJET_NATS_URL: nats://nats:4222
    depends_on:
      nats:
        condition: service_healthy
    restart: unless-stopped

  tui-sm:
    build:
      context: .
      dockerfile: tui-sm/Dockerfile
    environment:
      AJET_API_BASE_URL: http://api:3001
      AJET_NATS_URL: nats://nats:4222
    depends_on:
      nats:
        condition: service_healthy
    restart: unless-stopped

  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: ajet_chat
      POSTGRES_USER: ajet
      POSTGRES_PASSWORD: ${AJET_DB_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ajet -d ajet_chat"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  nats:
    image: nats:2.10-alpine
    command: ["--js", "--sd", "/data", "-m", "8222"]
    volumes:
      - natsdata:/data
    healthcheck:
      # Probe the HTTP monitoring endpoint (requires -m 8222)
      test: ["CMD", "wget", "-q", "--spider", "http://127.0.0.1:8222/healthz"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  minio:
    image: minio/minio:latest
    environment:
      MINIO_ROOT_USER: ${MINIO_ACCESS_KEY}
      MINIO_ROOT_PASSWORD: ${MINIO_SECRET_KEY}
    command: server /data
    volumes:
      - miniodata:/data
    healthcheck:
      test: ["CMD", "mc", "ready", "local"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

volumes:
  pgdata:
  natsdata:
  miniodata:
```
**Environment file (`.env`):**
```
AJET_DB_PASSWORD=<strong-random-password>
GITHUB_CLIENT_ID=<from-github-oauth-app> # seed only — migrated to DB on first start
GITHUB_CLIENT_SECRET=<from-github-oauth-app> # seed only — migrated to DB on first start
MINIO_ACCESS_KEY=<random-access-key>
MINIO_SECRET_KEY=<random-secret-key>
```
**Note:** OAuth credentials in `.env` are auto-migrated to the `oauth_providers` DB table on first startup. After that, manage OAuth providers via the admin setup wizard or the admin API. You can also skip these env vars entirely and configure providers through the setup wizard on first deployment.
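The required secrets can be generated rather than typed by hand. The following is a minimal sketch, assuming `/dev/urandom` is available; `gen_secret` and `write_env` are hypothetical helpers, and the byte lengths are illustrative choices rather than project requirements. The optional OAuth seed variables are left out, since providers can be configured later through the setup wizard.

```bash
#!/bin/sh
# Sketch: create a .env file with random secrets (helper names are hypothetical).

# gen_secret BYTES: print BYTES random bytes as lowercase hex (2 chars per byte).
gen_secret() {
  head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# write_env FILE: write fresh secrets to FILE, refusing to overwrite an existing one.
write_env() {
  [ -e "$1" ] && { echo "$1 already exists" >&2; return 1; }
  (
    umask 077  # keep the file readable by the owner only
    {
      echo "AJET_DB_PASSWORD=$(gen_secret 24)"
      echo "MINIO_ACCESS_KEY=$(gen_secret 10)"
      echo "MINIO_SECRET_KEY=$(gen_secret 24)"
    } > "$1"
  )
}
```

Running `write_env .env` once before the first `docker compose up` would be enough. Hex output avoids characters that need quoting in a `.env` file.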
---
## 6. nginx Configuration (Production)
**File:** `nginx/nginx.conf`
```nginx
worker_processes auto;

events {
    worker_connections 4096;
}

http {
    # Send "Connection: upgrade" upstream only when the client asked for an upgrade
    map $http_upgrade $connection_upgrade {
        default upgrade;
        ''      close;
    }

    upstream auth_gw {
        server auth-gw:3000;
    }

    # Redirect HTTP → HTTPS
    server {
        listen 80;
        server_name chat.example.com;
        return 301 https://$host$request_uri;
    }

    server {
        listen 443 ssl;
        http2 on;  # the "http2" listen parameter is deprecated since nginx 1.25.1
        server_name chat.example.com;

        ssl_certificate     /etc/nginx/certs/fullchain.pem;
        ssl_certificate_key /etc/nginx/certs/privkey.pem;
        ssl_protocols       TLSv1.2 TLSv1.3;

        # Proxy all traffic to Auth Gateway
        location / {
            proxy_pass http://auth_gw;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # SSE support: disable buffering
            proxy_buffering off;
            proxy_cache off;
            proxy_read_timeout 86400s;  # 24h for SSE connections
            proxy_send_timeout 86400s;

            # WebSocket support (future, for voice)
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection $connection_upgrade;
        }

        # Request body (upload) size limit
        client_max_body_size 10m;
    }
}
```
---
## 7. NATS JetStream Configuration
### 7.1 Stream Setup
Created automatically by the API service on startup:
```clojure
;; Stream: ajet-events
{:name      "ajet-events"
 :subjects  ["chat.events.>"
             "chat.dm.>"
             "chat.typing.>"
             "chat.presence.>"
             "chat.notifications.>"
             "chat.audit"]
 :retention :limits           ;; retain by limits (not interest or work-queue)
 :max-age   86400000000000    ;; 24 hours in nanoseconds
 :max-bytes 1073741824        ;; 1 GiB
 :storage   :file
 :replicas  1                 ;; single node for v1
 :discard   :old}             ;; discard oldest when limits hit
```
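The two numeric limits are easy to mistype, so it helps to derive them instead of copying the literals. A small sketch of the arithmetic, with hypothetical helper names; it relies on 64-bit shell arithmetic, which is the norm on 64-bit Linux.

```bash
#!/bin/sh
# Sketch: derive the JetStream limits used above (helper names are hypothetical).

# hours_to_nanos HOURS: max-age is expressed in nanoseconds.
hours_to_nanos() {
  echo $(( $1 * 60 * 60 * 1000000000 ))
}

# gib_to_bytes GIB: max-bytes is expressed in bytes (1 GiB = 1024^3).
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}
```

`hours_to_nanos 24` yields `86400000000000` and `gib_to_bytes 1` yields `1073741824`, matching `:max-age` and `:max-bytes` in the stream definition.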
### 7.2 Consumer Setup
Each session manager creates one durable consumer per connected user when that user's SSE connection opens:
```clojure
;; Consumer: per-user, created by SM on SSE connection
{:durable-name    "sm-{service}-{user-id}"   ;; e.g. "sm-web-abc123"
 :filter-subjects ["chat.events.{community-id}"
                   "chat.dm.{channel-id-1}"
                   "chat.dm.{channel-id-2}"
                   "chat.notifications.{user-id}"]
 :deliver-policy  :by-start-time             ;; on reconnect: from last-event-id time
 :ack-policy      :none}                     ;; no ack needed for real-time delivery
```
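The placeholders in the consumer definition expand mechanically from the connecting user's identifiers. The sketch below shows that expansion in shell for illustration only, since the real services are written in Clojure. `durable_name` and `filter_subjects` are hypothetical names. The validation reflects NATS naming rules, which disallow whitespace, `.`, `*`, `>` and path separators in consumer names.

```bash
#!/bin/sh
# Sketch: expand the consumer template for one user (helper names are hypothetical).

# durable_name SERVICE USER_ID: print "sm-SERVICE-USER_ID", rejecting
# characters that NATS does not allow in consumer names.
durable_name() {
  name="sm-$1-$2"
  case "$name" in
    *.*|*\**|*\>*|*" "*|*/*)
      echo "invalid durable name: $name" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$name"
}

# filter_subjects COMMUNITY_ID USER_ID [DM_CHANNEL_ID...]: one subject per line.
filter_subjects() {
  community=$1
  user=$2
  shift 2
  printf 'chat.events.%s\n' "$community"
  for ch in "$@"; do
    printf 'chat.dm.%s\n' "$ch"
  done
  printf 'chat.notifications.%s\n' "$user"
}
```

For example, `durable_name web abc123` prints `sm-web-abc123`, and `filter_subjects c1 abc123 d1 d2` prints the four subjects in the same order as the template.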
### 7.3 Subject ACLs (Future)
Not needed for v1 (single NATS instance, trusted services). For multi-tenant production:
- API: publish to all subjects
- SMs: subscribe only to subjects relevant to their connected users
- No direct client access to NATS (all mediated by SMs)
---
## 8. MinIO Configuration
### 8.1 Bucket Setup
Single bucket `ajet-chat` with:
- Default retention: none (files kept indefinitely)
- Versioning: disabled (no need for file history in v1)
### 8.2 Storage Key Convention
```
attachments/{message-uuid}/{filename} — message attachments
avatars/users/{user-uuid}/{filename} — user avatars
avatars/communities/{community-uuid}/{filename} — community icons
avatars/webhooks/{webhook-uuid}/{filename} — webhook bot icons
```
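Because `{filename}` comes from the client, the key builder should strip any directory component so a key cannot escape its prefix. A minimal sketch of the convention, with hypothetical helper names; the sanitization rule is an assumption, not something the spec mandates.

```bash
#!/bin/sh
# Sketch: build storage keys following the convention above (names are hypothetical).

# safe_name FILENAME: keep only the last path component; reject empty, "." and "..".
safe_name() {
  name=${1##*/}
  case "$name" in
    ""|.|..)
      echo "invalid filename: $1" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$name"
}

# attachment_key MESSAGE_UUID FILENAME
attachment_key() {
  name=$(safe_name "$2") || return 1
  printf 'attachments/%s/%s\n' "$1" "$name"
}

# user_avatar_key USER_UUID FILENAME
user_avatar_key() {
  name=$(safe_name "$2") || return 1
  printf 'avatars/users/%s/%s\n' "$1" "$name"
}
```

With this, `attachment_key 123e4567 ../../etc/passwd` produces `attachments/123e4567/passwd` rather than a key that walks out of the message's prefix.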
### 8.3 Access Patterns
- **Upload:** API service writes to MinIO on file upload
- **Download:** Auth GW proxies `/files/*` requests to MinIO (or API generates presigned URLs)
- **No direct client access** to MinIO in v1 — all through API/Auth GW
---
## 9. PostgreSQL Configuration
### 9.1 Recommended Settings (Production)
```
# Example values sized for a host with 1 GB of RAM
shared_buffers = 256MB              # ~25% of RAM
work_mem = 16MB                     # per sort/hash operation, not per connection
effective_cache_size = 768MB        # ~75% of RAM
maintenance_work_mem = 128MB
max_connections = 100
log_min_duration_statement = 500    # log queries slower than 500 ms
```
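The 25% and 75% rules of thumb scale with the host's memory. A small sketch that derives both values from a RAM size in megabytes; `pg_memory_settings` is a hypothetical helper.

```bash
#!/bin/sh
# Sketch: derive the RAM-proportional settings (helper name is hypothetical).

# pg_memory_settings RAM_MB: print shared_buffers (25%) and effective_cache_size (75%).
pg_memory_settings() {
  ram_mb=$1
  printf 'shared_buffers = %dMB\n' $(( ram_mb / 4 ))
  printf 'effective_cache_size = %dMB\n' $(( ram_mb * 3 / 4 ))
}
```

`pg_memory_settings 1024` reproduces the 256MB and 768MB values listed above; for a 4 GB host, `pg_memory_settings 4096` gives 1024MB and 3072MB.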
### 9.2 Backups
- **pg_dump** daily cron job (simple, sufficient for v1)
- Dump to local volume, rotate 7 days
- Future: WAL archiving for point-in-time recovery
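
The daily dump and 7-day rotation can be sketched as a small script for cron. The backup directory, file naming and function names are assumptions made for illustration. The script expects `pg_dump` on the PATH and the password supplied through `PGPASSWORD` or `~/.pgpass`.

```bash
#!/bin/sh
# Sketch: daily pg_dump with 7-day rotation (paths and names are hypothetical).

BACKUP_DIR="${BACKUP_DIR:-/var/backups/ajet-chat}"

# backup_db: write a custom-format dump named after today's date.
# The dump goes to a temp file first so a failed run never leaves a partial backup.
backup_db() {
  out="$BACKUP_DIR/ajet_chat_$(date +%Y%m%d).dump"
  mkdir -p "$BACKUP_DIR" || return 1
  pg_dump -h "${AJET_DB_HOST:-localhost}" -U "${AJET_DB_USER:-ajet}" \
    -Fc -f "$out.tmp" "${AJET_DB_DBNAME:-ajet_chat}" || return 1
  mv "$out.tmp" "$out"
}

# rotate_backups: delete dumps last modified more than 7 days ago.
rotate_backups() {
  find "$BACKUP_DIR" -name 'ajet_chat_*.dump' -type f -mtime +7 -exec rm -f {} +
}
```

A crontab entry such as `15 3 * * * /usr/local/bin/ajet-backup.sh` (hypothetical path) would call `backup_db && rotate_backups`. Custom-format dumps are restored with `pg_restore`.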
---
## 10. Dockerfile Template
Each service uses the same multi-stage build:
```dockerfile
# {service}/Dockerfile
FROM clojure:temurin-21-tools-deps-1.12 AS builder
WORKDIR /app
# Assumes build.clj sits next to deps.edn; the :build alias loads it via :ns-default
COPY deps.edn build.clj ./
COPY shared/ shared/
COPY {service}/ {service}/
RUN clj -T:build uber :module {service}

FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
COPY --from=builder /app/target/{service}.jar app.jar
EXPOSE {port}
CMD ["java", "-jar", "app.jar"]
```
**Build alias** (root `deps.edn`):
```clojure
:build {:deps {io.github.clojure/tools.build {:mvn/version "0.10.6"}}
        :ns-default build}
```
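The `{service}` and `{port}` placeholders in the template are filled in per service. One way to do that, assuming the template is kept as a single file and rendered with `sed`, is sketched below. `port_for` and `render_dockerfile` are hypothetical helpers, and the ports are the ones from the service table in section 2.1.

```bash
#!/bin/sh
# Sketch: render the Dockerfile template for one service (names are hypothetical).

# port_for SERVICE: print the service's port as listed in section 2.1.
port_for() {
  case "$1" in
    auth-gw) echo 3000 ;;
    api)     echo 3001 ;;
    web-sm)  echo 3002 ;;
    tui-sm)  echo 3003 ;;
    *) echo "unknown service: $1" >&2; return 1 ;;
  esac
}

# render_dockerfile SERVICE: read the template on stdin, write the result to stdout.
render_dockerfile() {
  port=$(port_for "$1") || return 1
  sed -e "s/{service}/$1/g" -e "s/{port}/$port/g"
}
```

For instance, `render_dockerfile api < Dockerfile.template > api/Dockerfile` would produce a file with `EXPOSE 3001` and `COPY api/ api/`; `Dockerfile.template` is an assumed file name.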
---
## 11. Environment Variables Reference
### 11.1 All Services
| Variable | Default | Description |
|----------|---------|-------------|
| `AJET_DB_HOST` | `localhost` | PostgreSQL host |
| `AJET_DB_PORT` | `5432` | PostgreSQL port |
| `AJET_DB_DBNAME` | `ajet_chat` | Database name |
| `AJET_DB_USER` | `ajet` | Database user |
| `AJET_DB_PASSWORD` | — | Database password (required) |
| `AJET_NATS_URL` | `nats://localhost:4222` | NATS server URL |
| `AJET_MINIO_ENDPOINT` | `http://localhost:9000` | MinIO endpoint |
| `AJET_MINIO_ACCESS_KEY` | `minioadmin` | MinIO access key |
| `AJET_MINIO_SECRET_KEY` | `minioadmin` | MinIO secret key |
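
The defaults in this table, together with the one required variable, can be expressed as a short pre-flight check. This is a sketch only: `load_db_env` and `jdbc_url` are hypothetical helpers, and the JDBC URL form assumes the services connect through the PostgreSQL JDBC driver.

```bash
#!/bin/sh
# Sketch: apply the documented defaults and fail fast on the required password.
# load_db_env / jdbc_url are hypothetical helper names.

load_db_env() {
  : "${AJET_DB_HOST:=localhost}"
  : "${AJET_DB_PORT:=5432}"
  : "${AJET_DB_DBNAME:=ajet_chat}"
  : "${AJET_DB_USER:=ajet}"
  if [ -z "${AJET_DB_PASSWORD:-}" ]; then
    echo "AJET_DB_PASSWORD is required" >&2
    return 1
  fi
  export AJET_DB_HOST AJET_DB_PORT AJET_DB_DBNAME AJET_DB_USER AJET_DB_PASSWORD
}

# jdbc_url: print the connection URL built from the resolved settings.
jdbc_url() {
  printf 'jdbc:postgresql://%s:%s/%s\n' "$AJET_DB_HOST" "$AJET_DB_PORT" "$AJET_DB_DBNAME"
}
```

Calling `load_db_env || exit 1` at the top of a start script surfaces a missing password before the JVM boots, which is usually quicker to diagnose than a failed connection at runtime.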
### 11.2 Auth Gateway Only
| Variable | Default | Description |
|----------|---------|-------------|
| `AJET_OAUTH_GITHUB_CLIENT_ID` | — | GitHub OAuth app client ID (seed only — auto-migrated to DB) |
| `AJET_OAUTH_GITHUB_CLIENT_SECRET` | — | GitHub OAuth app client secret (seed only — auto-migrated to DB) |
| `AJET_OAUTH_GITEA_CLIENT_ID` | — | Gitea OAuth client ID (seed only — auto-migrated to DB) |
| `AJET_OAUTH_GITEA_CLIENT_SECRET` | — | Gitea OAuth client secret (seed only — auto-migrated to DB) |
| `AJET_OAUTH_GITEA_BASE_URL` | — | Gitea instance URL (seed only — auto-migrated to DB) |
| `AJET_OAUTH_OIDC_CLIENT_ID` | — | OIDC client ID (seed only — auto-migrated to DB) |
| `AJET_OAUTH_OIDC_CLIENT_SECRET` | — | OIDC client secret (seed only — auto-migrated to DB) |
| `AJET_OAUTH_OIDC_ISSUER_URL` | — | OIDC issuer URL (seed only — auto-migrated to DB) |
| `AJET_SERVICES_API_HOST` | `localhost` | API service host |
| `AJET_SERVICES_API_PORT` | `3001` | API service port |
| `AJET_SERVICES_WEB_SM_HOST` | `localhost` | Web SM host |
| `AJET_SERVICES_WEB_SM_PORT` | `3002` | Web SM port |
| `AJET_SERVICES_TUI_SM_HOST` | `localhost` | TUI SM host |
| `AJET_SERVICES_TUI_SM_PORT` | `3003` | TUI SM port |
**Note on OAuth env vars:** The `AJET_OAUTH_*` environment variables serve as a seed for initial deployment only. On first startup, if the `oauth_providers` DB table is empty, Auth GW auto-migrates these values into the table. After migration, providers are managed exclusively via the admin API (`/api/admin/oauth-providers`) or the setup wizard. The env vars are ignored once providers exist in the DB.
### 11.3 Session Managers Only
| Variable | Default | Description |
|----------|---------|-------------|
| `AJET_API_BASE_URL` | `http://localhost:3001` | Internal API base URL |
---
## 12. Test Cases
| ID | Test | Type | Description |
|----|------|------|-------------|
| INFRA-T1 | Dev compose starts | Integration | `docker compose -f docker-compose.dev.yml up` starts PG + NATS + MinIO |
| INFRA-T2 | PG accessible | Integration | Can connect to Postgres on port 5432 |
| INFRA-T3 | NATS accessible | Integration | Can connect to NATS on port 4222 |
| INFRA-T4 | MinIO accessible | Integration | Can connect to MinIO on port 9000 |
| INFRA-T5 | MinIO bucket created | Integration | `ajet-chat` bucket exists after `minio-init` |
| INFRA-T6 | Test compose fresh DB | Integration | Test DB has no data from previous runs |
| INFRA-T7 | JetStream enabled | Integration | NATS JetStream API responds to stream info request |
| INFRA-T8 | Health checks pass | Integration | All service health checks return healthy |
| INFRA-T9 | Prod compose full stack | E2E | All services start and connect to each other |
| INFRA-T10 | nginx proxies to auth-gw | E2E | HTTPS request reaches Auth Gateway via nginx |