# PRD: Auth Gateway
**Module:** `auth-gw/` | **Namespace:** `ajet.chat.auth-gw.*`
**Status:** v1 | **Last updated:** 2026-02-17
---
## 1. Overview
The Auth Gateway is the single edge entry point for all client traffic. It terminates sessions, validates tokens, and reverse-proxies authenticated requests to internal services (API, Web SM, TUI SM). It also handles OAuth login flows and session management.
## 2. Architecture
```
Client → (nginx TLS, prod) → Auth Gateway → API Service
                                          → Web Session Manager
                                          → TUI Session Manager
```
**Auth GW has direct PG access** for session/token table lookups — this avoids a round-trip to the API for every request.
## 3. Route Table
| Path Pattern | Target | Auth Required | Description |
|--------------|--------|---------------|-------------|
| `GET /` | Web SM | Session | Web app root |
| `GET /app/*` | Web SM | Session | Web app pages |
| `GET /sse/*` | Web SM | Session | SSE streams for web |
| `POST /web/*` | Web SM | Session | Web form submissions / Datastar signals |
| `GET,POST /api/*` | API | Session or API Token | REST API |
| `GET /tui/sse/*` | TUI SM | Session | SSE streams for TUI clients |
| `POST /tui/*` | TUI SM | Session | TUI client signals |
| `POST /api/webhooks/*/incoming` | API | Webhook Token | Incoming webhooks (bypass session auth) |
| `GET /auth/login` | Self | None | Login page (OAuth provider buttons) |
| `GET /auth/callback/:provider` | Self | None | OAuth callback |
| `POST /auth/logout` | Self | Session | Logout (destroy session) |
| `GET /setup/providers` | Self | None | Setup wizard step 1: configure OAuth providers |
| `POST /setup/providers` | Self | None | Setup wizard step 1: save OAuth provider config |
| `GET /setup/community` | Self | Session | Setup wizard step 3: create first community |
| `POST /setup/community` | Self | Session | Setup wizard step 3: submit community creation |
| `GET /invite/:code` | Self | None | Invite landing page → redirect to login if needed |
| `GET /health` | Self | None | Health check |
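
The proxied rows of the table can be read as an ordered match. The sketch below is one possible shape, not the gateway's actual router: the function name and the returned keywords are hypothetical, and the `Self` routes are left out. It checks the webhook pattern before the general `/api/*` rule, because both match the same path prefix.

```clojure
(ns sketch.routes
  (:require [clojure.string :as str]))

(defn route-target
  "Maps a request method and path to [target auth-kind], or nil when no
  proxied route matches. Keywords are illustrative, not from the PRD."
  [method uri]
  (cond
    ;; Must come before the /api/ rule: webhooks bypass session auth.
    (and (= method :post)
         (re-matches #"/api/webhooks/[^/]+/incoming" uri))
    [:api :webhook-token]

    (and (#{:get :post} method) (str/starts-with? uri "/api/"))
    [:api :session-or-api-token]

    (and (= method :get) (str/starts-with? uri "/tui/sse/"))
    [:tui-sm :session]

    (and (= method :post) (str/starts-with? uri "/tui/"))
    [:tui-sm :session]

    (and (= method :get)
         (or (= uri "/")
             (str/starts-with? uri "/app/")
             (str/starts-with? uri "/sse/")))
    [:web-sm :session]

    (and (= method :post) (str/starts-with? uri "/web/"))
    [:web-sm :session]

    :else nil))
```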
## 4. Authentication Flows
### 4.1 Session Token Validation
Every authenticated request follows this flow:
```
1. Extract token from Cookie: ajet_session=<base64url-token>
2. bcrypt-verify token against sessions.token_hash
3. Check sessions.expires_at > now
4. If valid:
   a. Extend session TTL (rolling expiry) — async, don't block request
   b. Inject headers: X-User-Id, X-User-Role, X-Community-Id, X-Trace-Id
   c. Proxy to target service
5. If invalid/expired: redirect to /auth/login (web) or 401 (API/TUI)
```
**Rolling expiry:** Each valid request extends `expires_at` by the session TTL (default: 30 days). This is done asynchronously to avoid adding latency.
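
As a rough sketch of steps 3 to 5, the expiry check and the rolling extension could look like the following. It assumes the session row is a map with a hypothetical `:expires-at` key holding a `java.time.Instant`, reads "extends by the session TTL" as `now + TTL`, and picks 302 as the redirect status; none of these details are fixed by this document.

```clojure
(ns sketch.session
  (:import (java.time Duration Instant)))

;; Default TTL from this section: 30 days.
(def session-ttl (Duration/ofDays 30))

(defn session-live?
  "True when a session row exists and expires strictly after `now`."
  [session ^Instant now]
  (boolean (and session
                (.isAfter ^Instant (:expires-at session) now))))

(defn rolled-expiry
  "New expires_at after a valid request (assumed to be now + TTL)."
  [^Instant now]
  (.plus now session-ttl))

(defn on-invalid
  "Web requests go to the login page; API and TUI requests get a 401."
  [client-kind]
  (if (= client-kind :web)
    {:status 302 :headers {"Location" "/auth/login"}}
    {:status 401 :headers {}}))
```

In a real handler the write of `rolled-expiry` would be dispatched off the request thread, as the rolling-expiry note requires.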
**Token format:** 32 random bytes, base64url-encoded without padding (43 characters). Only the bcrypt hash of the token is stored.
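
A token of this shape can be produced with JDK classes alone. This is a minimal sketch; the function name is hypothetical, and the bcrypt hashing step is omitted because it depends on a third-party library this document does not name.

```clojure
(ns sketch.token
  (:import (java.security SecureRandom)
           (java.util Base64)))

(def ^:private ^SecureRandom rng (SecureRandom.))

(defn new-session-token
  "Returns 32 random bytes as unpadded base64url text."
  []
  (let [buf (byte-array 32)]
    (.nextBytes rng buf)
    (.encodeToString (.withoutPadding (Base64/getUrlEncoder)) buf)))
```

32 bytes encode to 43 characters once the single trailing `=` is dropped, which matches the length stated above.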
### 4.2 API Token Validation
For `Authorization: Bearer <token>` requests to `/api/*`:
```
1. Extract token from Authorization header
2. bcrypt-verify against api_tokens.token_hash
3. Check api_tokens.expires_at > now (if set)
4. Check scopes allow the requested operation
5. If valid: inject X-User-Id (api_user's owner), X-Trace-Id
6. If the token is invalid or expired: 401; if its scopes do not allow the operation: 403
```
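
Steps 1 and 4 can be sketched as two small pure functions. The scope names in the usage below are made up for illustration, and the function names are hypothetical; the document does not define a scope vocabulary.

```clojure
(ns sketch.api-token
  (:require [clojure.string :as str]))

(defn bearer-token
  "Extracts the token from an Authorization header value, or returns nil
  when the header is missing or is not a Bearer credential."
  [header]
  (when header
    (second (re-matches #"(?i)Bearer\s+(\S+)" (str/trim header)))))

(defn scope-allows?
  "True when `required` is among the scopes granted to the token."
  [granted required]
  (contains? (set granted) required))
```

For example, `(bearer-token "Bearer abc123")` yields `"abc123"`, while a `Basic` credential yields `nil` and falls through to the 401 branch.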
### 4.3 OAuth Login Flow
**Supported providers:** GitHub, Gitea, Generic OIDC
**Provider storage:** OAuth providers are stored in the `oauth_providers` DB table and are configurable at runtime via admin endpoints (`/api/admin/oauth-providers`). On first startup, if the `oauth_providers` table is empty, any providers configured via env vars (`:oauth` config) are auto-migrated to the DB.
```
1. User visits /auth/login
2. Auth GW loads enabled providers from oauth_providers table
3. Page shows provider buttons for each enabled provider
4. User clicks provider → redirect to provider's authorize URL
5. Provider redirects to /auth/callback/:provider with code
6. Auth GW exchanges code for access token
7. Auth GW fetches user profile from provider
8. Look up oauth_accounts by (provider, provider_user_id):
   a. EXISTS: load user, create session
   b. NOT EXISTS: create user + oauth_account, create session
9. Set session cookie, redirect to / (or to pending invite if present)
```
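
Step 8 is a find-or-create keyed on `(provider, provider_user_id)`. The sketch below models it against a plain map standing in for the `users` and `oauth_accounts` tables, so it runs without a database; the map layout and function name are assumptions made for illustration only.

```clojure
(ns sketch.oauth-login
  (:import (java.util UUID)))

(defn find-or-create-user
  "Returns [db' user-id]. `db` is an in-memory stand-in with the keys
  :users and :oauth-accounts; a real implementation would do this in a
  single PostgreSQL transaction."
  [db provider provider-user-id profile]
  (let [k [provider provider-user-id]]
    (if-let [user-id (get-in db [:oauth-accounts k])]
      ;; 8a: account already linked, reuse the user
      [db user-id]
      ;; 8b: first login, create user and link the account
      (let [user-id (UUID/randomUUID)]
        [(-> db
             (assoc-in [:users user-id] profile)
             (assoc-in [:oauth-accounts k] user-id))
         user-id]))))
```

Session creation and the cookie in step 9 follow in both branches.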
**OAuth config shape (fallback — auto-migrated to DB on first startup):**
```clojure
{:oauth
 {:github {:client-id "..." :client-secret "..." :enabled true}
  :gitea  {:client-id "..." :client-secret "..." :base-url "https://gitea.example.com" :enabled true}
  :oidc   {:client-id "..." :client-secret "..." :issuer-url "https://auth.example.com" :enabled false}}}
```
After migration, provider config is read exclusively from the DB. The `:oauth` config key serves only as a seed for initial deployment and is ignored once providers exist in the DB.
**Generic OIDC:** Uses `.well-known/openid-configuration` discovery. Requires `openid`, `profile`, `email` scopes.
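
For illustration, the discovery document URL can be derived from the configured `:issuer-url` as sketched below. The function and var names are hypothetical; fetching and parsing the document is left out.

```clojure
(ns sketch.oidc
  (:require [clojure.string :as str]))

;; Scopes this section requires, space-separated as sent in the
;; authorization request.
(def required-scopes "openid profile email")

(defn discovery-url
  "Appends the well-known discovery path to the issuer URL, tolerating
  a trailing slash on the issuer."
  [issuer-url]
  (str (str/replace issuer-url #"/+$" "")
       "/.well-known/openid-configuration"))
```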
### 4.4 Admin Setup Wizard (First-Deployment Bootstrap)
The setup wizard is a multi-step flow handled by Auth GW for first-time deployment. It activates when the `system_settings` table indicates setup is incomplete (no `setup_completed` flag).
```
Step 1 — Configure OAuth Providers (no auth required):
  1. User hits any route on a fresh deployment
  2. Auth GW checks system_settings: setup_completed?
     - Not completed: redirect to /setup/providers
     - Completed: normal login flow
  3. /setup/providers shows a form to configure at least one OAuth provider
     (provider type, client ID, client secret, base URL / issuer URL)
  4. Admin submits provider config → saved to oauth_providers table
  5. Auth GW redirects to /auth/login with the newly configured provider(s)
Step 2 — Admin Login via OAuth:
  6. Admin logs in via one of the just-configured OAuth providers
  7. First user is created with admin/owner privileges
Step 3 — Create First Community:
  8. After login, redirect to /setup/community (rendered by Auth GW, not Web SM)
  9. Admin enters community name, slug auto-generated
  10. POST creates community (user becomes owner, #general created)
  11. system_settings.setup_completed = true
  12. Redirect to /app
```
Subsequent community creation (by already-authenticated users) uses the Web SM `/setup` page.
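
The gate in step 2 can be sketched as a function that returns either a short-circuit response or `nil` to continue. Two things here are assumptions rather than stated requirements: `/auth/*` and `/health` are let through while setup is incomplete (otherwise step 2 of the wizard could not run), and every `/setup/` path is answered with 403 once setup is complete, generalizing test AUTH-T22e.

```clojure
(ns sketch.setup-gate
  (:require [clojure.string :as str]))

(defn setup-gate
  "Returns a Ring-style response map to short-circuit the request, or
  nil to continue with normal handling."
  [setup-completed? uri]
  (let [setup-route? (str/starts-with? uri "/setup/")]
    (cond
      ;; Wizard routes are closed once setup has finished.
      (and setup-completed? setup-route?)
      {:status 403 :headers {}}

      setup-completed?
      nil

      ;; Assumption: wizard, OAuth, and health routes stay reachable.
      (or setup-route?
          (str/starts-with? uri "/auth/")
          (= uri "/health"))
      nil

      :else
      {:status 302 :headers {"Location" "/setup/providers"}})))
```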
### 4.5 Invite Flow
```
1. User visits /invite/:code
2. Auth GW checks invite validity (exists, not expired, not exhausted)
   - Invalid: show error page
   - Valid: store invite code in cookie/session, redirect to /auth/login
3. After OAuth login, if pending invite code:
   a. Accept invite (join community)
   b. Redirect to community
```
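
The validity check in step 2 (exists, not expired, not exhausted) might be written as below. The invite field names `:expires-at`, `:max-uses`, and `:use-count` are hypothetical; this document does not describe the invites table.

```clojure
(ns sketch.invite
  (:import (java.time Instant)))

(defn invite-status
  "Classifies an invite row as :not-found, :expired, :exhausted, or
  :valid. A nil :expires-at or :max-uses is treated as 'no limit'."
  [invite ^Instant now]
  (let [{:keys [expires-at max-uses use-count]} invite]
    (cond
      (nil? invite)
      :not-found

      (and expires-at (not (.isAfter ^Instant expires-at now)))
      :expired

      (and max-uses (>= (or use-count 0) max-uses))
      :exhausted

      :else
      :valid)))
```

Any result other than `:valid` maps to the error page; `:valid` stores the code and redirects to `/auth/login`.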
## 5. Reverse Proxy Behavior
**Request forwarding:**
- Strip client-supplied identity headers (`X-User-Id`, `X-User-Role`, `X-Community-Id`) from the original request so they cannot be forged
- Inject: `X-User-Id`, `X-User-Role`, `X-Community-Id`, `X-Trace-Id`, `X-Forwarded-For`
- Forward request body, method, path, query string unchanged
- SSE: hold connection open, stream response bytes through
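
A minimal sketch of the header handling, assuming Ring-style lower-case header names and assuming the headers to strip are the client-supplied identity headers (the reading suggested by test AUTH-T32). The function name and the shape of the `who` map are hypothetical.

```clojure
(ns sketch.proxy-headers)

(def identity-headers ["x-user-id" "x-user-role" "x-community-id"])

(defn forward-headers
  "Removes client-supplied identity headers, then injects the values the
  gateway has verified. Only non-nil values are injected, so a forged
  header never survives when the gateway has nothing to put in its place."
  [headers who remote-addr]
  (let [prior    (get headers "x-forwarded-for")
        injected (cond-> {"x-forwarded-for"
                          (if prior (str prior ", " remote-addr) remote-addr)}
                   (:user-id who)      (assoc "x-user-id" (str (:user-id who)))
                   (:role who)         (assoc "x-user-role" (name (:role who)))
                   (:community-id who) (assoc "x-community-id" (str (:community-id who)))
                   (:trace-id who)     (assoc "x-trace-id" (:trace-id who)))]
    (merge (apply dissoc headers identity-headers) injected)))
```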
**Response forwarding:**
- Pass through status code, headers, body unchanged
- For SSE responses: stream chunks as they arrive (no buffering)
**Service discovery (v1):** Static config — all services on localhost with configured ports.
```clojure
{:services
 {:api    {:host "localhost" :port 3001}
  :web-sm {:host "localhost" :port 3002}
  :tui-sm {:host "localhost" :port 3003}}}
```
## 6. Rate Limiting
| Endpoint Pattern | Limit | Window |
|-----------------|-------|--------|
| `GET /auth/login` | 10 | 1 min per IP |
| `GET /auth/callback/*` | 10 | 1 min per IP |
| `POST /api/*` | 60 | 1 min per user |
| `GET /api/*` | 120 | 1 min per user |
| `POST /api/webhooks/*/incoming` | 30 | 1 min per webhook |
| `GET /sse/*`, `GET /tui/sse/*` | 5 | 1 min per user (connection attempts) |
**Implementation:** In-memory token bucket (atom-based). No Redis needed for v1 (single instance).
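
One way to realize an atom-based token bucket is sketched below. It is an illustration under assumptions, not the actual limiter: bucket capacity equals the table's limit, tokens refill continuously over the window, the clock is passed in as milliseconds so the logic stays testable, and all names are hypothetical.

```clojure
(ns sketch.rate-limit)

;; key (IP, user id, or webhook id) -> {:tokens double, :last-ms long}
(defonce buckets (atom {}))

(defn- refill
  "Adds the tokens earned since :last-ms, capped at `capacity`."
  [{:keys [tokens last-ms]} capacity window-ms now-ms]
  (let [earned (* capacity (/ (- now-ms last-ms) (double window-ms)))]
    {:tokens  (min (double capacity) (+ tokens earned))
     :last-ms now-ms}))

(defn allow?
  "Tries to take one token for key `k`. Returns true when the request
  may proceed, false when it should receive a 429."
  [k capacity window-ms now-ms]
  (let [step (fn [state]
               (let [fresh  {:tokens (double capacity) :last-ms now-ms}
                     bucket (refill (get state k fresh) capacity window-ms now-ms)
                     ok?    (>= (:tokens bucket) 1.0)]
                 (assoc state k (cond-> (assoc bucket :allowed? ok?)
                                  ok? (update :tokens dec)))))]
    ;; swap! returns the value this call installed, so the verdict read
    ;; here belongs to this request even under contention.
    (get-in (swap! buckets step) [k :allowed?])))
```

For the `POST /api/*` row this would be called as `(allow? user-id 60 60000 (System/currentTimeMillis))`.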
## 7. Session Cookie
```
Name:     ajet_session
Value:    <base64url-encoded-token>
Attributes:
  HttpOnly: true
  Secure:   true (prod only)
  SameSite: Lax
  Path:     /
  Max-Age:  2592000 (30 days)
```
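
As a sketch, the `Set-Cookie` header value for these attributes could be assembled like this. The function name and option keys are hypothetical; a Ring cookie middleware would normally do the same job.

```clojure
(ns sketch.cookie
  (:require [clojure.string :as str]))

(defn session-cookie
  "Builds the Set-Cookie value for a session token. `secure?` defaults
  to true and would be false in dev; `ttl-days` defaults to 30."
  [token {:keys [secure? ttl-days] :or {secure? true ttl-days 30}}]
  (str/join "; "
            (cond-> [(str "ajet_session=" token)
                     "HttpOnly"
                     "SameSite=Lax"
                     "Path=/"
                     (str "Max-Age=" (* ttl-days 24 60 60))]
              secure? (conj "Secure"))))
```

With the default TTL, `Max-Age` works out to 30 × 86400 = 2592000 seconds, matching the value above.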
## 8. Test Cases
### 8.1 Session Validation
| ID | Test | Description |
|----|------|-------------|
| AUTH-T1 | Valid session cookie | Request proxied with injected headers |
| AUTH-T2 | Expired session | Returns 401 (API) or redirect to login (web) |
| AUTH-T3 | Invalid/tampered token | Returns 401 |
| AUTH-T4 | Missing cookie | Returns 401 (API) or redirect to login (web) |
| AUTH-T5 | Session TTL extension | Valid request extends expires_at |
| AUTH-T6 | Concurrent requests | Multiple requests with same session all succeed |
### 8.2 API Token Validation
| ID | Test | Description |
|----|------|-------------|
| AUTH-T7 | Valid API token | Request proxied with X-User-Id |
| AUTH-T8 | Expired API token | Returns 401 |
| AUTH-T9 | Invalid scope | Returns 403 (scope mismatch) |
| AUTH-T10 | Bearer header format | Correctly parses `Bearer <token>` |
### 8.3 OAuth Flow
| ID | Test | Description |
|----|------|-------------|
| AUTH-T11 | GitHub OAuth success | Code exchanged, user created/found, session set, redirected |
| AUTH-T12 | Gitea OAuth success | Same as above for Gitea |
| AUTH-T13 | OIDC OAuth success | Uses discovery document, same flow |
| AUTH-T14 | OAuth invalid code | Returns error, redirects to login with error message |
| AUTH-T15 | OAuth provider down | Returns 502 with friendly error |
| AUTH-T16 | Existing user re-login | Finds existing user via oauth_accounts, creates new session |
| AUTH-T17 | New user first login | Creates user + oauth_account + session |
| AUTH-T18 | OAuth state parameter | CSRF protection via state param validated on callback |
### 8.4 Admin Setup Wizard
| ID | Test | Description |
|----|------|-------------|
| AUTH-T19 | Fresh deploy redirects to setup | Any route with setup_completed=false redirects to /setup/providers |
| AUTH-T20 | Step 1: configure OAuth provider | POST /setup/providers saves provider to oauth_providers table |
| AUTH-T21 | Step 1: requires at least one provider | POST /setup/providers with empty config returns 422 |
| AUTH-T22a | Step 2: login via configured provider | After provider setup, /auth/login shows newly configured provider |
| AUTH-T22b | Step 3: first user becomes owner | After OAuth + community creation, user has owner role |
| AUTH-T22c | Setup completed flag set | After community creation, system_settings.setup_completed = true |
| AUTH-T22d | Subsequent users see normal login | With setup_completed=true, normal login page shown |
| AUTH-T22e | Setup routes blocked after completion | /setup/providers returns 403 when setup_completed=true |
### 8.5 Invite Flow
| ID | Test | Description |
|----|------|-------------|
| AUTH-T22 | Valid invite → login → join | Full invite acceptance flow works |
| AUTH-T23 | Expired invite | Shows error page |
| AUTH-T24 | Exhausted invite | Shows error page |
| AUTH-T25 | Already-member invite | Accepts gracefully, redirects to community |
### 8.7 Reverse Proxy
| ID | Test | Description |
|----|------|-------------|
| AUTH-T26 | API route proxied | /api/channels → forwarded to API service |
| AUTH-T27 | Web route proxied | / → forwarded to Web SM |
| AUTH-T28 | TUI route proxied | /tui/sse → forwarded to TUI SM |
| AUTH-T29 | SSE streaming | SSE response streamed without buffering |
| AUTH-T30 | Target service down | Returns 502 |
| AUTH-T31 | Headers injected | X-User-Id, X-Trace-Id present on proxied request |
| AUTH-T32 | Original auth headers stripped | Client cannot forge X-User-Id |
### 8.8 Rate Limiting
| ID | Test | Description |
|----|------|-------------|
| AUTH-T33 | Under limit | Requests succeed normally |
| AUTH-T34 | Over limit | Returns 429 with Retry-After header |
| AUTH-T35 | Rate limit per-user | Different users have independent limits |
| AUTH-T36 | Rate limit per-IP for auth | OAuth callback rate limited by IP |
### 8.9 Logout
| ID | Test | Description |
|----|------|-------------|
| AUTH-T37 | Logout destroys session | POST /auth/logout deletes session from DB, clears cookie |
| AUTH-T38 | Logout with invalid session | Returns 200 (idempotent), clears cookie |
### 8.10 Health Check
| ID | Test | Description |
|----|------|-------------|
| AUTH-T39 | Health check | GET /health returns 200 with service status |
| AUTH-T40 | Health check (DB down) | Returns 503 with degraded status |
---
## 9. Service Configuration
### 9.1 Config Shape
```clojure
{:server {:host "0.0.0.0" :port 3000}
 :db {:host "localhost" :port 5432 :dbname "ajet_chat"
      :user "ajet" :password "..." :pool-size 5}
 :oauth {:github {:client-id "..." :client-secret "..." :enabled true}
         :gitea {:client-id "..." :client-secret "..." :base-url "https://gitea.example.com" :enabled false}
         :oidc {:client-id "..." :client-secret "..." :issuer-url "https://auth.example.com" :enabled false}}
 ;; ↑ Fallback seed only — auto-migrated to oauth_providers DB table on first startup.
 ;; Ignored once providers exist in DB. Manage providers via admin API after setup.
 :services {:api {:host "localhost" :port 3001}
            :web-sm {:host "localhost" :port 3002}
            :tui-sm {:host "localhost" :port 3003}}
 :session {:ttl-days 30
           :cookie-name "ajet_session"
           :cookie-secure true} ;; false in dev
 :rate-limit {:enabled true}
 :cors {:allowed-origins ["https://chat.example.com"]
        :allowed-methods [:get :post :put :delete :options]
        :allowed-headers ["Content-Type" "Authorization" "X-Trace-Id"]
        :max-age 86400}}
```
### 9.2 CORS Configuration
CORS headers applied to all responses:
```
Access-Control-Allow-Origin: <configured origin or request Origin if in allowed list>
Access-Control-Allow-Methods: GET, POST, PUT, DELETE, OPTIONS
Access-Control-Allow-Headers: Content-Type, Authorization, X-Trace-Id
Access-Control-Allow-Credentials: true
Access-Control-Max-Age: 86400
```
- **Dev mode:** Allow `http://localhost:*` origins
- **Prod mode:** Strict origin whitelist from config
- **Preflight requests:** `OPTIONS` handled and returned immediately (no proxy)
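
The dev and prod origin rules can be sketched as a single predicate. The `:dev?` flag and the function name are assumptions; `:allowed-origins` follows the config shape in 9.1.

```clojure
(ns sketch.cors)

(defn allowed-origin?
  "True when `origin` may be echoed in Access-Control-Allow-Origin.
  Prod: exact match against the configured list. Dev: additionally
  accepts http://localhost on any port."
  [{:keys [dev? allowed-origins]} origin]
  (boolean
   (and origin
        (or (contains? (set allowed-origins) origin)
            (and dev?
                 (re-matches #"http://localhost(:\d+)?" origin))))))
```

Because `Access-Control-Allow-Credentials: true` is sent, the response has to echo the specific matching origin rather than `*`.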
### 9.3 Audit Logging
**What's logged:**
- All admin actions: kick, ban, mute, role change, channel delete, webhook create/delete, invite create/revoke
- Authentication events: login success, login failure, logout, session expiry
- Rate limit violations
**Audit log table (in PostgreSQL, written by Auth GW):**
```sql
audit_log (
  id           uuid PK,
  actor_id     uuid FK → users NULL,  -- NULL for unauthenticated events
  action       text,                  -- 'login', 'kick', 'ban', 'channel.delete', etc.
  target_type  text NULL,             -- 'user', 'channel', 'community', etc.
  target_id    uuid NULL,
  community_id uuid NULL,
  ip_address   inet,
  metadata     jsonb,                 -- extra context (reason, duration, etc.)
  created_at   timestamptz
)
idx_audit_log_actor     ON audit_log(actor_id, created_at)
idx_audit_log_community ON audit_log(community_id, created_at)
```
**Note:** Auth GW writes audit logs directly to PG (it has DB access). API sends audit-worthy events to Auth GW via NATS subject `chat.audit` — Auth GW subscribes and persists them.
### 9.4 Middleware Pipeline
```
1. CORS headers (preflight short-circuit)
2. Request ID generation (X-Trace-Id if not present)
3. Request logging
4. Rate limiting (per-IP for auth, per-user for API)
5. Route matching
6. Auth endpoints → handle directly (OAuth, login page, logout)
7. Health check → handle directly
8. Webhook endpoints → webhook token validation → proxy to API
9. All other → session/token validation → header injection → proxy to target
10. Response logging (status, duration)
```
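
The first two stages can be sketched as Ring-style middleware to show how the ordering is expressed in code. The handler at the bottom is a stand-in for the rest of the pipeline, the 204 preflight status is an assumption, and only synchronous handlers are considered.

```clojure
(ns sketch.middleware
  (:import (java.util UUID)))

(defn wrap-preflight
  "Step 1: answer OPTIONS immediately, without touching later stages."
  [handler]
  (fn [request]
    (if (= :options (:request-method request))
      {:status 204 :headers {}}
      (handler request))))

(defn wrap-trace-id
  "Step 2: keep an incoming X-Trace-Id, or generate one if absent."
  [handler]
  (fn [request]
    (let [trace-id (or (get-in request [:headers "x-trace-id"])
                       (str (UUID/randomUUID)))]
      (handler (assoc-in request [:headers "x-trace-id"] trace-id)))))

(def app
  ;; With ->, the last wrapper listed is the outermost, so it runs first.
  (-> (fn [request]
        {:status 200
         :headers {}
         :body (get-in request [:headers "x-trace-id"])})
      wrap-trace-id
      wrap-preflight))
```

The remaining stages (logging, rate limiting, routing, auth) would be added to the same threading form, each listed above the stage that should run before it.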
### 9.5 Startup / Shutdown Sequence
**Startup:**
```
1. Load config (EDN + env vars)
2. Create DB connection pool (HikariCP)
3. Auto-migrate OAuth providers from :oauth config to oauth_providers table (if table is empty)
4. Load enabled OAuth providers from DB
5. Initialize rate limiter (in-memory atom)
6. Start http-kit server
7. Log "Auth Gateway started on port {port}"
```
**Shutdown (graceful):**
```
1. Stop accepting new connections
2. Wait for in-flight requests (max 30s)
3. Close DB connection pool
4. Log "Auth Gateway stopped"
```
---
## 10. Login Page UI Mock (Hiccup rendered by Auth GW)
```
┌──────────────────────────────────────┐
│                                      │
│           ┌──────────────┐           │
│           │  ajet chat   │           │
│           └──────────────┘           │
│                                      │
│         Sign in to continue          │
│                                      │
│     ┌──────────────────────────┐     │
│     │  ◉ Continue with GitHub  │     │  ← providers loaded from DB
│     └──────────────────────────┘     │
│     ┌──────────────────────────┐     │
│     │  ◉ Continue with Gitea   │     │
│     └──────────────────────────┘     │
│     ┌──────────────────────────┐     │
│     │  ◉ Continue with SSO     │     │  ← only if OIDC provider in DB
│     └──────────────────────────┘     │
│                                      │
│     ─── or accepting invite ───      │  ← only if invite code present
│           Joining: My Team           │
│                                      │
└──────────────────────────────────────┘
```
---
## 11. Error Pages
Auth GW renders simple HTML error pages for:
| Status | Page | When |
|--------|------|------|
| 401 | Unauthorized | Invalid/expired session (web requests redirect to `/auth/login` instead) |
| 403 | Forbidden | Valid session but insufficient permission |
| 404 | Not Found | Unknown route |
| 429 | Rate Limited | Too many requests — shows retry countdown |
| 502 | Bad Gateway | Target service unreachable |
| 503 | Service Unavailable | Auth GW degraded (DB down) |