Security

Defaults, abuse protection, and recommended posture.

SyntaxKit ships defense in depth. Every request crosses five layers before it touches a database row: edge headers and CORS, Better Auth, the oRPC middleware chain, the Upstash abuse policy, and finally a Zod-validated handler. Most layers are on by default; only Upstash (UPSTASH_REDIS_REST_URL, UPSTASH_REDIS_REST_TOKEN) and Cloudflare Turnstile (TURNSTILE_SECRET_KEY, NEXT_PUBLIC_TURNSTILE_SITE_KEY) need explicit configuration before they activate. The Pre-Launch Checklist lists every secret to set first.

Defense In Depth

Five-layer request path: edge proxy, Better Auth, oRPC middleware, Upstash abuse policy, Zod-validated handler.
| Layer | Where it lives | Always on | Disable for dev |
| --- | --- | --- | --- |
| Edge headers + CORS | apps/web/proxy.ts, apps/web/lib/security-headers.ts | Yes | No |
| Better Auth (session, rate limit, captcha) | packages/auth/src/server.ts | Session yes; captcha gates on env vars | Rate limits widen automatically under NODE_ENV=test; no opt-out flag |
| oRPC middleware (auth, RBAC, admin) | packages/api/src/middleware/ | Yes | No |
| Abuse policy (Upstash) | packages/shared/src/abuse.ts | Required in production; fails closed on missing_config | DISABLE_ABUSE_PROTECTION=true (non-production only); also auto-bypassed in non-production when Upstash is unconfigured |
| Zod input validation | Per oRPC procedure | Yes | No |

The diagram source lives at apps/docs/diagrams/security-layers.mmd. Rerun pnpm --filter @syntaxkit/docs diagrams:build after editing it to refresh both SVG variants.

Edge: Headers, CSP, CORS

The proxy at apps/web/proxy.ts runs before every route handler. It applies a Nosecone-driven security header bundle (CSP, HSTS, X-Frame-Options, X-Content-Type-Options, Referrer-Policy, Cross-Origin-Opener-Policy, Permissions-Policy) and adds CORS to API routes. The CSP allow-list extends Nosecone's strict defaults with only the origins the kit actually talks to:

| Directive | Allowance |
| --- | --- |
| script-src | 'self', 'unsafe-inline' (Next.js bootstrap), https://challenges.cloudflare.com, https://cdn.jsdelivr.net (Scalar API reference) |
| style-src | Nosecone defaults plus 'unsafe-inline' |
| img-src | 'self', data:, blob:, plus your S3 image origins resolved at boot |
| connect-src | 'self', Turnstile, configured S3 origins (presigned PUT targets) |
| frame-src | https://challenges.cloudflare.com only (Turnstile widget) |
| worker-src | Nosecone defaults plus blob: |
| upgrade-insecure-requests | Enabled in production |

CORS is scoped to NEXT_PUBLIC_APP_URL and applied only to /api/*, /rpc/*, /trpc/*; other paths get no Access-Control-Allow-Origin. Credentials are allowed (Access-Control-Allow-Credentials: true) so authenticated cross-origin clients can still reach the API.
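
The scoping rule above can be sketched as a small pure function. This is an illustration only: corsHeadersFor and CORS_PATH_PREFIXES are hypothetical names, not the kit's actual proxy code, and the prefix matching is an assumption about how /api/*, /rpc/*, and /trpc/* are interpreted.

```typescript
// Hypothetical helper: names and matching rules are illustrative assumptions.
const CORS_PATH_PREFIXES = ["/api/", "/rpc/", "/trpc/"];

function corsHeadersFor(pathname: string, appUrl: string): Record<string, string> {
  const inScope = CORS_PATH_PREFIXES.some((prefix) => pathname.startsWith(prefix));
  // Paths outside the API prefixes get no CORS headers at all.
  if (!inScope) return {};
  return {
    // Only the configured app origin is allowed, never a wildcard,
    // because credentials are enabled.
    "Access-Control-Allow-Origin": new URL(appUrl).origin,
    "Access-Control-Allow-Credentials": "true",
  };
}
```

Deriving the origin with new URL(appUrl).origin normalises a trailing slash or path in NEXT_PUBLIC_APP_URL, which matters because browsers compare the header against the exact origin string.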

The proxy's auth gate on /dashboard/* and /api-reference/* is a cheap cookie-presence check (getSessionCookie(request)), not a full session decryption: it exists only to redirect unauthenticated traffic quickly. The real session check happens inside Better Auth handlers and oRPC procedures (the authorized middleware). /api-reference adds another layer: its route handler (apps/web/app/api-reference/[[...rest]]/route.ts) decrypts the session, asserts the platform admin role, and returns 404 to every other caller (including signed-in non-admins, so the route's existence isn't disclosed). It serves only the docs UI and spec.json, not a callable REST mount of the API.

Authentication: Sessions And Captcha

Better Auth issues opaque session cookies signed and encrypted by BETTER_AUTH_SECRET. httpOnly + sameSite=lax plus a session check at every request boundary mitigate CSRF without a separate token. Captcha is opt-in: Cloudflare Turnstile only attaches when both TURNSTILE_SECRET_KEY and NEXT_PUBLIC_TURNSTILE_SITE_KEY are set.

Session cookie hardening

Cookie attributes (httpOnly: true, sameSite: "lax", path: "/", secure: true in production) are pinned explicitly in packages/auth/src/server.ts via advanced.defaultCookieAttributes and advanced.useSecureCookies rather than left to Better Auth's defaults: the values match those defaults today, but writing them out keeps any future upstream change from silently weakening the kit.

trustedOrigins is set explicitly to [NEXT_PUBLIC_APP_URL, BETTER_AUTH_URL] so OAuth callbacks and password-reset returns can only land back on origins you've declared. session.cookieCache is enabled with a 5-minute TTL so the per-request "is this user signed in?" check answers from the signed cookie without a database round-trip; banning a user takes effect on the next cookie refresh.
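
As a rough sketch, the hardening options described in this section might be assembled like this before being passed to Better Auth. buildAuthHardening is a hypothetical helper, and the option names (trustedOrigins, session.cookieCache, advanced.useSecureCookies, advanced.defaultCookieAttributes) follow Better Auth's documented options as understood here; check them against the version you have installed.

```typescript
// Hypothetical helper; the surrounding betterAuth({ ... }) call is omitted.
type AuthEnv = {
  NODE_ENV?: string;
  NEXT_PUBLIC_APP_URL: string;
  BETTER_AUTH_URL: string;
};

function buildAuthHardening(env: AuthEnv) {
  const isProduction = env.NODE_ENV === "production";
  return {
    // OAuth callbacks and password-reset returns may only land on these origins.
    trustedOrigins: [env.NEXT_PUBLIC_APP_URL, env.BETTER_AUTH_URL],
    session: {
      // 5-minute signed-cookie cache (maxAge is in seconds).
      cookieCache: { enabled: true, maxAge: 5 * 60 },
    },
    advanced: {
      useSecureCookies: isProduction,
      // Written out explicitly even though they match today's defaults.
      defaultCookieAttributes: {
        httpOnly: true,
        sameSite: "lax" as const,
        path: "/",
        secure: isProduction,
      },
    },
  };
}
```

The 5-minute cache is the trade-off named above: fewer database round-trips in exchange for a ban taking up to one cache window to apply.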

How Turnstile is wired

When the env vars are unset, the relevant forms render in a degraded state (button disabled with messaging) rather than failing silently. The contact form has its own server-side verifyTurnstileToken helper at packages/api/src/lib/turnstile.ts so it stays gated even outside the Better Auth flow. That helper uses the shared fetchWithTimeout wrapper (5s ceiling) so a slow Cloudflare response can't hold a worker indefinitely.

See Authentication for the full session model, OAuth wiring, 2FA, and passkeys.

Built-in Auth Rate Limits

Better Auth's own rate limiter applies route-specific rules on top of a default ceiling, configured in packages/auth/src/server.ts. These are built-in (no Upstash needed) and apply per IP at the auth layer, complementing the per-surface Upstash policy below (which applies per email/userId/orgId at the application layer).

| Route | Limit | Window |
| --- | --- | --- |
| default (every other auth endpoint) | 100 | 60s |
| /sign-in/email | 5 | 10s |
| /sign-up/email | 5 | 60s |
| /request-password-reset | 3 | 60s |
| /reset-password | 5 | 60s |
| /two-factor/verify-totp | 3 | 10s |
| /two-factor/verify-backup-code | 3 | 10s |

These limits widen automatically under NODE_ENV=test (Vitest and Playwright). There is no env var to disable them: production cannot enter the test runtime, so the credential-stuffing, reset-spam, and TOTP brute-force defenses can't be neutered by a leaky deploy template.
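
For orientation, the table above could be expressed in the shape of Better Auth's rateLimit option roughly as follows. This is a sketch assuming the documented window / max / customRules fields (window in seconds); the constant name authRateLimit is hypothetical, and the test-mode widening is left out because its exact mechanism isn't described here.

```typescript
// Sketch of the per-route limits as a Better Auth-style rateLimit object.
// Assumption: `window` is in seconds and `max` is requests per window.
type RateRule = { window: number; max: number };

const authRateLimit: {
  enabled: boolean;
  window: number;
  max: number;
  customRules: Record<string, RateRule>;
} = {
  enabled: true,
  window: 60, // default ceiling: 100 requests per 60s
  max: 100,
  customRules: {
    "/sign-in/email": { window: 10, max: 5 },
    "/sign-up/email": { window: 60, max: 5 },
    "/request-password-reset": { window: 60, max: 3 },
    "/reset-password": { window: 60, max: 5 },
    "/two-factor/verify-totp": { window: 10, max: 3 },
    "/two-factor/verify-backup-code": { window: 10, max: 3 },
  },
};
```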

API: The Middleware Chain

Every oRPC procedure picks a base from the middleware chain in packages/api/src/middleware/. Five middlewares cover the common gating patterns:

| Middleware | What it asserts | When to use |
| --- | --- | --- |
| authorized | Better Auth session present; adds session and user to context. Throws UNAUTHORIZED otherwise. | Default base for any signed-in procedure. |
| withActiveOrganization | Active org resolved and added to context. Throws BAD_REQUEST if no org is active. | Procedures that operate on the active org implicitly. |
| withOrganizationAccess | input.organizationId matches the active org. | Procedures that take an org id explicitly; prevents id-swap attacks. |
| withPermission({...}) | Better Auth hasPermission check passes for the requested resource and action. | Org-scoped permissions like { member: ["create"] }, { invitation: ["read"] }, or { billing: ["view"] }. |
| requireAdmin | The user has the platform admin role. Throws FORBIDDEN otherwise. | Every procedure under the admin namespace. |

Every procedure also chains .input(zodSchema) and .output(zodSchema). Inputs are validated before the handler runs, so malformed or oversized payloads never reach domain code. See the API page for the chain in context.
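
To make the gating logic concrete, here is a dependency-free sketch of what four of these checks assert. It deliberately does not use the oRPC API: the function signatures, the Session shape, and the ProcedureError class are hypothetical, and the FORBIDDEN code for an organization mismatch is an assumption (the table above does not name the error that withOrganizationAccess throws).

```typescript
// Illustrative stand-ins only; the real middlewares are oRPC middleware, not plain functions.
type ErrorCode = "UNAUTHORIZED" | "BAD_REQUEST" | "FORBIDDEN";

class ProcedureError extends Error {
  code: ErrorCode;
  constructor(code: ErrorCode) {
    super(code);
    this.code = code;
  }
}

type Session = { userId: string; activeOrganizationId: string | null; role: "user" | "admin" };

function authorized(session: Session | null): Session {
  if (!session) throw new ProcedureError("UNAUTHORIZED");
  return session;
}

function withActiveOrganization(session: Session): string {
  if (!session.activeOrganizationId) throw new ProcedureError("BAD_REQUEST");
  return session.activeOrganizationId;
}

function withOrganizationAccess(session: Session, input: { organizationId: string }): string {
  const activeOrg = withActiveOrganization(session);
  // Rejecting a mismatched id is what blocks id-swap attacks.
  // Assumption: the error code for a mismatch is FORBIDDEN.
  if (input.organizationId !== activeOrg) throw new ProcedureError("FORBIDDEN");
  return activeOrg;
}

function requireAdmin(session: Session): Session {
  if (session.role !== "admin") throw new ProcedureError("FORBIDDEN");
  return session;
}
```

The point of the id-swap check is that a caller who is signed in to organization A cannot act on organization B just by changing organizationId in the payload.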

Input hardening at the API edge

ID fields go through the shared idSchema (z.string().min(1).max(64)), and passwords and TOTP codes are bounded at the API edge. The admin listUsers query pins sortBy / filterField to safe-column enums so an attacker can't sort by password or twoFactorSecret. User-supplied image URLs (organization logos, user avatars) are validated against a hostname allowlist derived from your S3 / CDN config plus NEXT_PUBLIC_IMAGE_HOST_ALLOWLIST. Bare S3 keys still pass through unchanged.
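
A minimal sketch of the image URL check, under stated assumptions: isAllowedImageRef is a hypothetical name, the bare-key pattern is a guess at what counts as a bare S3 key, and requiring https is an extra assumption not stated above. The real validator may differ on all three.

```typescript
// Assumption: a "bare S3 key" is a relative path made of safe segment characters.
const BARE_KEY_PATTERN = /^[\w.-]+(\/[\w.-]+)*$/;

function isAllowedImageRef(value: string, allowedHosts: ReadonlySet<string>): boolean {
  // Bare keys pass through unchanged.
  if (BARE_KEY_PATTERN.test(value)) return true;

  let url: URL;
  try {
    url = new URL(value);
  } catch {
    // Neither a bare key nor an absolute URL (e.g. protocol-relative): reject.
    return false;
  }
  // Assumption: only https URLs are acceptable; this also rejects data: and javascript:.
  if (url.protocol !== "https:") return false;
  return allowedHosts.has(url.hostname);
}
```

Matching on the parsed hostname rather than on a string prefix avoids lookalike bypasses such as https://cdn.example.com.evil.net/.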

Abuse Protection (Upstash)

packages/shared/src/abuse.ts implements per-surface sliding-window rate limits backed by Upstash Redis. Each policy has 1-2 rules indexed by a characteristic (ip, email, userId, or organizationId) and runs in a pre_verification or post_verification phase. Every caller routes its AbuseDecision through resolveAbuseDecision, which converts the result into one of three outcomes: allow, rate_limited, or unavailable.

| Surface | Characteristics |
| --- | --- |
| contact.submit | IP (5 / 10m) plus email (3 / 1h) |
| auth.password_reset_email | IP (5 / 10m) plus email (3 / 1h) |
| auth.invitation_email | userId (10 / 1h) plus email (5 / 1h) |
| auth.change_email_confirmation | userId (5 / 1h) plus email (3 / 1h) |
| auth.verification_email | IP (5 / 10m) plus email (3 / 1h) |
| chat.send | userId (20 / 10m) plus organizationId (100 / 10m) |
| chat.regenerate | userId (10 / 10m) plus organizationId (30 / 10m) |
| storage.image_presign | userId (10 / 10m) |
| storage.image_finalize | userId (20 / 10m) |
| rpc.request | IP (200 / 1m): global ceiling on the oRPC handler, applied in apps/web/app/rpc/[[...rest]]/route.ts before per-procedure surfaces |

Posture: Required In Production, Fail Closed Everywhere

The policy converged on a single posture so misconfiguring one surface can never silently disable the others.

Production: required, fail closed

assertValidSetupEnv refuses to boot without both Upstash vars, and every surface returns SERVICE_UNAVAILABLE when the policy can't decide.

Non-production: auto-bypass

No Upstash means resolveAbuseDecision returns allow with bypassReason dev_no_upstash and a one-time warning, so a fresh checkout just works.

Explicit override flag

DISABLE_ABUSE_PROTECTION=true forces a bypass (bypassReason explicit_flag) for deterministic CI; rejected at boot in production.

How each failure mode resolves
  • Production requires Upstash. assertValidSetupEnv refuses to start the app when NODE_ENV=production and either UPSTASH_REDIS_REST_URL or UPSTASH_REDIS_REST_TOKEN is missing. Without it, the most expensive surfaces (image uploads, AI chat, transactional email) would silently run unprotected while only the contact form noticed.
  • missing_config fails closed in production. Storage, chat, the contact form, and every Better Auth email handler surface SERVICE_UNAVAILABLE (or a thrown error inside Better Auth) when the policy can't decide. The alternative on expensive surfaces is unbounded financial damage.
  • missing_config auto-bypasses outside production. A fresh pnpm dev without Upstash would otherwise silently drop verification emails, contact submissions, chat, and uploads with zero UI signal. resolveAbuseDecision returns { kind: "allow", bypassed: true, bypassReason: "dev_no_upstash" } when Upstash is missing and NODE_ENV !== "production", and each caller logs a one-time warning. The NODE_ENV gate and boot doctor keep this branch from firing in production.
  • DISABLE_ABUSE_PROTECTION=true is the explicit override. It is meant for CI / Playwright runs where Upstash is provisioned but tests need deterministic outcomes, and it produces bypassReason: "explicit_flag". Setting it with NODE_ENV=production is rejected at boot, mirroring the ENABLE_BETTER_AUTH_TEST_UTILS guard.
  • missing_characteristic always fails closed. It means a required characteristic (e.g. an authenticated user's id) was missing from the call site: a programming error, not a deployment one. Letting it through would defeat declaring the rule required: true.

The single shared helper lives at resolveAbuseDecision in packages/shared/src/abuse.ts. New surfaces should call enforceAbusePolicy and route the result through it instead of branching on decision.reason themselves; the helper is what keeps the posture consistent across packages.
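
The failure modes listed above can be summarised as a small decision function. This is a sketch, not the real resolveAbuseDecision: the AbuseDecision shape, the env parameter, and the order in which the branches are evaluated are assumptions made for illustration.

```typescript
// Hypothetical shapes; the real types live in packages/shared/src/abuse.ts.
type AbuseDecision =
  | { allowed: true }
  | { allowed: false; reason: "rate_limited" | "missing_config" | "missing_characteristic" };

type AbuseOutcome =
  | { kind: "allow"; bypassed?: true; bypassReason?: "dev_no_upstash" | "explicit_flag" }
  | { kind: "rate_limited" }
  | { kind: "unavailable" };

type PostureEnv = { NODE_ENV?: string; DISABLE_ABUSE_PROTECTION?: string };

function resolveAbuseDecisionSketch(decision: AbuseDecision, env: PostureEnv): AbuseOutcome {
  const isProduction = env.NODE_ENV === "production";

  // A missing required characteristic is a programming error: always fail closed.
  // Assumption: this is checked before any bypass.
  if (!decision.allowed && decision.reason === "missing_characteristic") {
    return { kind: "unavailable" };
  }
  // Explicit override, honoured outside production only (production rejects it at boot).
  if (!isProduction && env.DISABLE_ABUSE_PROTECTION === "true") {
    return { kind: "allow", bypassed: true, bypassReason: "explicit_flag" };
  }
  if (decision.allowed) return { kind: "allow" };
  if (decision.reason === "rate_limited") return { kind: "rate_limited" };

  // Remaining case is missing_config: fail closed in production, bypass elsewhere.
  return isProduction
    ? { kind: "unavailable" }
    : { kind: "allow", bypassed: true, bypassReason: "dev_no_upstash" };
}
```

Keeping every branch in one function is what lets a new surface inherit the posture without re-deriving it from decision.reason.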

Billing Caps: Atomic Reservation, Not Check-Then-Act

The Upstash windows above smooth bursts but don't enforce per-plan billing caps (monthlyAiResponses). For those, reserveAiUsageEvent in packages/api/src/lib/billing.ts counts the active billing window and inserts a new AiUsageEvent row inside one prisma.$transaction, opened with pg_advisory_xact_lock(hashtextextended('ai-usage:<organizationId>', 0)) so concurrent reservations for the same org serialise and the count + insert pair is race-free.

| Property | Behaviour |
| --- | --- |
| Lock scope | Per-organization (hashtextextended('ai-usage:<orgId>', 0)); cross-org calls never contend |
| Lock duration | The fast count + insert pair only, never held while the gateway is producing tokens |
| Unlimited plans | Skip the lock and the count entirely; insert directly for analytics |
| On model error | streamText.onError (and a synchronous try/catch) issues a best-effort prisma.aiUsageEvent.delete to refund the reservation so a failed turn doesn't consume the user's quota |
| Custom caps | Wrap reserveAiUsageEvent (or copy the pattern) for any new billable surface; pair with assertWithinAiResponseLimit for non-mutating predicate checks (UI banners, dashboards) |

assertWithinAiResponseLimit and getAiUsage stay exported for read-only soft checks; the chat router never relies on them for enforcement.

Why atomic reservation, not check-then-act

This closes a check-then-act (TOCTOU) class of bug. With a separate up-front check and a post-stream insert (the original shape of chat.send and chat.regenerate), N concurrent requests at count = limit - 1 would all pass the check, all bill the gateway, and only then increment the counter, letting free-tier organizations overshoot their cap and paid plans pay more provider cost than they should. Folding the count and insert into one advisory-locked transaction makes the reservation atomic per organization.
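
The race can be reproduced in memory. The sketch below is a simulation, not the kit's code: UsageStore stands in for the AiUsageEvent table, and a per-organization promise chain stands in for pg_advisory_xact_lock. All names here are hypothetical.

```typescript
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// In-memory stand-in for the usage table plus a per-org lock (assumption, not Postgres).
class UsageStore {
  private events: string[] = [];
  private locks = new Map<string, Promise<unknown>>();

  async count(orgId: string): Promise<number> {
    await sleep(1); // simulate a database round-trip
    return this.events.filter((id) => id === orgId).length;
  }

  async insert(orgId: string): Promise<void> {
    await sleep(1);
    this.events.push(orgId);
  }

  // Serialise callers for the same org; different orgs never wait on each other.
  withOrgLock<T>(orgId: string, fn: () => Promise<T>): Promise<T> {
    const previous = this.locks.get(orgId) ?? Promise.resolve();
    const next = previous.then(fn);
    this.locks.set(orgId, next.catch(() => undefined));
    return next;
  }
}

// Racy shape: check up front, do the expensive work, insert afterwards.
async function checkThenAct(store: UsageStore, orgId: string, limit: number): Promise<boolean> {
  if ((await store.count(orgId)) >= limit) return false;
  await sleep(5); // the model streams here, before usage is recorded
  await store.insert(orgId);
  return true;
}

// Atomic shape: count + insert happen together under the per-org lock.
async function reserve(store: UsageStore, orgId: string, limit: number): Promise<boolean> {
  return store.withOrgLock(orgId, async () => {
    if ((await store.count(orgId)) >= limit) return false;
    await store.insert(orgId);
    return true;
  });
}

async function demo(): Promise<{ racyGranted: number; reservedGranted: number }> {
  const limit = 3;
  const run = async (attempt: typeof reserve): Promise<number> => {
    const store = new UsageStore();
    await store.insert("org_1");
    await store.insert("org_1"); // count = limit - 1
    const results = await Promise.all(
      Array.from({ length: 5 }, () => attempt(store, "org_1", limit)),
    );
    return results.filter(Boolean).length;
  };
  return { racyGranted: await run(checkThenAct), reservedGranted: await run(reserve) };
}
```

With five concurrent requests at count = limit - 1, demo() grants all five under check-then-act and exactly one under reservation. As in the table above, the lock covers only the count + insert pair, never the streaming step.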

Webhooks: Signature Verification

The Stripe webhook handler at apps/web/app/api/webhooks/stripe/route.ts verifies every event with stripe.webhooks.constructEvent against STRIPE_WEBHOOK_SECRET. Missing or invalid signatures return a 400 before any handler runs:

const signature = req.headers.get("stripe-signature");

if (!signature) {
  log.warn("Stripe webhook missing signature header", {
    /* ... */
  });
  return new Response("Missing stripe-signature header", { status: 400 });
}

The handler then writes a StripeWebhookEvent row with status: "processing" (event-level dedupe) before delegating to per-event logic, which dedupes side effects via the OutboundEffect semantic key. Together the two layers mean a delivered-twice Stripe webhook does not double-charge, double-email, or double-grant entitlements. See Webhooks And Async Workflows for the full idempotency story.
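
Event-level dedupe can be sketched as "claim the event id before doing any work". The example below is an in-memory illustration: WebhookEventLog stands in for the StripeWebhookEvent table, the "processed" status and the release-on-failure behaviour are assumptions, and processOnce is a hypothetical name.

```typescript
// In-memory stand-in for a table with a unique constraint on the event id (assumption).
class WebhookEventLog {
  private rows = new Map<string, "processing" | "processed">();

  // Returns false when the event id was already claimed by an earlier delivery.
  claim(eventId: string): boolean {
    if (this.rows.has(eventId)) return false;
    this.rows.set(eventId, "processing");
    return true;
  }

  markProcessed(eventId: string): void {
    this.rows.set(eventId, "processed");
  }

  release(eventId: string): void {
    this.rows.delete(eventId);
  }
}

async function processOnce(
  log: WebhookEventLog,
  eventId: string,
  sideEffect: () => Promise<void>,
): Promise<"processed" | "duplicate"> {
  if (!log.claim(eventId)) return "duplicate";
  try {
    await sideEffect();
    log.markProcessed(eventId);
    return "processed";
  } catch (error) {
    // Assumption: release the claim so a later retry can reprocess the event.
    log.release(eventId);
    throw error;
  }
}
```

In a real database the claim would rely on the unique constraint rather than a read followed by a write, so two simultaneous deliveries cannot both succeed.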

Uploads: The Validation Boundary

Client-side checks reject the wrong size or MIME type before the presign request is sent, but they are only a courtesy. The real security boundary is the server-side sharp re-encode during finalize: a file that decodes cleanly is unambiguously an image; anything else throws and the temp object is removed.

| Check | Value |
| --- | --- |
| Allowed MIME types | image/jpeg, image/png, image/webp, image/gif |
| Max file size | 1 MB |
| Presigned URL TTL | 6 minutes |
| Output format | JPEG (mozjpeg, q85) by default; PNG when alpha is present |
| Output dimensions | 2048 px on the longest side, fit inside |

Final outputs are always re-encoded; the original bytes never become the served file. Temp keys live under tmp/ and are deleted in every server-observable failure path. A bucket lifecycle rule that expires tmp/ after 1 day catches the residual case where a user abandons the upload between presign and finalize. See Storage for the full upload pipeline.
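
For illustration, the courtesy checks from the table might look like the sketch below. precheckUpload is a hypothetical name, and reading "1 MB" as 1024 * 1024 bytes is an assumption. A check like this only saves a round-trip; the declared MIME type is client-controlled, so the server-side re-encode remains the boundary.

```typescript
const ALLOWED_MIME_TYPES = new Set(["image/jpeg", "image/png", "image/webp", "image/gif"]);
const MAX_FILE_BYTES = 1024 * 1024; // assumption: "1 MB" interpreted as 1 MiB

type PrecheckResult = { ok: true } | { ok: false; reason: "mime" | "size" };

// Cheap pre-flight check on declared metadata; it never inspects the bytes.
function precheckUpload(file: { type: string; size: number }): PrecheckResult {
  if (!ALLOWED_MIME_TYPES.has(file.type)) return { ok: false, reason: "mime" };
  if (file.size > MAX_FILE_BYTES) return { ok: false, reason: "size" };
  return { ok: true };
}
```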

Operational Secrets

Three secrets every production deploy must set, plus one header pair for proxied deployments.

| Secret | Why it matters |
| --- | --- |
| BETTER_AUTH_SECRET | Encrypts and signs session cookies. Generate with openssl rand -base64 32 (32+ characters required). Unique per environment. |
| NEXT_SERVER_ACTIONS_ENCRYPTION_KEY | Required for multi-instance deployments. Without a shared key, replica B can't decrypt server-action signatures created by replica A and you get intermittent form failures. |
| STRIPE_WEBHOOK_SECRET | Verifies inbound Stripe webhooks. The production secret from the Stripe dashboard is distinct from the dev/CLI secret used during local testing. |

For deployments behind a load balancer or reverse proxy, set TRUST_PROXY_HEADERS=true and TRUSTED_PROXY_IP_HEADERS=x-forwarded-for so Better Auth sees the real client IP for rate-limit and audit purposes. Render's blueprint sets these by default; Fly's edge-only setup doesn't need them.

Pre-Launch Security Checklist

Generate strong session secrets

Run openssl rand -base64 32 to generate BETTER_AUTH_SECRET. Confirm it is at least 32 characters and unique to production (not copied from dev or a teammate's machine).

Generate the server actions encryption key (multi-instance only)

If you run more than one replica of the web app, generate NEXT_SERVER_ACTIONS_ENCRYPTION_KEY with openssl rand -base64 32 and set the same value on every replica.

Configure Upstash

Set UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN. Required in production: assertValidSetupEnv refuses to start without them, so every abuse-protected surface (uploads, chat, auth emails, contact) stays gated.

Configure Turnstile

Set TURNSTILE_SECRET_KEY and NEXT_PUBLIC_TURNSTILE_SITE_KEY so the captcha plugin attaches to auth endpoints and the contact form gates on token verification.

Set the production Stripe webhook secret

Use the secret from the production webhook endpoint in your Stripe dashboard, not the CLI secret from local testing. Then register the production webhook URL in Stripe.

Confirm dev escapes are unset in production

DISABLE_ABUSE_PROTECTION must be unset (or false) in production: it exists for CI determinism only and is rejected at boot when NODE_ENV=production. The Better Auth per-route rate limits widen only under NODE_ENV=test, so there's no separate flag to verify. There is no runtime env var to skip env validation: next build is detected via PHASE_PRODUCTION_BUILD from next/constants and tests via NODE_ENV=test, so a production server always validates on boot.

Set proxy headers if behind a load balancer

If your platform sits the app behind a load balancer (Render, Kubernetes ingress, custom reverse proxy), set TRUST_PROXY_HEADERS=true and TRUSTED_PROXY_IP_HEADERS=x-forwarded-for. Vercel and Fly's edge-only setups don't need these.

Verify NEXT_PUBLIC_APP_URL matches the deployed origin

CSP origins, CORS allow-origin, and OAuth callback paths all derive from this. A mismatch breaks captcha, S3 uploads, and OAuth in non-obvious ways.

Audit OAuth callback URLs

In the GitHub and Google developer consoles, confirm the registered callback URLs match <NEXT_PUBLIC_APP_URL>/api/auth/callback/github and <NEXT_PUBLIC_APP_URL>/api/auth/callback/google for the production app.

Run setup-doctor against production env

pnpm setup:doctor

Run with the production env loaded (or copy it into a temporary file and pass it via dotenvx). Every protection should report active. Fix anything the doctor flags before opening to real traffic.
