AI | SyntaxPath

SyntaxKit ships a complete org-scoped streaming chat at /dashboard/ai-chat on top of the Vercel AI SDK and the Vercel AI Gateway. Everything routes through one adapter, gateway(modelId), so swapping providers is a string change rather than a refactor. The @syntaxkit/ui/components/ai-elements/* package gives you the same composer, transcript, attachments, reasoning, sources, model picker, and mic button to drop into any new AI feature.

We pick the AI Gateway over per-provider SDKs because it removes the "N keys for N providers" problem: one Vercel key, dozens of model ids, one billing surface, one place to switch when a new model lands.

How a chat turn flows

useChat, the oRPC streaming transport, and where persistence happens.

The AI SDK and Gateway

One adapter, one key, model-id strings, and the default model.

AI Elements: building blocks

The reusable transcript, composer, and streaming primitives.

Add a new AI feature

Reuse the handler shape for any streaming or one-shot feature.

How A Chat Turn Flows

A chat turn: PromptInput sends through useChat to /rpc/chat/send, the handler runs billing and abuse checks, reserves an AiUsageEvent, persists the user message, calls streamText with gateway(modelId), and the token stream rides back through toUIMessageStream and streamToEventIterator into the Conversation. onFinish persists the assistant message.

The unusual bit is the transport. The kit pairs @ai-sdk/react's useChat with a custom transport.sendMessages that calls the oRPC streaming procedure client.chat.send and pipes the result through eventIteratorToUnproxiedDataStream from @orpc/client. The server returns streamToEventIterator(result.toUIMessageStream({ sendReasoning: true, sendSources: true })) so reasoning and sources arrive as first-class UI message parts. Persistence is split around the stream: the user message and the reserved AiUsageEvent row are written before the model call, and onFinish saves the assistant message once tokens stop flowing.

The diagram source lives at apps/docs/diagrams/ai-chat-flow.mmd. Rerun pnpm --filter @syntaxkit/docs diagrams:build after editing it to refresh both SVG variants.

Package Layout

What's Wired In

Capability	How it's enabled
Streaming chat with reasoning + sources	`result.toUIMessageStream({ sendReasoning: true, sendSources: true })` in `chat.send`
Image attachments	Browser presigns via `storage.presign`, PUTs the bytes, calls `storage.finalize`; the server enforces ownership via `assertOwnedAttachmentParts` (key prefix `images/<userId>/`)
Web search	Pro-only; flips the model to `perplexity/sonar` and disables image attachments for that turn
Multi-model selection	Pro-only; free plan is locked to `CHAT_DEFAULT_MODEL_ID`
Voice transcription	`<SpeechInput>` uses the browser's Web Speech API, or hands a recorded `Blob` to your `onAudioRecorded` callback
Regenerate	`chat.regenerate` deletes the trailing assistant turn(s) and replaces with a fresh stream
Auto-titled chats	First-turn `onFinish` calls `generateText({ model: gateway("openai/gpt-4o-mini") })` and writes `Chat.title` once
Cursor-paginated chat list + search	`chat.list` (cursor) + `chat.search` (case-insensitive title and content)
Cursor-paginated message history	`chat.get` returns the latest `CHAT_DEFAULT_MESSAGES_PAGE_SIZE` messages plus a `nextCursor`; `chat.listMessages` pages older messages from that cursor so a long chat can never produce an unbounded payload
Per-plan monthly response cap	`reserveAiUsageEvent` atomically counts and inserts an `AiUsageEvent` row under a per-organization Postgres advisory lock, so concurrent requests can't overshoot the cap
Abuse protection	Sliding-window Upstash limits keyed by `userId` + `organizationId` for the `chat.send` and `chat.regenerate` surfaces
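The cursor-paginated rows in the table above can be pictured with a small in-memory sketch. It is not the kit's query: StoredMessage, pageMessages, and the use of a numeric id as the cursor are assumptions made for illustration, and the real procedures read from the database.

```typescript
// Hypothetical in-memory stand-in for the Message table.
interface StoredMessage {
  id: number; // assumed to increase over time, so id order is chronological
  content: string;
}

interface MessagePage {
  messages: StoredMessage[]; // oldest-first within the page
  nextCursor: number | null; // id of the oldest returned message, null when history is exhausted
}

// Return up to `limit` messages older than `cursor`, or the latest ones when
// `cursor` is null. The payload size is bounded by `limit` regardless of chat length.
function pageMessages(
  all: StoredMessage[],
  limit: number,
  cursor: number | null
): MessagePage {
  const older = all
    .filter((m) => cursor === null || m.id < cursor)
    .sort((a, b) => b.id - a.id); // newest-first
  const messages = older.slice(0, limit).reverse(); // back to oldest-first
  const hasMore = older.length > limit;
  return { messages, nextCursor: hasMore ? messages[0]!.id : null };
}

const history: StoredMessage[] = Array.from({ length: 7 }, (_, i) => ({
  id: i + 1,
  content: `message ${i + 1}`,
}));

// First call plays the role of chat.get; later calls play chat.listMessages.
const latestPage = pageMessages(history, 3, null);
const olderPage = pageMessages(history, 3, latestPage.nextCursor);
const oldestPage = pageMessages(history, 3, olderPage.nextCursor);
```

The client keeps requesting older pages until nextCursor comes back as null.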

The AI SDK And Gateway

Two design choices to call out before wiring:

Choice	Why
Vercel AI SDK over per-provider SDKs	One streaming primitive (`streamText`), one prompt format (`ModelMessage`), one tool-calling protocol. Nothing in the kit's chat handler is OpenAI-specific.
AI Gateway over direct provider keys	One key (`AI_GATEWAY_API_KEY`), one billing surface, model-id strings like `openai/gpt-5.2`, `anthropic/claude-haiku-4.5`, `google/gemini-3-flash`, `xai/grok-4.1-fast-non-reasoning`, `perplexity/sonar`. Switch providers by editing a string.

The default model is CHAT_DEFAULT_MODEL_ID in packages/shared/src/schemas/chat.ts. The selectable model list lives in apps/web/components/dashboard/ai-chat/chat-view/models.ts (the models array) so each entry can carry display metadata (chef, provider slug for the logo, label). Adding a model is a two-line edit when you also want it in the picker; it's a zero-line edit when you just want the server to accept it (any string is forwarded to gateway()).

Env vars. The chat path needs exactly one AI-specific environment variable: AI_GATEWAY_API_KEY (read by the AI SDK's gateway() adapter). The kit declares it in turbo.json for cache invalidation but does not yet ship it in apps/web/.env.example.

Add AI_GATEWAY_API_KEY to your environment when you set up the kit. Without it, every AI request fails at the SDK boundary with an authentication error. Get a key from vercel.com/dashboard under "AI Gateway".

The full list of supported providers and current model ids lives at models.dev; whatever string the gateway accepts there will work in gateway(modelId) here.

The Streaming Procedure

chat.send is a single-purpose oRPC procedure (POST /rpc/chat/send) that flows through five gates before it ever opens a model connection. Listed in order:

Validate attachment ownership

assertOwnedAttachmentParts walks every file part on the incoming message and checks each url against NEXT_PUBLIC_S3_PUBLIC_URL. The key must start with images/<userId>/, where <userId> is the session user. This forces every chat attachment to have already gone through the kit's presign + finalize pipeline before it's allowed near the model.
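A minimal sketch of the ownership check described above, assuming the attachment URL is the public base URL followed directly by the object key. isOwnedAttachmentUrl is a hypothetical name, not the kit's helper, and the real function may differ in how it parses keys and reports failures.

```typescript
// Returns true only when `url` lives under the public bucket URL and its key
// starts with images/<userId>/ for the given session user.
function isOwnedAttachmentUrl(
  url: string,
  publicBaseUrl: string, // stands in for NEXT_PUBLIC_S3_PUBLIC_URL
  userId: string
): boolean {
  let parsed: URL;
  let base: URL;
  let key: string;
  try {
    parsed = new URL(url); // also resolves "../" segments in the path
    base = new URL(publicBaseUrl);
    const basePath = base.pathname.endsWith("/")
      ? base.pathname
      : `${base.pathname}/`;
    if (parsed.origin !== base.origin) return false;
    if (!parsed.pathname.startsWith(basePath)) return false;
    key = decodeURIComponent(parsed.pathname.slice(basePath.length));
  } catch {
    return false; // malformed URL or malformed percent-encoding
  }
  // Reject encoded traversal that survived URL normalization.
  if (key.split("/").includes("..")) return false;
  // The trailing slash keeps user_1 from matching user_10.
  return key.startsWith(`images/${userId}/`);
}

const cdn = "https://cdn.example.com";
const ownFile = isOwnedAttachmentUrl(`${cdn}/images/user_1/cat.png`, cdn, "user_1");
const otherUsersFile = isOwnedAttachmentUrl(`${cdn}/images/user_2/cat.png`, cdn, "user_1");
const traversal = isOwnedAttachmentUrl(`${cdn}/images/user_1/../user_2/cat.png`, cdn, "user_1");
const foreignHost = isOwnedAttachmentUrl("https://evil.example.org/images/user_1/cat.png", cdn, "user_1");
```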

Resolve the model and check billing features

resolveChatModel returns perplexity/sonar when webSearch is on (and rejects an explicit non-default model in that case). Otherwise resolveRequestedModel calls assertBillingFeature(billing, "multiModelAccess", ...) if the requested model is not the default. webSearch itself is gated by assertBillingFeature(billing, "webSearch", ...).
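The decision table behind this step can be sketched as a pure function. Everything here is illustrative: resolveModel, PlanFeatures, and the placeholder default id are assumptions, and the kit's resolveChatModel and assertBillingFeature may throw different error types.

```typescript
// Placeholder for CHAT_DEFAULT_MODEL_ID; the real value lives in the shared schemas.
const DEFAULT_MODEL_ID = "example/default-model";
const WEB_SEARCH_MODEL_ID = "perplexity/sonar";

interface PlanFeatures {
  multiModelAccess: boolean;
  webSearch: boolean;
}

interface ModelRequest {
  model?: string;
  webSearch: boolean;
}

function resolveModel(features: PlanFeatures, request: ModelRequest): string {
  if (request.webSearch) {
    if (!features.webSearch) {
      throw new Error("webSearch is not included in this plan");
    }
    // Web search pins the model, so an explicit non-default choice is rejected.
    if (request.model !== undefined && request.model !== DEFAULT_MODEL_ID) {
      throw new Error("webSearch cannot be combined with an explicit model");
    }
    return WEB_SEARCH_MODEL_ID;
  }
  const requested = request.model ?? DEFAULT_MODEL_ID;
  if (requested !== DEFAULT_MODEL_ID && !features.multiModelAccess) {
    throw new Error("multiModelAccess is not included in this plan");
  }
  return requested;
}

const freePlan: PlanFeatures = { multiModelAccess: false, webSearch: false };
const proPlan: PlanFeatures = { multiModelAccess: true, webSearch: true };

const freeDefault = resolveModel(freePlan, { webSearch: false });
const proPicked = resolveModel(proPlan, { model: "openai/gpt-5.2", webSearch: false });
const proSearch = resolveModel(proPlan, { webSearch: true });
```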

Apply the abuse policy

enforceChatAbusePolicy calls enforceAbusePolicy for the chat.send surface with userId and organizationId characteristics. Sliding-window limits live in packages/shared/src/abuse.ts. Going over returns TOO_MANY_REQUESTS with a retryAfter payload.
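To make the retryAfter behavior concrete, here is a small in-memory sliding-window limiter. It is a sliding-log sketch, not Upstash's algorithm or the kit's enforceAbusePolicy; the class name, key format, and limits are all hypothetical.

```typescript
// Hypothetical per-characteristic limit; the real values live in the kit's abuse config.
interface WindowLimit {
  limit: number;
  windowMs: number;
}

type Decision =
  | { allowed: true }
  | { allowed: false; retryAfterMs: number };

class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly config: WindowLimit;

  constructor(config: WindowLimit) {
    this.config = config;
  }

  // `now` is injected so the behavior is deterministic and easy to test.
  check(key: string, now: number): Decision {
    const cutoff = now - this.config.windowMs;
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    this.hits.set(key, recent);
    if (recent.length >= this.config.limit) {
      // The caller may retry once the oldest hit slides out of the window.
      return {
        allowed: false,
        retryAfterMs: recent[0]! + this.config.windowMs - now,
      };
    }
    recent.push(now);
    return { allowed: true };
  }
}

const perUser = new SlidingWindowLimiter({ limit: 2, windowMs: 1_000 });
const firstHit = perUser.check("chat.send:user:u1", 0);
const secondHit = perUser.check("chat.send:user:u1", 100);
const thirdHit = perUser.check("chat.send:user:u1", 200); // over the limit
const afterWindow = perUser.check("chat.send:user:u1", 1_001); // first hit has expired
```

Running one limiter per characteristic (one keyed by userId, one by organizationId) and denying when either denies gives the two-key behavior described above.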

Build the model context

buildModelContextMessages walks history newest-first, capped at CHAT_MAX_CONTEXT_MESSAGES (40) and CHAT_MAX_CONTEXT_CHARACTERS (20,000). User parts that include images become structured { type: "image", image: URL, mediaType } content; everything else becomes plain { role, content }. The system prompt is set to "You are a helpful assistant." and lives inline in chat.ts, so changing it is a one-liner.
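The newest-first trim can be sketched as follows. This is a simplified text-only version under one assumption the source does not confirm: the walk stops at the first message that would exceed the character budget, rather than skipping it and continuing. trimContext and HistoryMessage are illustrative names.

```typescript
interface HistoryMessage {
  role: "user" | "assistant";
  content: string;
}

const MAX_CONTEXT_MESSAGES = 40;
const MAX_CONTEXT_CHARACTERS = 20_000;

// Walk history from the newest message backwards, keep messages while both
// budgets hold, then return the kept tail in chronological order.
function trimContext(
  history: HistoryMessage[],
  maxMessages: number = MAX_CONTEXT_MESSAGES,
  maxCharacters: number = MAX_CONTEXT_CHARACTERS
): HistoryMessage[] {
  const kept: HistoryMessage[] = [];
  let characters = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const message = history[i]!;
    if (kept.length >= maxMessages) break;
    if (characters + message.content.length > maxCharacters) break;
    kept.push(message);
    characters += message.content.length;
  }
  return kept.reverse();
}

// 50 short messages: the message cap applies and the oldest 10 are dropped.
const manyShort: HistoryMessage[] = Array.from({ length: 50 }, (_, i) => ({
  role: i % 2 === 0 ? "user" : "assistant",
  content: `turn ${i}`,
}));
const trimmedByCount = trimContext(manyShort);

// 3 long messages of 9,000 characters: only the newest 2 fit in 20,000.
const fewLong: HistoryMessage[] = ["a", "b", "c"].map((ch) => ({
  role: "user",
  content: ch.repeat(9_000),
}));
const trimmedBySize = trimContext(fewLong);
```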

Reserve the monthly response slot

reserveAiUsageEvent atomically counts the active billing window (the active subscription period, or the current calendar month for the free plan) and inserts a new AiUsageEvent row inside a single prisma.$transaction. The transaction begins with pg_advisory_xact_lock(hashtextextended('ai-usage:<orgId>', 0)), which serialises concurrent reservations for the same organization so the count + insert pair is race-free. Free plans are capped at monthlyAiResponses: 100; Pro is null (unlimited, which skips the lock and just inserts). Going over throws CONFLICT. assertWithinAiResponseLimit is still exported as a non-mutating predicate for surfaces that need a soft check (e.g. dashboards), but the chat router never relies on it for enforcement.

If the model call later errors, streamText.onError (and a synchronous try/catch around streamText) issues a best-effort prisma.aiUsageEvent.delete to refund the reservation, so a failed turn doesn't consume the user's quota.
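The reserve-then-refund contract is easier to see without the database. The sketch below is an in-memory model only: UsageLedger is a hypothetical class, and its count + insert is atomic simply because it runs synchronously. In Postgres, concurrent transactions can interleave, which is why the real reservation takes the advisory lock first.

```typescript
type UsageKind = "chat_send" | "chat_regenerate";

interface UsageEvent {
  id: number;
  organizationId: string;
  kind: UsageKind;
  createdAt: number; // epoch milliseconds
}

class UsageLedger {
  private events: UsageEvent[] = [];
  private nextId = 1;

  // Count the organization's events in the window and insert a new one.
  // A `cap` of null means unlimited, so the count is skipped entirely.
  reserve(
    organizationId: string,
    kind: UsageKind,
    cap: number | null,
    windowStart: number,
    now: number
  ): UsageEvent {
    if (cap !== null) {
      const used = this.events.filter(
        (e) => e.organizationId === organizationId && e.createdAt >= windowStart
      ).length;
      if (used >= cap) {
        throw new Error("CONFLICT: monthly AI response cap reached");
      }
    }
    const event: UsageEvent = { id: this.nextId++, organizationId, kind, createdAt: now };
    this.events.push(event);
    return event;
  }

  // Best-effort refund when the model call fails: remove the reserved row.
  refund(id: number): void {
    this.events = this.events.filter((e) => e.id !== id);
  }

  count(organizationId: string): number {
    return this.events.filter((e) => e.organizationId === organizationId).length;
  }
}

const ledger = new UsageLedger();
const reservedA = ledger.reserve("org_1", "chat_send", 2, 0, 10);
const reservedB = ledger.reserve("org_1", "chat_regenerate", 2, 0, 20);

let thirdRejected = false;
try {
  ledger.reserve("org_1", "chat_send", 2, 0, 30);
} catch {
  thirdRejected = true; // cap of 2 already used
}

ledger.refund(reservedA.id); // simulate a failed turn giving its slot back
const reservedC = ledger.reserve("org_1", "chat_send", 2, 0, 40);
```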

After the gates pass, the user Message is persisted, then:

const result = streamText({
  model: gateway(modelId),
  system: "You are a helpful assistant.",
  messages: modelMessages,
  onError: async () => {
    // Refund the reservation row so a failed turn does not consume the cap.
  },
  onFinish: async ({ text }) => {
    // Persist the assistant Message and (on the first turn)
    // generateText({ model: gateway("openai/gpt-4o-mini") }) to set
    // Chat.title and flip Chat.titleGenerated. The AiUsageEvent row has
    // already been written by reserveAiUsageEvent before streaming began.
  },
});

return streamToEventIterator(
  result.toUIMessageStream({
    sendReasoning: true,
    sendSources: true,
  })
);

chat.regenerate mirrors this: same gates, same streamText shape, but onFinish deletes the trailing assistant turn(s) before inserting the replacement, and the AiUsageEvent is recorded with kind: "chat_regenerate" instead of "chat_send".

The Client: useChat With An oRPC Transport

The kit uses @ai-sdk/react's useChat over an oRPC streaming procedure rather than a plain fetch endpoint. The bridge is one helper:

import { useChat } from "@ai-sdk/react";
import { eventIteratorToUnproxiedDataStream } from "@orpc/client";
import { client } from "@/lib/orpc";

const { messages, sendMessage, status, stop } = useChat({
  id: chatId,
  messages: seedMessages,
  transport: {
    async sendMessages(options) {
      const latestMessage = options.messages[options.messages.length - 1];
      return eventIteratorToUnproxiedDataStream(
        await client.chat.send(
          {
            chatId: options.chatId,
            messages: buildSendPayloadMessages(latestMessage),
            model: webSearchRef.current ? undefined : modelRef.current,
            webSearch: webSearchRef.current,
          },
          { signal: options.abortSignal }
        )
      );
    },
    reconnectToStream() {
      throw new Error("Unsupported");
    },
  },
  onFinish: () => {
    // Invalidate the sidebar list and dashboard stats.
  },
});

A few patterns worth knowing:

Pattern	Where
`seedMessages`	`[chatId]/page.tsx` server-prefetches `chat.get` (latest `CHAT_DEFAULT_MESSAGES_PAGE_SIZE` messages plus a `nextCursor`); `ChatView` reads it with `useSuspenseQuery` and maps DB rows to `UIMessage`s once via `dbMessagesToUIMessages` keyed to `chatId`. Older history is fetched on demand via `chat.listMessages` and prepended through `useChat`'s `setMessages`.
`status` / `stop`	Pass `status` to `<PromptInputSubmit>` so the button toggles between submit, stop, and pending
Regenerate	Drains `client.chat.regenerate` outside `useChat` (a plain `for await`) and ends with `router.refresh()` so RSC data updates
`initialPrompt`	The `?prompt=` URL param from the new-chat hub auto-sends once when `status === "ready"`, then `router.replace` strips the query
Refs over closures	`modelRef` and `webSearchRef` ensure the transport reads the latest user choice, since `useChat` captures the transport once per `id`
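The "refs over closures" row is the easiest one to get wrong, so here is a framework-free sketch of the difference. Ref mimics the shape of a React ref; createRefTransport and createStaleTransport are hypothetical helpers, not kit code.

```typescript
// Same shape as the object React's useRef returns.
interface Ref<T> {
  current: T;
}

interface SendPayload {
  text: string;
  model: string | undefined;
  webSearch: boolean;
}

// Reads the refs at send time, so it always sees the latest selection even
// though the transport object itself is created only once.
function createRefTransport(modelRef: Ref<string>, webSearchRef: Ref<boolean>) {
  return {
    buildPayload(text: string): SendPayload {
      return {
        text,
        model: webSearchRef.current ? undefined : modelRef.current,
        webSearch: webSearchRef.current,
      };
    },
  };
}

// Captures plain values at creation time, so later changes are invisible.
function createStaleTransport(model: string, webSearch: boolean) {
  return {
    buildPayload(text: string): SendPayload {
      return { text, model: webSearch ? undefined : model, webSearch };
    },
  };
}

const modelRef: Ref<string> = { current: "anthropic/claude-haiku-4.5" };
const webSearchRef: Ref<boolean> = { current: false };

const refTransport = createRefTransport(modelRef, webSearchRef);
const staleTransport = createStaleTransport(modelRef.current, webSearchRef.current);

// The user picks a different model after the transport was created.
modelRef.current = "google/gemini-3-flash";
const freshPayload = refTransport.buildPayload("hello");
const stalePayload = staleTransport.buildPayload("hello");

// The user then turns web search on, which clears the explicit model.
webSearchRef.current = true;
const searchPayload = refTransport.buildPayload("hello");
```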

AI Elements: Building Blocks

Reusable presentation primitives. Group them by what they do, not what file they live in. All export from @syntaxkit/ui/components/ai-elements/<file>.

Transcript Primitives

Conversation

Scrollable shell with stick-to-bottom behavior, empty state, jump-to-bottom button, and Markdown download. Pure UI; takes a simple { role, content }[] for download.

Message

Row layout for one chat turn (user vs assistant styling). Required prop: from: UIMessage['role']. Sub-components: MessageContent, MessageActions, MessageBranch*.

MessageResponse

Memoized assistant body that renders Markdown via Streamdown (CJK, code highlighting, math, mermaid). Use it as the children of MessageContent for assistant turns.

MessageBranch

Optional 'multiple drafts' switcher for regenerated turns: MessageBranchSelector, MessageBranchPrevious, MessageBranchNext, MessageBranchPage.

Composer Primitives

PromptInput

Form shell with hidden file input, drag-and-drop, paste support, and an attachments context. Required prop: onSubmit(message: PromptInputMessage, event).

PromptInputTextarea / Submit / Tools

Layout slots that compose inside PromptInput: textarea with sensible defaults, submit/stop button driven by ChatStatus, tool-row container for buttons.

Attachments

Grid, inline, or list layouts for FileUIPart and SourceDocumentUIPart. Pair with usePromptInputAttachments() inside PromptInput to render staged uploads.

ModelSelector

Command-palette dialog model picker. ModelSelectorLogo pulls provider art from models.dev. Pure UI; the kit hides it for free plans.

SpeechInput

Mic button. Uses the browser's Web Speech API when it is supported; otherwise records audio and defers transcription to your onAudioRecorded callback.

Suggestions

Horizontally scrollable row of pill buttons for starter prompts. Used by the new-chat hub and as an empty state inside chats.

Streaming-Aware Primitives

Reasoning

Collapsible thinking block. Tracks streaming duration, animates the trigger label with Shimmer, and renders the body via Streamdown. Wire isStreaming to your useChat status.

Sources

Collapsible citations list. Trigger shows 'Used N sources'; Source is a styled external-link row. Render this when a UIMessage has source-url parts.

Message, PromptInput, Attachments, and PromptInputSubmit lean on ai-package types (UIMessage, ChatStatus, FileUIPart, SourceDocumentUIPart). The rest are pure presentation, so you can use them outside useChat (for one-shot generations or non-chat AI features).

Billing And Limits

Two limit surfaces gate AI usage. Per-plan features control what a user is allowed to do; per-request limits control what fits in a single payload.

Per-plan features

Feature	Free	Pro
`monthlyAiResponses`	100 / month	Unlimited (`null`)
`multiModelAccess` (model picker beyond default)	No	Yes
`webSearch` (Perplexity Sonar)	No	Yes

The cap is enforced by counting AiUsageEvent rows (kind: "chat_send" or "chat_regenerate") in the active billing window. See Billing for how plans and entitlements are configured, and Storage: How An Upload Flows for the attachment pipeline that feeds chat images.
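A rough sketch of how the counting window might be derived, assuming a subscription exposes its current period as two dates and that the free-plan calendar month is measured in UTC. Both are assumptions; Subscription, activeBillingWindow, and isWithinCap are illustrative names rather than the kit's API.

```typescript
// Hypothetical subscription shape; only the current period matters here.
interface Subscription {
  currentPeriodStart: Date;
  currentPeriodEnd: Date;
}

// Half-open interval: start is included, end is excluded.
interface BillingWindow {
  start: Date;
  end: Date;
}

function activeBillingWindow(
  subscription: Subscription | null,
  now: Date
): BillingWindow {
  if (
    subscription !== null &&
    now.getTime() >= subscription.currentPeriodStart.getTime() &&
    now.getTime() < subscription.currentPeriodEnd.getTime()
  ) {
    return {
      start: subscription.currentPeriodStart,
      end: subscription.currentPeriodEnd,
    };
  }
  // No active period: fall back to the current calendar month (UTC assumed).
  const start = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), 1));
  const end = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 1));
  return { start, end };
}

function countInWindow(eventTimes: Date[], window: BillingWindow): number {
  return eventTimes.filter(
    (t) =>
      t.getTime() >= window.start.getTime() && t.getTime() < window.end.getTime()
  ).length;
}

// A cap of null means unlimited, matching the Pro row in the table above.
function isWithinCap(used: number, cap: number | null): boolean {
  return cap === null || used < cap;
}

const december = activeBillingWindow(null, new Date("2025-12-15T12:00:00.000Z"));
const usedInDecember = countInWindow(
  [
    new Date("2025-11-30T23:59:59.000Z"), // previous month, not counted
    new Date("2025-12-01T00:00:00.000Z"),
    new Date("2025-12-14T08:30:00.000Z"),
  ],
  december
);
```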

Per-request limits

All defined in packages/shared/src/schemas/chat.ts and enforced by Zod on the server.

Constant	Value	What it bounds
`CHAT_MAX_MESSAGES_PER_REQUEST`	1	Only the latest user turn is sent on each call
`CHAT_MAX_PARTS_PER_MESSAGE`	4	Text + file parts per message
`CHAT_MAX_TEXT_LENGTH_PER_PART`	4,000	Characters in any single text part
`CHAT_MAX_USER_MESSAGE_TEXT_LENGTH`	4,000	Total characters across all text parts in one message
`CHAT_MAX_CONTEXT_MESSAGES`	40	History the server includes when calling the model
`CHAT_MAX_CONTEXT_CHARACTERS`	20,000	Total history characters across included messages
`CHAT_DEFAULT_MESSAGES_PAGE_SIZE`	50	Messages returned per page from `chat.get` and `chat.listMessages`
`CHAT_MAX_MESSAGES_PAGE_SIZE`	100	Hard cap for the per-page message limit

History is trimmed newest-first inside buildModelContextMessages, so an active conversation always keeps its tail and the trim shows up only on long threads.
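The first four rows of the table above translate into checks like the ones below. The kit enforces them with Zod; this hand-rolled validator is only a dependency-free stand-in, and findLimitViolations and its message strings are hypothetical.

```typescript
type MessagePart =
  | { type: "text"; text: string }
  | { type: "file"; url: string };

interface IncomingMessage {
  parts: MessagePart[];
}

// Values copied from the per-request limits table.
const LIMITS = {
  maxMessagesPerRequest: 1,
  maxPartsPerMessage: 4,
  maxTextLengthPerPart: 4_000,
  maxUserMessageTextLength: 4_000,
};

// Returns a list of violated limits; an empty list means the request is valid.
function findLimitViolations(messages: IncomingMessage[]): string[] {
  const violations: string[] = [];
  if (messages.length > LIMITS.maxMessagesPerRequest) {
    violations.push("too many messages");
  }
  for (const message of messages) {
    if (message.parts.length > LIMITS.maxPartsPerMessage) {
      violations.push("too many parts");
    }
    let totalText = 0;
    for (const part of message.parts) {
      if (part.type !== "text") continue;
      totalText += part.text.length;
      if (part.text.length > LIMITS.maxTextLengthPerPart) {
        violations.push("text part too long");
      }
    }
    if (totalText > LIMITS.maxUserMessageTextLength) {
      violations.push("message text too long");
    }
  }
  return violations;
}

const validRequest = findLimitViolations([
  {
    parts: [
      { type: "text", text: "Describe this image" },
      { type: "file", url: "https://cdn.example.com/images/user_1/cat.png" },
    ],
  },
]);

// Two text parts of 2,500 characters each pass the per-part check but
// exceed the 4,000-character total for the message.
const splitTextRequest = findLimitViolations([
  {
    parts: [
      { type: "text", text: "x".repeat(2_500) },
      { type: "text", text: "y".repeat(2_500) },
    ],
  },
]);
```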

Abuse Protection

chat.send and chat.regenerate are surfaces in the shared abuse policy. Each request is keyed by userId and organizationId; both characteristics must be present, otherwise the surface fails closed.

When Upstash Redis is not configured, the chat handler logs a single warning per process and continues without rate limits. This is intentional for local development. Configure Upstash before going live; otherwise nothing throttles a runaway client. See Security: Abuse Protection (Upstash).

Tuning happens in packages/shared/src/abuse.ts: each surface declares its window and limit per characteristic, so you can lower the per-user cap without changing the per-org cap.

Adding A New AI Feature

Reuse the same handler shape for a new feature (summarize a doc, generate alt text, draft an email, anything). The pattern is the streaming chat in miniature.

Define the schema in `packages/shared`

Drop a new file at packages/shared/src/schemas/<feature>.ts with Zod input/output schemas, plus any per-request constants (max input length, etc.). Re-export from packages/shared/src/schemas/index.ts so the API and the client see them.

export const summarizeInputSchema = z.object({
  text: z.string().min(1).max(50_000),
  style: z.enum(["bullet", "tldr", "executive"]).default("tldr"),
});

Add usage accounting (optional)

If the feature should count toward a quota, add a model to packages/database/prisma/models/ai.prisma (mirror AiUsageEvent) or extend AiUsageEvent.kind with a new value, then run a migration. Skip this for free, internal-only features.

Write the oRPC procedure

Reuse organizationChatProcedure-style middleware: org-scoped auth, billing assertion, abuse gate, then streamText({ model: gateway(modelId) }) for streams or generateText for one-shot.

import { CHAT_DEFAULT_MODEL_ID } from "@syntaxkit/shared";

export const summarize = authorized
  .use(withActiveOrganization)
  .route({ path: "/summarize", method: "POST" })
  .input(summarizeInputSchema)
  .handler(async ({ context, input }) => {
    const billing = await getBillingState(context.organization.id);
    // Atomically reserve a slot in the monthly cap before spending
    // gateway tokens. Refund the reservation if the model errors.
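    const usage = await reserveAiUsageEvent(billing, context.organization.id, {
      // Reuses the chat kind for brevity; pass your own value here if you
      // extended AiUsageEvent.kind for this feature in the previous step.
      kind: "chat_send",
      chatId: null,
      createdByUserId: context.user.id,
    });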
    try {
      const result = streamText({
        model: gateway(CHAT_DEFAULT_MODEL_ID),
        system: "Summarize the input in the requested style.",
        prompt: input.text,
        onError: () =>
          prisma.aiUsageEvent
            .delete({ where: { id: usage.id } })
            .catch(() => {}),
      });
      return streamToEventIterator(result.toUIMessageStream());
    } catch (error) {
      await prisma.aiUsageEvent
        .delete({ where: { id: usage.id } })
        .catch(() => {});
      throw error;
    }
  });

Pick the right return shape

Streams: return streamToEventIterator(result.toUIMessageStream(...)). One-shots: return { text } from a plain generateText call. Streams give you token-by-token UI; one-shots give you a single mutation result.

Wire the client

For streams, use useChat with the same transport.sendMessages pattern (point it at your new procedure). For one-shots, a plain useMutation(orpc.<feature>.run.mutationOptions()) is fine; the response is a typed object, no streaming bookkeeping.

Compose the UI from `ai-elements`

Drop in <PromptInput> for input and <Conversation> + <Message> + <MessageResponse> for output. Pull in <Reasoning> and <Sources> if your model returns them. None of these primitives are chat-specific; they work for any UIMessage-shaped surface.
