Feature: Trip AI Chat — per-trip assistant in the AI Chat tab

ID: 0012
Status: Draft
Owner: @satya
Created: 2026-05-15
Updated: 2026-05-15
Related ADRs: 0002 (Supabase + Nest), 0003 (Python workers for AI), 0009 (TBD — streaming over SSE via Nest)
Depends on: 0001 (trip planner), 0008 (reel imports — for trip context), 0011 (notifications — reuses session conventions)
Supersedes: —

The fourth tab on the trip detail PageView is currently a “coming soon” placeholder (apps/mobile/lib/features/trips/view/ai_chat_tab.dart). Once a trip exists, the user has nowhere to ask questions about it (“what’s day 3 looking like?”, “any veg-friendly food spots near Shibuya?”, “summarise the budget”) and no conversational way to get AI planning help without leaving the screen for imports. A per-trip assistant fills that gap. It is intentionally trip-scoped, consistent with the product vision’s “AI as a power-tool, not a chatbot” stance (specs/product-vision.md): every session is bound to one trip and grounded in that trip’s data.

  • P1 — Solo planner: needs quick answers about their own trip without scrolling tabs (“when do I land in Tokyo?”, “how much have I budgeted for day 4?”).
  • P3 — Inspiration hoarder: after a reel import (spec 0008), wants to ask follow-ups (“what else is near this café?”) without leaving the trip.
  • P4 — Active traveller (deferred): on-trip mode will benefit later; v0 works in live mode but adds no special handling.

Out: P2 (group lead) — multi-user shared chat is explicitly out of scope for v0; sessions are per-user.

  • F0012.1 Trip-scoped chat surface in the AI Chat tab.

    • F0012.1.a Replace placeholder body in ai_chat_tab.dart with a TripChatCubit-backed conversation list + composer.
    • F0012.1.b One default chat session per (trip_id, user_id) — auto-created on first visit. Multiple sessions per trip are out of v0 scope (table supports it; UI does not).
    • F0012.1.c Composer: multiline text input, send button, attach button (image from camera/gallery), inline cancel for streaming responses.
    • F0012.1.d Empty state: 3–5 suggested starter prompts derived from the loaded trip (e.g. “Summarise day 1”, “What’s the food plan?”, “What’s missing?”).
  • F0012.2 Persistent history in Supabase.

    • F0012.2.a trip_chat_sessions and trip_chat_messages tables (see §7), RLS-scoped to trip members.
    • F0012.2.b GET /v1/trips/:tripId/chat/sessions/default returns (and lazily creates) the user’s session for that trip.
    • F0012.2.c GET /v1/chat/sessions/:id/messages?limit=50&before=… paginates oldest-first within the page, newest page first; client renders most-recent at the bottom.
  • F0012.3 Streamed responses through Nest → workers → LiteLLM proxy.

    • F0012.3.a POST /v1/chat/sessions/:id/messages accepts a user message (+ optional attachment refs), persists it with status='sent', then opens a Server-Sent Events stream and relays token deltas from workers as data: events.
    • F0012.3.b Workers exposes POST /ai/chat/stream (auth: workers token, called only by Nest). Builds the prompt from trip_context_snapshot + last N messages, calls LiteLLM with streaming enabled, yields SSE delta / done / error frames.
    • F0012.3.c On done, workers persists the assistant message (with model, tokens_in, tokens_out, cost_usd) and Nest closes the SSE.
    • F0012.3.d Cancellation: client disconnect → Nest aborts upstream request → workers persists a partial assistant message with status='cancelled' and the prefix received so far.
  • F0012.4 Trip context injection (“grounding”).

    • F0012.4.a Workers builds a compact TripContextSnapshot once per request: trip title, dates, destinations, day-by-day activities (kind, title, time, cost), cost rollup, traveller list. Hard cap at ~6k tokens; truncate oldest/lowest-priority activities first.
    • F0012.4.b Snapshot is read from Supabase via the user’s JWT (Nest forwards the bearer; workers uses anon client + that JWT) so RLS still applies — never the service-role client for this path.
    • F0012.4.c System prompt names the trip and instructs the model to refuse / clarify when asked about anything outside the trip scope; no general-purpose chatbot behaviour.
  • F0012.5 Attachments (read-only, image-only in v0).

    • F0012.5.a POST /v1/trips/:tripId/chat/sign-upload mints a signed PUT URL into trip-attachments bucket under <trip_id>/chat/<session_id>/<message_id>/….
    • F0012.5.b Client uploads, then sends the message with attachments: [{path, mime, bytes}]. Workers downloads via service role, base64-encodes, sends as vision parts (reusing LiteLLMClient.vision-style multimodal path adapted for chat).
    • F0012.5.c Max 4 images per message, ≤8 MB each, jpeg/png/webp.
  • F0012.6 Cost + safety guards.

    • F0012.6.a Per-message ceiling: $0.02 default, configurable via workers settings.chat_cost_ceiling_usd. Refusal message persisted on breach with status='failed'.
    • F0012.6.b Per-session daily ceiling: $0.50; subsequent sends return 429 with a friendly error.
    • F0012.6.c Model defaults to claude-haiku-4-5-20251001 for text-only, claude-sonnet-4-5-20250929 when attachments are present. Both routed through LiteLLM proxy (apps/workers/src/treeper_workers/ai/llm.py).
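
The guards in F0012.6 can be sketched as a pre-flight check plus a running meter. This is a minimal illustration, not the workers implementation: the constant names, `TurnMeter`, and the per-token prices used in any example are hypothetical, and it assumes that a session whose spend has reached exactly $0.50 is already blocked.

```python
from dataclasses import dataclass
from decimal import Decimal

# Values from F0012.6; the names are illustrative, not the real settings module.
PER_MESSAGE_CEILING_USD = Decimal("0.02")
PER_SESSION_DAILY_CEILING_USD = Decimal("0.50")
TEXT_MODEL = "claude-haiku-4-5-20251001"
VISION_MODEL = "claude-sonnet-4-5-20250929"


def pick_model(has_attachments: bool) -> str:
    """F0012.6.c: vision-capable model only when attachments are present."""
    return VISION_MODEL if has_attachments else TEXT_MODEL


def can_start_turn(spent_today_usd: Decimal) -> bool:
    """Pre-flight check; False corresponds to the 429 in F0012.6.b.

    Assumption: reaching the ceiling exactly already blocks the next send.
    """
    return spent_today_usd < PER_SESSION_DAILY_CEILING_USD


@dataclass
class TurnMeter:
    """Accumulates the running cost of one assistant reply."""

    usd_per_input_token: Decimal   # hypothetical price, supplied by the caller
    usd_per_output_token: Decimal  # hypothetical price, supplied by the caller
    tokens_in: int = 0
    tokens_out: int = 0

    @property
    def cost_usd(self) -> Decimal:
        return (self.tokens_in * self.usd_per_input_token
                + self.tokens_out * self.usd_per_output_token)

    def add_output(self, n_tokens: int) -> bool:
        """Record streamed tokens; return False once the per-message ceiling is exceeded."""
        self.tokens_out += n_tokens
        return self.cost_usd <= PER_MESSAGE_CEILING_USD
```

A caller would stop streaming and persist status='failed' the first time `add_output` returns False. Using `Decimal` avoids float drift when many small per-token costs are summed.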
Out of scope (v0):

  • Write-back / tool-calling that mutates the trip. The assistant can suggest, never auto-edit. A future spec (0013-trip-ai-actions) adds tool calls to insert/modify activities with a confirm step.
  • Shared / multi-user sessions. Each user has their own thread.
  • Cross-trip chat (“plan a brand new trip for me”) — handled by the prompt-to-itinerary spec (F3.5), not here.
  • Voice / audio input or TTS playback.
  • Web search / live retrieval. v0 grounds only on Supabase trip data and the model’s parametric knowledge.
  • End-of-trip recap generation (F2.6) — owned by tracker spec.
  • As P1, I open my trip → AI Chat tab → see a fresh thread with starter chips → tap “Summarise day 1” → get a streamed answer grounded in my actual itinerary within seconds.
  • As P1, I close the app mid-stream; reopening shows my user message preserved and the partial assistant message marked as cancelled, with a “retry” affordance.
  • As P3, after a reel imports successfully, I switch to AI Chat and ask “any veg restaurants near these spots?” — the assistant uses the activities I just imported, not generic city advice.
  • As P1, I attach a photo of a paper itinerary and ask “is this already in my plan?” — assistant compares against the trip and flags overlaps.
  • As P1, once my messages on one trip have used up the $0.50 daily session budget, my next send returns a clear “daily AI budget reached, resets at midnight UTC”.
  • Layout (replaces current placeholder; same surfaceCream background, design-system tokens):
    • Top bar: implicit (uses the existing trip AppBar) — no extra title needed.
    • Message list: reverse-chronological in code, visually bottom-anchored. User bubbles right-aligned with accentLime fill; assistant bubbles left-aligned, surface-card. Cancelled / failed bubbles get a muted treatment + small footer line.
    • Streaming bubble shows a typing-dot indicator until first delta, then progressively fills.
    • Composer pinned to the bottom (above the floating bottom nav — keep 120 px nav clearance pattern already used in the placeholder).
  • Starter chips (empty state only): wrap-row of 3–5 chips; tap drops the prompt straight into the composer (does not auto-send).
  • Attachments: bottom-sheet picker reusing activity_attachments_sheet.dart patterns; pill row of thumbnails above the composer before sending.
  • Error states:
    • Network drop mid-stream → snackbar + retry button on the bubble.
    • Budget breach → inline assistant bubble with system styling, no retry.
  • Accessibility: VoiceOver/TalkBack reads new assistant messages on stream completion (not every delta); send button is the primary focus target after send-success.
AC-1   F0012.1.a  Visiting the AI Chat tab on a trip with no prior history shows the empty state and starter chips within 200 ms of the cubit's first emit.
AC-2   F0012.1.b  The first visit creates exactly one row in trip_chat_sessions for (trip_id, user_id); a second visit reuses the same session id.
AC-3   F0012.2.c  GET /v1/chat/sessions/:id/messages returns the user and assistant messages of the prior turn in created_at order; tokens_in/out/cost_usd are populated on the assistant row.
AC-4   F0012.3.a  POST /v1/chat/sessions/:id/messages opens an SSE response; the first `data:` frame arrives within 1.5 s p95 on the staging proxy.
AC-5   F0012.3.d  Disconnecting the SSE mid-stream produces an assistant row with status='cancelled' and `content` equal to the prefix already streamed (no nulls, no dupes).
AC-6   F0012.4.b  A user without trip membership receives 403 on the messages endpoint; the workers context-snapshot query returns zero rows under their JWT (RLS proof).
AC-7   F0012.4.c  Asked "what's the weather in Paris today?" on a Tokyo trip, the assistant declines or redirects to the trip; no general-knowledge answer is returned (manual eval, ≥4/5 prompts).
AC-8   F0012.5.b  An image attachment is downloaded via service role, sent to the vision-capable model, and the attachment row links the storage path; the assistant references the image content in its reply.
AC-9   F0012.6.a  A message whose run exceeds the per-message ceiling stops streaming, persists status='failed' with the partial content, and returns a 200 SSE terminated by an `error` frame (not 5xx).
AC-10  F0012.6.b  Once a session's spend for the current UTC day reaches the $0.50 ceiling, the next send returns 429 with code `daily_budget_reached`.
AC-11  F0012.3.c  Across 20 sequential turns, no orphan rows are produced (every `user` message has either a paired `assistant` row or a `failed`/`cancelled` row).
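
AC-5 and AC-11 both depend on the assistant row reaching a terminal status on every exit path. The sketch below shows one way to get that with asyncio cancellation. It is illustrative only: `relay_with_cancel`, `demo_disconnect`, and the in-memory `row` dict are hypothetical stand-ins for the real relay and the trip_chat_messages row.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable


async def relay_with_cancel(
    deltas: AsyncIterator[str],
    emit: Callable[[str], Awaitable[None]],
    row: dict,
) -> None:
    """Forward deltas to the client and leave `row` in a terminal status on every exit."""
    row.update(status="streaming", content="")
    try:
        async for text in deltas:
            await emit(text)        # cancellation can land here on client disconnect
            row["content"] += text  # only text that was actually emitted is recorded
        row["status"] = "done"
    except asyncio.CancelledError:
        row["status"] = "cancelled"  # content holds the prefix streamed so far
        raise
    except Exception as exc:
        row["status"] = "failed"
        row["error"] = str(exc)


async def demo_disconnect() -> dict:
    """Simulate a client that disconnects after two deltas while upstream stalls."""
    two_sent = asyncio.Event()
    sent: list[str] = []

    async def fake_model() -> AsyncIterator[str]:
        yield "Day 3 "
        yield "starts "
        await asyncio.Event().wait()  # upstream never produces the next chunk
        yield "in Shibuya."

    async def emit(text: str) -> None:
        sent.append(text)
        if len(sent) == 2:
            two_sent.set()

    row: dict = {}
    task = asyncio.create_task(relay_with_cancel(fake_model(), emit, row))
    await two_sent.wait()
    task.cancel()  # stands in for Nest aborting the upstream request
    try:
        await task
    except asyncio.CancelledError:
        pass
    return row
```

In a real service the write of the cancelled row would itself need protecting from cancellation (for example with `asyncio.shield`), which this sketch leaves out.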

Table sketch only; real schema lives in infra/supabase/migrations/0020_trip_chat.sql.

trip_chat_sessions
  id          uuid pk
  trip_id     uuid fk → trips(id) on delete cascade
  user_id     uuid fk → auth.users(id)
  title       text         -- short auto-generated label
  model       text         -- last model used
  created_at  timestamptz default now()
  updated_at  timestamptz
  unique (trip_id, user_id)  -- v0: one session per (trip,user)

trip_chat_messages
  id          uuid pk
  session_id  uuid fk → trip_chat_sessions(id) on delete cascade
  role        chat_role enum ('user','assistant','system','tool')
  content     text         -- final text (may be partial if cancelled/failed)
  parts       jsonb        -- multimodal/tool parts, future-proof
  status      chat_message_status enum  -- 'sent','streaming','done','cancelled','failed'
  error       text
  model       text
  tokens_in   int
  tokens_out  int
  cost_usd    numeric(10,4)
  created_at  timestamptz default now()

trip_chat_attachments
  id            uuid pk
  message_id    uuid fk → trip_chat_messages(id) on delete cascade
  storage_path  text       -- 'trip-attachments' bucket key
  mime          text
  bytes         bigint
  created_at    timestamptz default now()
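
The unique (trip_id, user_id) constraint is what makes the lazy create in F0012.2.b safe to call repeatedly. The sketch below demonstrates the pattern with an in-memory SQLite database standing in for Supabase Postgres. The function names are hypothetical, and the table is cut down to the three columns the pattern needs.

```python
import sqlite3
import uuid


def open_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for trip_chat_sessions (subset of columns)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """
        CREATE TABLE trip_chat_sessions (
            id      TEXT PRIMARY KEY,
            trip_id TEXT NOT NULL,
            user_id TEXT NOT NULL,
            UNIQUE (trip_id, user_id)
        )
        """
    )
    return db


def get_or_create_default_session(db: sqlite3.Connection, trip_id: str, user_id: str) -> str:
    """Return the session id for (trip_id, user_id), inserting a row only if none exists."""
    # INSERT OR IGNORE is SQLite syntax; in Postgres the equivalent is
    # INSERT ... ON CONFLICT (trip_id, user_id) DO NOTHING.
    db.execute(
        "INSERT OR IGNORE INTO trip_chat_sessions (id, trip_id, user_id) VALUES (?, ?, ?)",
        (str(uuid.uuid4()), trip_id, user_id),
    )
    row = db.execute(
        "SELECT id FROM trip_chat_sessions WHERE trip_id = ? AND user_id = ?",
        (trip_id, user_id),
    ).fetchone()
    return row[0]
```

Letting the constraint arbitrate, rather than doing a select followed by a conditional insert, means two concurrent first visits cannot create two sessions.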

RLS (mirrors existing trip-scoped policies):

  • trip_chat_sessions: select/insert/update where auth.uid() = user_id AND user is a member of trip_id (reuse is_trip_member(trip_id, auth.uid()) helper from earlier migrations).
  • trip_chat_messages: select/insert via session membership check.
  • trip_chat_attachments: same, joined through message → session.
  • Workers’ service role bypasses RLS as today; user-context calls go through the user JWT (see F0012.4.b).

Storage: reuse the trip-attachments bucket; path prefix <trip_id>/chat/<session_id>/<message_id>/…. Existing bucket policies already grant trip-member read/write on that prefix — no new bucket needed.
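
The path prefix and the limits in F0012.5.c lend themselves to a small server-side check before a message is accepted. This is a sketch under assumptions: the helper names are hypothetical, “8 MB” is read as 8 MiB, and the mime strings are the standard ones for jpeg/png/webp.

```python
from dataclasses import dataclass

MAX_IMAGES = 4
MAX_BYTES = 8 * 1024 * 1024  # assumption: "8 MB" is interpreted as 8 MiB
ALLOWED_MIME = {"image/jpeg", "image/png", "image/webp"}


@dataclass(frozen=True)
class AttachmentRef:
    """Mirrors the {path, mime, bytes} shape from F0012.5.b."""

    path: str
    mime: str
    bytes: int


def chat_prefix(trip_id: str, session_id: str, message_id: str) -> str:
    """Build the storage prefix <trip_id>/chat/<session_id>/<message_id>/."""
    return f"{trip_id}/chat/{session_id}/{message_id}/"


def validate_attachments(refs: list[AttachmentRef], prefix: str) -> list[str]:
    """Return human-readable problems; an empty list means the refs are acceptable."""
    errors: list[str] = []
    if len(refs) > MAX_IMAGES:
        errors.append(f"too many attachments: {len(refs)} > {MAX_IMAGES}")
    for ref in refs:
        if not ref.path.startswith(prefix) or ".." in ref.path:
            errors.append(f"{ref.path}: outside the message prefix")
        if ref.mime not in ALLOWED_MIME:
            errors.append(f"{ref.path}: unsupported type {ref.mime}")
        if not 0 < ref.bytes <= MAX_BYTES:
            errors.append(f"{ref.path}: size {ref.bytes} out of range")
    return errors
```

Checking the prefix on the server matters because workers later downloads these paths with the service role, which bypasses bucket policies.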

Backend (Nest, JWT-guarded, mounted under existing v1 prefix):

GET /v1/trips/:tripId/chat/sessions/default
  → { session: TripChatSession }   (creates lazily)

GET /v1/chat/sessions/:id/messages?limit=50&before=<iso>
  → { messages: TripChatMessage[], hasMore: boolean }

POST /v1/chat/sessions/:id/messages
  body:     { content: string, attachments?: AttachmentRef[] }
  response: text/event-stream
    event: ack    data: { user_message_id, assistant_message_id }
    event: delta  data: { text }
    event: done   data: { tokens_in, tokens_out, cost_usd, model }
    event: error  data: { code, message }

POST /v1/trips/:tripId/chat/sign-upload
  body: { mime, bytes, message_session_id }
  → { uploadUrl, storagePath, expiresAt }

DELETE /v1/chat/sessions/:id   (purge entire session for the user)
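
On the wire, each of the four events above is an SSE frame: an `event:` line, a `data:` line, and a terminating blank line. The sketch below encodes and decodes such frames. The function names are hypothetical, and the decoder handles only the single-line JSON payloads that the encoder produces, not the full SSE grammar.

```python
import json
from typing import Iterator


def encode_event(event: str, data: dict) -> str:
    """Serialise one SSE frame: `event:` line, `data:` line, blank line."""
    # json.dumps escapes newlines inside strings, so the payload stays on one line.
    payload = json.dumps(data, separators=(",", ":"))
    return f"event: {event}\ndata: {payload}\n\n"


def decode_events(stream: str) -> Iterator[tuple[str, dict]]:
    """Parse frames produced by encode_event (single-line JSON data only)."""
    for frame in stream.split("\n\n"):
        event, data = "message", None  # "message" is the SSE default event name
        for line in frame.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data = json.loads(line[len("data:"):].strip())
        if data is not None:
            yield event, data
```

For example, `encode_event("delta", {"text": "hi"})` produces the frame `event: delta`, `data: {"text":"hi"}`, followed by a blank line.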

Workers (workers-token-guarded, called only by Nest):

POST /ai/chat/stream
  body:     { session_id, trip_id, user_jwt, history_cutoff, attachments? }
  response: text/event-stream (delta | done | error)
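
Before calling the model, workers has to fit the trip snapshot and recent history into a bounded prompt (F0012.4.a, R1). The sketch below shows one possible trimming policy. Several parts are assumptions and not part of the spec: the `Activity.priority` field, the 4-characters-per-token estimate, and reading “oldest” as “earliest day”.

```python
from dataclasses import dataclass

SNAPSHOT_TOKEN_CAP = 6_000  # "~6k tokens" from F0012.4.a
HISTORY_WINDOW = 20         # "last N (≤20)" from R1


@dataclass
class Activity:
    day: int
    title: str
    priority: int  # hypothetical field: higher means more important to keep


def estimate_tokens(text: str) -> int:
    """Crude heuristic (about 4 characters per token); a real tokenizer would replace it."""
    return max(1, len(text) // 4)


def fit_activities(activities: list[Activity], budget: int = SNAPSHOT_TOKEN_CAP) -> list[Activity]:
    """Keep high-priority, later-day activities first; stop once the budget is spent."""
    by_keep_order = sorted(activities, key=lambda a: (-a.priority, -a.day))
    kept: list[Activity] = []
    used = 0
    for act in by_keep_order:
        cost = estimate_tokens(act.title)
        if used + cost > budget:
            break  # everything after this point ranks lower, so drop it too
        kept.append(act)
        used += cost
    return sorted(kept, key=lambda a: a.day)  # restore itinerary order for the prompt


def last_n(messages: list[dict], n: int = HISTORY_WINDOW) -> list[dict]:
    """Rolling history window: the n most recent messages, oldest first."""
    return messages[-n:]
```

Stopping at the first activity that does not fit keeps the result predictable: nothing of lower rank is ever included while something of higher rank is dropped.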

Mobile (Flutter):

  • TripChatRepository → TripChatRemoteSource (Dio) + TripChatStreamClient (SSE; flutter_client_sse or hand-rolled over http.Client.send).
  • TripChatCubit states: Initial | Loading | Ready(messages, sending?, streamingId?) | Failure.
Concern              Target
Time to first token  ≤1.5 s p95 (staging LiteLLM → Haiku); ≤3.0 s p95 with image
Full turn latency    ≤6 s p95 text-only at ~300 output tokens
Cost                 ≤$0.02/message average; ≤$0.50/session/day hard cap
Offline              Tab shows an empty/offline state with composer disabled and a “reconnect” hint; no local Drift cache in v0 (see Q3 / R-deferred)
Privacy              RLS enforced via user JWT for context snapshot read; service role only used for storage download + history writes
Observability        Structlog events chat.turn.started/streamed/finished/failed with session_id, trip_id, tokens, cost, model
Accessibility        Screen-reader announces assistant reply once (on done), not per delta
Failure isolation    A failing LiteLLM call must persist a failed row + error string; never leave a streaming row to rot
  • Q1: SSE through Nest vs direct mobile→workers? Going through Nest preserves the single auth boundary and matches every other feature, at the cost of one extra hop. Decision pending; if latency budget is missed, fold into ADR 0009.
  • Q2: Do we need a true streaming chat method on LiteLLMClient? Current client is instructor-only (structured). Likely add a parallel chat_stream(...) -> AsyncIterator[str] next to complete() rather than refactoring instructor.
  • Q3: Local Drift schema for cached messages — resolved: v0 skips local caching. History is fetched from Supabase on tab open; offline = read-only empty state. Revisit if users complain.
  • R1: Token blow-out on long trips. Mitigation: snapshot cap + last N (≤20) message rolling window; older messages summarised by a cheap pre-pass only when the window is full (defer to v0.1 if not needed).
  • R2: Prompt injection from imported reels’ captions appearing in trip data. Mitigation: snapshot renders trip data inside a clearly fenced block with an instruction to ignore directives within it; add a regression eval set (≥10 prompts) before shipping.
  • R3: The daily cap is enforced per session, so abusive users could create new trips to multiply their quota. Mitigation: also track usage by user_id (a column on user_prefs or a separate ai_usage_daily table); add in a follow-up if abuse appears.
  • R4: Cancellation race, where the client retries while workers is still streaming the previous reply. Mitigation: assistant message id is minted up-front (returned in ack) and used as an idempotency key; a second POST with the same in-flight assistant id is rejected.
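
The R4 mitigation amounts to a registry of assistant message ids that are still streaming. The sketch below is a single-process, in-memory illustration only. `InFlightTurns` is a hypothetical name, and it assumes a retry carries the assistant message id from the original `ack`, which the request body sketched for the messages endpoint does not yet include.

```python
import uuid


class InFlightTurns:
    """Tracks assistant message ids that are still streaming (in-memory, one process)."""

    def __init__(self) -> None:
        self._active: set[str] = set()

    def mint(self) -> str:
        """Create the assistant message id that the `ack` event returns."""
        return str(uuid.uuid4())

    def begin(self, assistant_message_id: str) -> bool:
        """Return False if this id is already streaming, so the caller can reject the POST."""
        if assistant_message_id in self._active:
            return False
        self._active.add(assistant_message_id)
        return True

    def finish(self, assistant_message_id: str) -> None:
        """Call on done, cancelled, or failed so that a later retry is accepted."""
        self._active.discard(assistant_message_id)
```

With more than one Nest instance the same check would have to live in shared state, for example by treating a row with status='streaming' as the lock.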
  1. Migration 0020: tables + RLS, no behaviour change. Ship independent of clients.
  2. Workers: /ai/chat/stream endpoint behind a feature env var CHAT_ENABLED=true. Smoke-tested with a streaming httpx request and a golden trip fixture.
  3. Backend: /v1/chat/... endpoints, SSE relay, sign-upload. Integration tests cover AC-2, AC-4, AC-9, AC-10 with a stubbed workers fake.
  4. Mobile: replace AiChatTab placeholder with the real widget tree behind a remote chat_enabled flag (or local debug toggle if no flag service yet). Internal dogfood first.
  5. GA: flip the flag globally; AppBar / tab icon unchanged.

No data backfill needed (history begins on launch). Rollback = flip flag off + keep tables (no destructive change).