Feature: Trip AI Chat — per-trip assistant in the AI Chat tab
ID: 0012Status: DraftOwner: @satyaCreated: 2026-05-15Updated: 2026-05-15Related ADRs: 0002 (Supabase + Nest), 0003 (Python workers for AI), 0009 (TBD — streaming over SSE via Nest)Depends on: 0001 (trip planner), 0008 (reel imports — for trip context), 0011 (notifications — reuses session conventions)Supersedes: —1. Why
Section titled “1. Why”The fourth tab on the trip detail PageView is currently a “coming soon” placeholder (apps/mobile/lib/features/trips/view/ai_chat_tab.dart). Once a trip exists, the user has nowhere to ask questions about it (“what’s day 3 looking like?”, “any veg-friendly food spots near Shibuya?”, “summarise the budget”), nor any conversational way to lean on AI for planning help without leaving the screen and going to imports. A per-trip assistant fills that gap. It is intentionally trip-scoped — consistent with the product vision’s “AI as a power-tool, not a chatbot” stance (specs/product-vision.md) — every session is bound to one trip and grounded in that trip’s data.
2. Who it is for
Section titled “2. Who it is for”- P1 — Solo planner: needs quick answers about their own trip without scrolling tabs (“when do I land in Tokyo?”, “how much have I budgeted for day 4?”).
- P3 — Inspiration hoarder: after a reel import (spec 0008), wants to ask follow-ups (“what else is near this café?”) without leaving the trip.
- P4 — Active traveller (deferred): on-trip mode benefits later; v0 still works fine in live mode but no special handling.
Out: P2 (group lead) — multi-user shared chat is explicitly out of scope for v0; sessions are per-user.
3. Scope
Section titled “3. Scope”In scope
Section titled “In scope”-
F0012.1Trip-scoped chat surface in the AI Chat tab.F0012.1.aReplace placeholder body inai_chat_tab.dartwith aTripChatCubit-backed conversation list + composer.F0012.1.bOne default chat session per(trip_id, user_id)— auto-created on first visit. Multiple sessions per trip are out of v0 scope (table supports it; UI does not).F0012.1.cComposer: multiline text input, send button, attach button (image from camera/gallery), inline cancel for streaming responses.F0012.1.dEmpty state: 3–5 suggested starter prompts derived from the loaded trip (e.g. “Summarise day 1”, “What’s the food plan?”, “What’s missing?”).
-
F0012.2Persistent history in Supabase.F0012.2.atrip_chat_sessionsandtrip_chat_messagestables (see §7), RLS-scoped to trip members.F0012.2.bGET /v1/trips/:tripId/chat/sessions/defaultreturns (and lazily creates) the user’s session for that trip.F0012.2.cGET /v1/chat/sessions/:id/messages?limit=50&before=…paginates oldest-first within the page, newest page first; client renders most-recent at the bottom.
-
F0012.3Streamed responses through Nest → workers → LiteLLM proxy.F0012.3.aPOST /v1/chat/sessions/:id/messagesaccepts a user message (+ optional attachment refs), persists itstatus='sent', then opens a Server-Sent Events stream and relays token deltas from workers asdata:events.F0012.3.bWorkers exposesPOST /ai/chat/stream(auth: workers token, called only by Nest). Builds the prompt fromtrip_context_snapshot+ last N messages, calls LiteLLM with streaming enabled, yields SSEdelta/done/errorframes.F0012.3.cOndone, workers persists the assistant message (withmodel,tokens_in,tokens_out,cost_usd) and Nest closes the SSE.F0012.3.dCancellation: client disconnect → Nest aborts upstream request → workers persists a partial assistant message withstatus='cancelled'and the prefix received so far.
-
F0012.4Trip context injection (“grounding”).F0012.4.aWorkers builds a compactTripContextSnapshotonce per request: trip title, dates, destinations, day-by-day activities (kind, title, time, cost), cost rollup, traveller list. Hard cap at ~6k tokens; truncate oldest/lowest-priority activities first.F0012.4.bSnapshot is read from Supabase via the user’s JWT (Nest forwards the bearer; workers uses anon client + that JWT) so RLS still applies — never the service-role client for this path.F0012.4.cSystem prompt names the trip and instructs the model to refuse / clarify when asked about anything outside the trip scope; no general-purpose chatbot behaviour.
-
F0012.5Attachments (read-only, image-only in v0).F0012.5.aPOST /v1/trips/:tripId/chat/sign-uploadmints a signed PUT URL intotrip-attachmentsbucket under<trip_id>/chat/<session_id>/<message_id>/….F0012.5.bClient uploads, then sends the message withattachments: [{path, mime, bytes}]. Workers downloads via service role, base64-encodes, sends as vision parts (reusingLiteLLMClient.vision-style multimodal path adapted for chat).F0012.5.cMax 4 images per message, ≤8 MB each, jpeg/png/webp.
-
F0012.6Cost + safety guards.F0012.6.aPer-message ceiling: $0.02 default, configurable via workerssettings.chat_cost_ceiling_usd. Refusal message persisted on breach withstatus='failed'.F0012.6.bPer-session daily ceiling: $0.50; subsequent sends return 429 with a friendly error.F0012.6.cModel defaults toclaude-haiku-4-5-20251001for text-only,claude-sonnet-4-5-20250929when attachments are present. Both routed through LiteLLM proxy (apps/workers/src/treeper_workers/ai/llm.py).
Out of scope (this milestone)
Section titled “Out of scope (this milestone)”- Write-back / tool-calling that mutates the trip. The assistant
can suggest, never auto-edit. A future spec (
0013-trip-ai-actions) adds tool calls to insert/modify activities with a confirm step. - Shared / multi-user sessions. Each user has their own thread.
- Cross-trip chat (“plan a brand new trip for me”) — handled by the prompt-to-itinerary spec (F3.5), not here.
- Voice / audio input or TTS playback.
- Web search / live retrieval. v0 grounds only on Supabase trip data and the model’s parametric knowledge.
- End-of-trip recap generation (F2.6) — owned by tracker spec.
4. User stories
Section titled “4. User stories”- As P1, I open my trip → AI Chat tab → see a fresh thread with starter chips → tap “Summarise day 1” → get a streamed answer grounded in my actual itinerary within seconds.
- As P1, I close the app mid-stream; reopening shows my user message preserved and the partial assistant message marked as cancelled, with a “retry” affordance.
- As P3, after a reel imports successfully, I switch to AI Chat and ask “any veg restaurants near these spots?” — the assistant uses the activities I just imported, not generic city advice.
- As P1, I attach a photo of a paper itinerary and ask “is this already in my plan?” — assistant compares against the trip and flags overlaps.
- As P1, I send 100 messages in a day on one trip — the 101st returns a clear “daily AI budget reached, resets at midnight UTC”.
5. UX notes
Section titled “5. UX notes”- Layout (replaces current placeholder; same
surfaceCreambackground, design-system tokens):- Top bar: implicit (uses the existing trip AppBar) — no extra title needed.
- Message list: reverse-chronological in code, visually
bottom-anchored. User bubbles right-aligned with
accentLimefill; assistant bubbles left-aligned, surface-card. Cancelled / failed bubbles get a muted treatment + small footer line. - Streaming bubble shows a typing-dot indicator until first delta, then progressively fills.
- Composer pinned to the bottom (above the floating bottom nav — keep 120 px nav clearance pattern already used in the placeholder).
- Starter chips (empty state only): wrap-row of 3–5 chips; tap drops the prompt straight into the composer (does not auto-send).
- Attachments: bottom-sheet picker reusing
activity_attachments_sheet.dartpatterns; pill row of thumbnails above the composer before sending. - Error states:
- Network drop mid-stream → snackbar + retry button on the bubble.
- Budget breach → inline assistant bubble with system styling, no retry.
- Accessibility: VoiceOver/TalkBack reads new assistant messages on stream completion (not every delta); send button is the primary focus target after send-success.
6. Acceptance criteria
Section titled “6. Acceptance criteria”AC-1 F0012.1.a Visiting the AI Chat tab on a trip with no prior history shows the empty state and starter chips within 200 ms of the cubit's first emit.AC-2 F0012.1.b First send creates exactly one row in trip_chat_sessions for (trip_id, user_id); a second visit reuses the same session id.AC-3 F0012.2.c GET /v1/chat/sessions/:id/messages returns the user and assistant messages of the prior turn in created_at order; tokens_in/out/cost_usd are populated on the assistant row.AC-4 F0012.3.a POST /v1/chat/sessions/:id/messages opens an SSE response; the first `data:` frame arrives within 1.5 s p95 on the staging proxy.AC-5 F0012.3.d Disconnecting the SSE mid-stream produces an assistant row with status='cancelled' and `content` equal to the prefix already streamed (no nulls, no dupes).AC-6 F0012.4.b A user without trip membership receives 403 on the messages endpoint; the workers context-snapshot query returns zero rows under their JWT (RLS proof).AC-7 F0012.4.c Asked "what's the weather in Paris today?" on a Tokyo trip, the assistant declines or redirects to the trip — no general-knowledge answer is returned (manual eval, ≥4/5 prompts).AC-8 F0012.5.b An image attachment is downloaded via service role, sent to the vision-capable model, and the attachment row links the storage path; assistant references the image content in its reply.AC-9 F0012.6.a A message whose run exceeds the per-message ceiling stops streaming, persists status='failed' with the partial content, and returns a 200 SSE terminated by an `error` frame (not 5xx).AC-10 F0012.6.b The 51st message in a single UTC day on one session returns 429 with code `daily_budget_reached`.AC-11 F0012.3.c Across 20 sequential turns, no orphan rows are produced (every `user` message has either a paired `assistant` row or a `failed`/`cancelled` row).7. Data model
Section titled “7. Data model”Table sketch only; real schema lives in
infra/supabase/migrations/0020_trip_chat.sql.
trip_chat_sessions id uuid pk trip_id uuid fk → trips(id) on delete cascade user_id uuid fk → auth.users(id) title text -- short auto-generated label model text -- last model used created_at timestamptz default now() updated_at timestamptz unique (trip_id, user_id) -- v0: one session per (trip,user)
trip_chat_messages id uuid pk session_id uuid fk → trip_chat_sessions(id) on delete cascade role chat_role enum ('user','assistant','system','tool') content text -- final text (may be partial if cancelled/failed) parts jsonb -- multimodal/tool parts, future-proof status chat_message_status enum -- 'sent','streaming','done','cancelled','failed' error text model text tokens_in int tokens_out int cost_usd numeric(10,4) created_at timestamptz default now()
trip_chat_attachments id uuid pk message_id uuid fk → trip_chat_messages(id) on delete cascade storage_path text -- 'trip-attachments' bucket key mime text bytes bigint created_at timestamptz default now()RLS (mirrors existing trip-scoped policies):
trip_chat_sessions: select/insert/update whereauth.uid() = user_id AND user is a member of trip_id(reuseis_trip_member(trip_id, auth.uid())helper from earlier migrations).trip_chat_messages: select/insert via session membership check.trip_chat_attachments: same, joined through message → session.- Workers’ service role bypasses RLS as today; user-context calls go through the user JWT (see F0012.4.b).
Storage: reuse the trip-attachments bucket; path prefix
<trip_id>/chat/<session_id>/<message_id>/…. Existing bucket policies
already grant trip-member read/write on that prefix — no new bucket
needed.
8. APIs / contracts
Section titled “8. APIs / contracts”Backend (Nest, JWT-guarded, mounted under existing v1 prefix):
GET /v1/trips/:tripId/chat/sessions/default → { session: TripChatSession } (creates lazily)
GET /v1/chat/sessions/:id/messages?limit=50&before=<iso> → { messages: TripChatMessage[], hasMore: boolean }
POST /v1/chat/sessions/:id/messages body: { content: string, attachments?: AttachmentRef[] } response: text/event-stream event: ack data: { user_message_id, assistant_message_id } event: delta data: { text } event: done data: { tokens_in, tokens_out, cost_usd, model } event: error data: { code, message }
POST /v1/trips/:tripId/chat/sign-upload body: { mime, bytes, message_session_id } → { uploadUrl, storagePath, expiresAt }
DELETE /v1/chat/sessions/:id (purge entire session for the user)Workers (workers-token-guarded, called only by Nest):
POST /ai/chat/stream body: { session_id, trip_id, user_jwt, history_cutoff, attachments? } response: text/event-stream (delta | done | error)Mobile (Flutter):
TripChatRepository→TripChatRemoteSource(Dio) +TripChatStreamClient(SSE;flutter_client_sseor hand-rolled overhttp.Client.send).TripChatCubitstates:Initial | Loading | Ready(messages, sending?, streamingId?) | Failure.
9. Non-functional requirements
Section titled “9. Non-functional requirements”| Concern | Target |
|---|---|
| Time to first token | ≤1.5 s p95 (staging LiteLLM → Haiku); ≤3.0 s p95 with image |
| Full turn latency | ≤6 s p95 text-only at ~300 output tokens |
| Cost | ≤$0.02/message average; ≤$0.50/session/day hard cap |
| Offline | Tab shows an empty/offline state with composer disabled and a “reconnect” hint; no local Drift cache in v0 (see Q3 / R-deferred) |
| Privacy | RLS enforced via user JWT for context snapshot read; service role only used for storage download + history writes |
| Observability | Structlog events chat.turn.started/streamed/finished/failed with session_id, trip_id, tokens, cost, model |
| Accessibility | Screen-reader announces assistant reply once (on done), not per delta |
| Failure isolation | A failing LiteLLM call must persist a failed row + error string; never leave a streaming row to rot |
10. Risks & open questions
Section titled “10. Risks & open questions”- Q1: SSE through Nest vs direct mobile→workers? Going through Nest preserves the single auth boundary and matches every other feature, at the cost of one extra hop. Decision pending; if latency budget is missed, fold into ADR 0009.
- Q2: Do we need a true streaming chat method on
LiteLLMClient? Current client is instructor-only (structured). Likely add a parallelchat_stream(...) -> AsyncIterator[str]next tocomplete()rather than refactoring instructor. - Q3: Local Drift schema for cached messages — resolved: v0 skips local caching. History is fetched from Supabase on tab open; offline = read-only empty state. Revisit if users complain.
- R1: Token blow-out on long trips. Mitigation: snapshot cap + last N (≤20) message rolling window; older messages summarised by a cheap pre-pass only when the window is full (defer to v0.1 if not needed).
- R2: Prompt injection from imported reels’ captions appearing in trip data. Mitigation: snapshot renders trip data inside a clearly fenced block with an instruction to ignore directives within it; add a regression eval set (≥10 prompts) before shipping.
- R3: Per-user daily cap is global per session; abusive users
could create new trips to multiply quota. Track usage by
user_idtoo (column onuser_prefsor a separateai_usage_dailytable) — add in a follow-up if abuse appears. - R4: Cancellation race — client retries while workers still
streaming the previous reply. Mitigation: assistant message id is
minted up-front (returned in
ack) and used as an idempotency key; second POST with the same in-flight assistant id is rejected.
11. Rollout plan
Section titled “11. Rollout plan”- Migration 0020 — tables + RLS, no behaviour change. Ship independent of clients.
- Workers —
/ai/chat/streamendpoint behind a feature envCHAT_ENABLED=true. Smoke-tested withhttpx --streamand a golden trip fixture. - Backend —
/v1/chat/...endpoints, SSE relay, sign-upload. Integration tests cover AC-2, AC-4, AC-9, AC-10 with a stubbed workers fake. - Mobile — replace
AiChatTabplaceholder with the real widget tree behind a remotechat_enabledflag (or local debug toggle if no flag service yet). Internal dogfood first. - GA — flip the flag globally; AppBar / tab icon unchanged.
No data backfill needed (history begins on launch). Rollback = flip flag off + keep tables (no destructive change).
12. References
Section titled “12. References”- Placeholder being replaced: apps/mobile/lib/features/trips/view/ai_chat_tab.dart
- LiteLLM client to extend: apps/workers/src/treeper_workers/ai/llm.py
- Worker config (LiteLLM env): apps/workers/src/treeper_workers/config.py
- Reel imports (source of trip context that this assistant grounds on): specs/features/0008-reel-video-imports.md
- Notifications outbox + RLS pattern reused for session writes: specs/features/0011-notifications.md
- ADR on Supabase + Nest split: specs/adr/0002-supabase-with-nestjs-api.md
- ADR on Python workers boundary: specs/adr/0003-python-workers-for-ai-and-scraping.md