0003 — Trip Imports (PDF / Image / Blog → Itinerary)
Status: Shipped (v0 ingest pipeline live end-to-end)
Owners: @satya
Updated: 2026-05-12
Depends on: 0001 trip-planner (trips, activities, attachments bucket)
Live surfaces: Python worker pipeline (apps/workers/), NestJS imports module (apps/backend/src/imports/), mobile import sheets + review screen (apps/mobile/lib/features/trips/). See ../progress.md for the milestone log.
1. Problem
Users plan trips by collecting fragments: a PDF voucher from a tour operator, a screenshot of a friend’s reel, a brochure photo, a travel blog with a day-by-day breakdown. Re-typing those into the trip itinerary is the single biggest friction point post-M8.
2. Goal
Let a user attach any of {PDF, image, blog URL} to a trip and get a
draft itinerary back within ~30s that they can review, edit, and
selectively commit into the trip’s activities table.
3. Non-goals (this milestone)
- YouTube transcripts / Reels / TikTok (M10; needs scraper + Whisper).
- Audio note imports.
- Multi-trip batch imports.
- Cross-user sharing of parsed drafts.
4. Architecture
    flowchart LR
      Flutter[["Flutter<br/>mobile app"]]
      Nest[["NestJS backend<br/>(apps/backend)"]]
      Worker[["Python workers<br/>(FastAPI · apps/workers)"]]
      Parser["parser router<br/>(pdf · image · url)"]
      LLM["LLM client<br/>(OpenRouter + instructor)"]
      Drafts[("public.import_drafts<br/>service-role writes")]
      Realtime{{"Supabase Realtime<br/>postgres_changes"}}

      Flutter -- "POST /trips/:id/imports" --> Nest
      Nest -- "insert imports row" --> Realtime
      Realtime -- "status='queued'" --> Worker
      Worker --> Parser --> LLM --> Drafts
      Drafts -- "row update" --> Realtime
      Realtime -- "subscription" --> Flutter
      Flutter -- "commit → POST activities" --> Nest

Why workers in Python: mature pypdf / pdf2image / trafilatura /
instructor ecosystem. apps/workers already exists.
Why Realtime over polling: mobile already uses Supabase Realtime for
the M7 sync chip, so imports reuse the same channel infra. The worker
subscribes via postgres_changes filtered on status = 'queued'.
5. LLM strategy
- Gateway: OpenRouter. One key, OpenAI-compatible API, fallback chain.
- SDK layer: openai Python client + instructor for Pydantic-typed structured outputs across any backing model.
- Abstraction: LLMClient protocol in treeper_workers/ai/llm/client.py; OpenRouterClient is the default, and future direct providers can drop in.
- Model routing (env-overridable):
  - Vision (image / scanned PDF): anthropic/claude-sonnet-4.5
  - Text structuring: anthropic/claude-haiku-4.5
  - Fallbacks: google/gemini-2.5-pro, google/gemini-2.5-flash
6. Parsing pipelines
| Source | Step 1 | Step 2 | Step 3 |
|---|---|---|---|
| PDF (text) | pypdf extract | LLM structuring (Haiku) | merge |
| PDF (scanned) | pdf2image → PNG | Vision LLM per page (batched 5) | merge + page confidences |
| Image | (skip OCR) | Vision LLM | — |
| Blog URL | trafilatura extract; playwright only if <500 chars | LLM structuring | cache by URL hash 30d |
PDF text-vs-vision split: if avg_chars_per_page < 100, route to vision.
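The text-vs-vision split and the 5-page vision batching can be sketched as below. The threshold and batch size come from this section; the function names, the whitespace stripping, and the treatment of an empty PDF as scanned are assumptions for illustration.

```python
from typing import Iterator, Sequence

VISION_THRESHOLD_CHARS = 100  # spec: avg_chars_per_page < 100 routes to vision
VISION_BATCH_SIZE = 5         # spec: vision pages are batched 5 at a time


def choose_pdf_route(page_texts: Sequence[str]) -> str:
    """Return "text" or "vision" from the average extracted characters per page."""
    if not page_texts:
        return "vision"  # assumption: nothing extractable is treated as scanned
    # Assumption: surrounding whitespace is not counted as extracted text.
    avg_chars_per_page = sum(len(t.strip()) for t in page_texts) / len(page_texts)
    return "vision" if avg_chars_per_page < VISION_THRESHOLD_CHARS else "text"


def batch_pages(
    page_numbers: Sequence[int], size: int = VISION_BATCH_SIZE
) -> Iterator[list[int]]:
    """Yield consecutive groups of at most `size` page numbers."""
    for start in range(0, len(page_numbers), size):
        yield list(page_numbers[start:start + size])
```

Averaging over all pages means a PDF with a few text-heavy pages and many scanned ones can still land on the text route; a per-page decision would be a different design.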
7. Data model
See infra/supabase/migrations/0007_imports.sql.
- imports: one per upload; status machine queued → parsing → ready → committed|discarded, or failed.
- import_drafts: one per successful parse; JSONB payload + page confidences.
- activities.source_import_id / source_snippet: provenance on committed rows.
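The imports status machine (queued → parsing → ready → committed|discarded, or failed) can be expressed as a transition table. This is a sketch, not the migration's actual constraint: which states may move to failed, and that committed, discarded, and failed are terminal, are assumptions. A retry path that re-queues a failed import is not modelled here.

```python
# Allowed next states per current state. Assumption: failure can happen
# while queued or parsing, and the last three states are terminal.
ALLOWED_TRANSITIONS: dict[str, frozenset[str]] = {
    "queued": frozenset({"parsing", "failed"}),
    "parsing": frozenset({"ready", "failed"}),
    "ready": frozenset({"committed", "discarded"}),
    "committed": frozenset(),
    "discarded": frozenset(),
    "failed": frozenset(),
}


def can_transition(current: str, target: str) -> bool:
    """True if moving from `current` to `target` is allowed by the table."""
    return target in ALLOWED_TRANSITIONS.get(current, frozenset())


def transition(current: str, target: str) -> str:
    """Return the new status, or raise if the move is not allowed."""
    if not can_transition(current, target):
        raise ValueError(f"illegal status transition: {current} -> {target}")
    return target
```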
Storage reuses trip-attachments bucket with key prefix
<trip_id>/imports/<import_id>/<filename>. Existing trip-scoped storage
RLS applies unchanged.
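A small sketch of building the storage key described above. The key layout comes from this section; the helper name and the filename sanitisation (keeping only the final path component) are assumptions.

```python
from pathlib import PurePosixPath


def import_storage_key(trip_id: str, import_id: str, filename: str) -> str:
    """Build <trip_id>/imports/<import_id>/<filename> for the trip-attachments bucket."""
    # Assumption: drop any directory parts so a client-supplied name
    # cannot escape the import's own prefix.
    safe_name = PurePosixPath(filename.replace("\\", "/")).name
    if not safe_name or safe_name in {".", ".."}:
        raise ValueError(f"unusable filename: {filename!r}")
    return f"{trip_id}/imports/{import_id}/{safe_name}"
```

Keeping the trip id as the first path segment is what lets the existing trip-scoped storage RLS apply without changes.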
8. API (NestJS — to be implemented in next slice)
    POST   /trips/:id/imports    body: { source_type, source_uri, filename?, bytes? }  → 202 { id }
    GET    /imports/:id          → { status, draft?, error? }
    POST   /imports/:id/commit   body: { activity_indexes: int[] }  → { committed: Activity[] }
    DELETE /imports/:id          → { ok: true }  (sets status=discarded)

Mobile gets a signed PUT URL from the existing attachments service to
upload bytes, then calls POST /trips/:id/imports with the storage path.
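The selective commit in POST /imports/:id/commit amounts to picking draft activities by index. The sketch below is in Python for consistency with the worker code, although the real handler lives in NestJS. It assumes that activity_indexes are zero-based positions in the draft's activities list, that duplicates are collapsed, and that any out-of-range index rejects the whole request; none of these are confirmed by the spec.

```python
def select_for_commit(
    draft_activities: list[dict], activity_indexes: list[int]
) -> list[dict]:
    """Pick the draft activities named by activity_indexes, in draft order."""
    chosen: set[int] = set()
    for i in activity_indexes:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(i, bool) or not isinstance(i, int):
            raise ValueError(f"invalid activity index: {i!r}")
        if not 0 <= i < len(draft_activities):
            raise ValueError(f"activity index out of range: {i}")
        chosen.add(i)
    return [draft_activities[i] for i in sorted(chosen)]
```

Validating every index before returning anything keeps the commit all-or-nothing, which is simpler for the review screen to reason about than a partial commit.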
9. Worker contract
Worker exposes (already mounted, currently stubs):

    POST /ai/imports/start   body: { import_id }  → 202 { accepted: true }

In normal flow the worker discovers jobs via Realtime; the explicit endpoint exists for retries / dev triggers. All worker→DB writes use the service-role key.
10. Pydantic schema (worker)
    from datetime import date, time
    from typing import Literal

    from pydantic import BaseModel


    class ImportedActivity(BaseModel):
        day_index: int | None
        date: date | None
        time: time | None
        title: str
        location: str | None
        notes: str | None
        source_snippet: str


    class PageConfidence(BaseModel):
        page: int
        confidence: Literal['high', 'medium', 'low']


    class ItineraryDraft(BaseModel):
        destinations: list[str]
        activities: list[ImportedActivity]
        page_confidences: list[PageConfidence] = []
        overall_confidence: Literal['high', 'medium', 'low']

Undated activities anchor to day_index=0, which maps to the trip start_date;
the user edits them in the review screen.
11. Guardrails
- File size cap: 10MB (signed-URL constraint + worker re-check).
- Page cap: 20 pages per PDF.
- URL: http(s) only; SSRF guard rejects private IPs / metadata endpoints.
- Rate limit: 20 imports/user/day (DB count check in NestJS).
- Cost ceiling: $0.50/import. The worker tracks running tokens × model price
  and aborts with failed if exceeded.
- LLM keys env-only on worker. Backend never sees them.
- Auto-delete imports storage after 90 days (cron, M10).
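Two of the guardrails above, the SSRF guard and the cost ceiling, can be sketched with the standard library. All names here are hypothetical, the price in the usage example is a placeholder rather than real pricing, and a production guard would also need to pin the resolved address for the actual fetch so that a second DNS lookup cannot return a different host.

```python
import ipaddress
import socket
from typing import Callable, Iterable
from urllib.parse import urlsplit


def _resolve(host: str) -> list[str]:
    """Resolve a hostname to IP address strings via the system resolver."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_url_allowed(
    url: str, resolve: Callable[[str], Iterable[str]] = _resolve
) -> bool:
    """Allow only http(s) URLs whose host maps exclusively to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname
    try:
        ipaddress.ip_address(host)  # literal IP: no DNS lookup needed
        candidates = [host]
    except ValueError:
        try:
            candidates = list(resolve(host))
        except OSError:
            return False
    # is_global is False for private, loopback, and link-local ranges,
    # which covers the 169.254.169.254 metadata address.
    return bool(candidates) and all(
        ipaddress.ip_address(ip).is_global for ip in candidates
    )


class CostCeilingExceeded(RuntimeError):
    """Raised when an import's running cost passes the ceiling."""


class CostTracker:
    """Running tokens × model price, checked against a per-import ceiling."""

    def __init__(self, price_per_mtok_usd: dict[str, float], ceiling_usd: float = 0.50):
        self.price_per_mtok_usd = price_per_mtok_usd
        self.ceiling_usd = ceiling_usd
        self.spent_usd = 0.0

    def add(self, model: str, tokens: int) -> float:
        """Record usage; raise once the running total exceeds the ceiling."""
        self.spent_usd += tokens / 1_000_000 * self.price_per_mtok_usd[model]
        if self.spent_usd > self.ceiling_usd:
            raise CostCeilingExceeded(f"${self.spent_usd:.2f} > ${self.ceiling_usd:.2f}")
        return self.spent_usd
```

In use, the worker would call `is_url_allowed(url)` before fetching a blog, and `tracker.add(model, tokens)` after each LLM call, marking the import as failed when CostCeilingExceeded is raised.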
12. Implementation order
- Migration 0007_imports.sql
- Spec
- Worker: LLM client (OpenRouter + instructor) + ItineraryDraft schema
- Worker: blog parser end-to-end (no upload, easy to validate)
- Worker: PDF text + vision pipeline
- Worker: image pipeline
- Worker: Realtime subscriber main loop
- Backend: imports module (controller, service, DTOs)
- Mobile: import sheet on trip detail
- Mobile: review screen + commit
- Guardrails (rate limit, SSRF, cost ceiling)
- E2E tests (one fixture per source)
Each step is independently shippable behind an imports_enabled feature
flag on the user row (added later if needed).
13. Open / deferred
- Pricing data for cost ceiling: hard-code per-model $/Mtok in worker config; revisit when OpenRouter publishes a usage endpoint.
- Streaming partial results from worker for very long PDFs (M10).
- Hand-off to “Reels” path: now its own spec; see 0008 Reel / Video Imports. Uses source_type='video' (enum widened in migration 0014) and a Gemini 2.5 multimodal parser.