Feature: Saved Content Library

ID: 0014
Status: Draft
Owner: @satya
Created: 2026-06-17
Updated: 2026-06-17
Related ADRs: 0003 (Python workers), 0004 (monorepo)
Depends on: 0001 (trips/activities), 0003 (imports pipeline), 0008 (reels)

Instagram saved posts are a mess: restaurant recommendations sit next to multi-day trip guides, aesthetic Reels, and travel blog links, all mixed together. Today there is no way to bring this collection into treeper without creating noise in an actual trip, which is exactly what the previous approach (bulk-importing all posts as activities) did.

The fix: a standalone Saved Content Library where Instagram exports land, get auto-classified by kind, and sit until the user decides to pull them into a trip. Each kind gets the right “add to trip” action — a place becomes an activity, an itinerary triggers the existing import pipeline, a Reel becomes a planning item, an article becomes a reference link.


From PRODUCT.md §2:

  • Solo planner (P1) — primary. Saves 50+ posts while researching a trip; needs to find restaurant recs quickly without wading through unrelated guides.
  • Inspiration hoarder (P3) — primary. Collects anything travel-related; wants to surface “itineraries I can use” vs. “places I want to eat at” vs. “videos to rewatch for vibes”.

Out of scope for this spec:

  • Trip-mate (P4) — no sharing of saved content across collaborators.
  • Curated planner (P5) — no editorial curation or recommendations from saved library.

Functional requirements:

  • F0014.1.a Accept an Instagram data export ZIP file uploaded by the user.
  • F0014.1.b Sign a Supabase storage URL for the client to PUT the ZIP.
  • F0014.1.c Trigger the worker to classify each saved post into one of the 4 kinds.
  • F0014.1.d Return a count of classified items; surface errors (corrupt ZIP, no saved posts) as user-readable messages.
  • F0014.2.a Tier 1: keyword heuristic classifies each post in <1ms at zero cost. Priority order: reel (has video) → itinerary (day-plan keywords) → place (POI keywords) → link_article (caption URL) → ambiguous.
  • F0014.2.b Tier 2: LLM call for ambiguous posts only (uses existing LLMClient + instructor). Returns place | itinerary | reel | link_article.
  • F0014.2.c ai_confidence and ai_model are stored per item for telemetry.
  • F0014.2.d User can override the kind at any time (PATCH /v1/saved-items/:id).
  • F0014.3.a Dedicated “Saved” tab (web + mobile), separate from Trips.
  • F0014.3.b Filter bar: All / Places / Itineraries / Reels / Links.
  • F0014.3.c Card per item showing: kind chip, title, thumbnail (if photo/video), source date, and quick-action button (“Add to trip”).
  • F0014.3.d Archived items hidden from default view; accessible via toggle.
  • F0014.4.a place → creates an activity (kind=sight) in the selected trip/day.
  • F0014.4.b itinerary → triggers the existing import pipeline on the post URL (or caption text if no URL), opening the draft review screen.
  • F0014.4.c reel → creates a planning_item (kind=reel) in the trip.
  • F0014.4.d link_article → creates a planning_item (kind=link) in the trip.
  • F0014.4.e After “add to trip”, the item is not deleted from the library (it may be added to multiple trips).
  • F0014.5.a Archive an item (hidden from default view, not deleted).
  • F0014.5.b Delete an item permanently.
  • F0014.5.c Edit title, tags, and notes on any item.

Non-goals:

  • Cross-platform imports (TikTok, Google Maps Saved, Pinterest) — separate spec.
  • Deduplication across multiple Instagram imports (v2).
  • Sharing saved items with trip collaborators (v2).
  • Ranking or quality scoring of saved items.
  • Automatic syncing (requires Instagram API approval; deferred indefinitely).
  • Converting a saved place into a new trip destination (beyond creating one activity).

  • As P1, when I upload my Instagram ZIP, I see my saved posts auto-sorted into Places / Itineraries / Reels / Links, so I can immediately filter to just restaurant recs for my trip destination.

  • As P1, when I tap “Add to trip” on a saved place, I can pick a day and it lands as an activity on that day’s plan, so I don’t have to manually recreate it.

  • As P3, when I tap “Add to trip” on a saved itinerary, the existing import pipeline opens, extracts the activities from the post, and I review them before committing, so the structured plan ends up in my trip.

  • As P1, when the AI mis-classifies a Reel as a Place, I can tap the kind chip and change it to Reel in one tap, so my library stays clean.


[Saved] [Import from Instagram] (+)
[All] [Places ●] [Itineraries] [Reels] [Links]
┌─────────────────────────────────────────────┐
│ 🏠 Place • 15 Jul 2026 │
│ Best café in Ubud — the one with rice │
│ field view... │
│ [thumbnail] [Add to trip ▸]│
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│ 🗺️ Itinerary • 16 Jul 2026 │
│ 5-day Bali itinerary — day 1: arrive... │
│ [Import ▸] │
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│ 🎬 Reel • 17 Jul 2026 │
│ [video thumbnail] │
│ Watch this sunset reel [Add ▸]│
└─────────────────────────────────────────────┘

Tapping the kind chip on any card opens a bottom sheet:

Change kind:
○ Place
○ Itinerary
● Reel ← current
○ Link / Article

Tapping “Add to trip” on any card opens a second bottom sheet:

Add to which trip?
○ Japan 2026 (current)
○ Bali 2026
○ New trip...
Add to day:
○ Day 1 (15 Jul)
● Day 3 (17 Jul) ← suggested from source_date
○ Unscheduled
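
The "suggested from source_date" hint in the sheet above could be computed with a simple date match. The sketch below is one possible reading, not the specified behaviour: it assumes a hypothetical `TripDay` shape and suggests a day only when its date equals the item's `source_date`, falling back to Unscheduled (`None`) otherwise.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class TripDay:
    """Hypothetical shape for a trip day; the real model may differ."""
    label: str
    on: date


def suggest_day(days: Sequence[TripDay], source_date: Optional[date]) -> Optional[TripDay]:
    """Return the trip day whose date equals source_date, else None (Unscheduled)."""
    if source_date is None:
        return None
    for day in days:
        if day.on == source_date:
            return day
    return None


days = [
    TripDay("Day 1", date(2026, 7, 15)),
    TripDay("Day 2", date(2026, 7, 16)),
    TripDay("Day 3", date(2026, 7, 17)),
]
suggested = suggest_day(days, date(2026, 7, 17))  # matches Day 3 in the mockup
```

An exact match keeps the suggestion predictable; a nearest-day rule would be an alternative if posts are rarely saved on trip dates.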

AC-1 F0014.1.a Given a valid Instagram ZIP with saved_posts.json, POST /v1/saved-items/import-instagram returns { items_created: N }.
AC-2 F0014.1.d Given a corrupt ZIP, the endpoint returns 422 with error "Invalid Instagram export format".
AC-3 F0014.2.a A post with video_list present is classified as "reel" without an LLM call.
AC-4 F0014.2.a A post with caption "5-day Bali itinerary day 1:" is classified as "itinerary" without an LLM call.
AC-5 F0014.2.a A post with caption "Best restaurant in Ubud!" is classified as "place" without an LLM call.
AC-6 F0014.2.b A post with empty caption and no video triggers an LLM classification call.
AC-7 F0014.2.d PATCH /v1/saved-items/:id with { kind: "reel" } updates the item and returns 200.
AC-8 F0014.3.b GET /v1/saved-items?kind=place returns only items with kind="place".
AC-9 F0014.4.a POST /v1/saved-items/:id/add-to-trip with a "place" item creates an activity in the target trip with kind="sight".
AC-10 F0014.4.b POST /v1/saved-items/:id/add-to-trip with an "itinerary" item creates an import with status="queued" in the target trip.
AC-11 F0014.4.c POST /v1/saved-items/:id/add-to-trip with a "reel" item creates a planning_item with kind="reel" in the target trip.
AC-12 F0014.4.d POST /v1/saved-items/:id/add-to-trip with a "link_article" item creates a planning_item with kind="link" in the target trip.
AC-13 F0014.5.a PATCH /v1/saved-items/:id with { archived: true } hides the item from GET /v1/saved-items (default).
AC-14 F0014.5.b DELETE /v1/saved-items/:id removes the item; subsequent GET returns 404.
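
AC-3 through AC-6 pin down the Tier 1 priority order from F0014.2.a. A minimal sketch of that order follows. The keyword lists and the `caption` field name are illustrative assumptions (the spec does not give the real vocabulary); only `video_list` and the five return values come from the requirements above.

```python
import re
from typing import Any, Mapping

# Illustrative keyword lists only; the real heuristic's vocabulary is not given in this spec.
ITINERARY_RE = re.compile(r"\b(itinerary|day\s*\d+|\d+-day)\b", re.IGNORECASE)
PLACE_RE = re.compile(r"\b(restaurant|caf[eé]|bar|hotel|beach|temple|museum)\b", re.IGNORECASE)
URL_RE = re.compile(r"https?://\S+")


def keyword_classify(post: Mapping[str, Any]) -> str:
    """Tier 1: first matching rule wins, in the F0014.2.a priority order."""
    caption = post.get("caption") or ""
    if post.get("video_list"):        # has video -> reel
        return "reel"
    if ITINERARY_RE.search(caption):  # day-plan keywords -> itinerary
        return "itinerary"
    if PLACE_RE.search(caption):      # POI keywords -> place
        return "place"
    if URL_RE.search(caption):        # caption URL -> link_article
        return "link_article"
    return "ambiguous"                # handed to the Tier 2 LLM call
```

Only posts that come back as `ambiguous` would reach the LLM, which is what keeps the per-import cost bounded.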

-- New enum
saved_item_kind: 'place' | 'itinerary' | 'reel' | 'link_article'
-- New table: saved_items
id uuid PK
user_id uuid → auth.users
kind saved_item_kind NOT NULL
title text (1200)
notes text (14000) -- full caption
url text (12048) -- primary link extracted from caption
image_urls text[] default '{}'
video_url text (12048)
ai_confidence numeric(3,2) -- 0.0–1.0; null if keyword-classified
ai_model text -- which LLM was used; null if keyword
source text default 'instagram_export'
source_date date -- taken_at from the post
tags text[] default '{}'
archived boolean default false
created_at timestamptz
updated_at timestamptz
-- Constraint: at least one content field present
CHECK (title IS NOT NULL OR notes IS NOT NULL OR url IS NOT NULL)
-- Indexes
(user_id, kind, created_at DESC)
(user_id, created_at DESC)
(user_id, source_date DESC)
-- RLS: owner-scoped (like liked_itineraries)

No foreign keys to other entity tables — items are self-contained captures. Cross-linking saved places to trip_destinations or itineraries is deferred to v2.
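
For the worker side, the table above could be mirrored by a small Python model. This is a sketch under assumptions: the name `SavedItemCreate` appears in the worker response elsewhere in this spec, but its fields are not listed there, so the field set here is simply copied from the table, and the validation mirrors the CHECK constraint and the 0.0–1.0 confidence range.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

KINDS = {"place", "itinerary", "reel", "link_article"}


@dataclass
class SavedItemCreate:
    """Assumed insert payload; fields copied from the saved_items table."""
    kind: str
    title: Optional[str] = None
    notes: Optional[str] = None
    url: Optional[str] = None
    image_urls: List[str] = field(default_factory=list)
    video_url: Optional[str] = None
    ai_confidence: Optional[float] = None  # None when keyword-classified
    ai_model: Optional[str] = None         # None when keyword-classified
    source: str = "instagram_export"
    source_date: Optional[date] = None
    tags: List[str] = field(default_factory=list)
    archived: bool = False

    def __post_init__(self) -> None:
        if self.kind not in KINDS:
            raise ValueError(f"unknown kind: {self.kind!r}")
        # Mirrors the table CHECK: at least one content field present.
        if self.title is None and self.notes is None and self.url is None:
            raise ValueError("at least one of title, notes, url is required")
        if self.ai_confidence is not None and not 0.0 <= self.ai_confidence <= 1.0:
            raise ValueError("ai_confidence must be within 0.0-1.0")
```

Validating in the worker means a bad row fails before the bulk insert rather than at the database constraint.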


POST /v1/saved-items/sign-upload
→ { url: string, path: string } // client PUTs ZIP directly to Supabase
POST /v1/saved-items/import-instagram
body: { storage_path, filename, bytes? }
→ { items_created: number, kinds: { place, itinerary, reel, link_article } }
GET /v1/saved-items
query: kind?, archived?, limit?, cursor?
→ { items: SavedItem[], next_cursor? }
PATCH /v1/saved-items/:id
body: { kind?, title?, notes?, tags?, archived? }
→ SavedItem
DELETE /v1/saved-items/:id → 204
POST /v1/saved-items/:id/add-to-trip
body: { trip_id, day_id? }
→ { result_type: 'activity'|'planning_item'|'import', result_id: string }
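
The `result_type` returned by add-to-trip follows directly from the item's kind (F0014.4.a–d). One way to keep that mapping in a single place is a lookup table; the function and constant names below are hypothetical.

```python
from typing import Dict, Optional, Tuple

# kind -> (result_type, kind of the created row; None for imports)
ADD_TO_TRIP_ACTIONS: Dict[str, Tuple[str, Optional[str]]] = {
    "place": ("activity", "sight"),
    "itinerary": ("import", None),
    "reel": ("planning_item", "reel"),
    "link_article": ("planning_item", "link"),
}


def plan_add_to_trip(kind: str) -> Tuple[str, Optional[str]]:
    """Return (result_type, created_kind) for a saved item kind."""
    try:
        return ADD_TO_TRIP_ACTIONS[kind]
    except KeyError:
        raise ValueError(f"unsupported saved item kind: {kind!r}") from None
```

Because the saved item is never deleted after this action, the handler only needs to create the target row and return its id.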

Worker (internal, X-Workers-Token auth):

POST /ai/saved-items/classify-instagram
body: { zip_storage_path: string, user_id: string }
→ { items: SavedItemCreate[], model_used: string, tokens_in, tokens_out, cost_usd }
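
Before classification, the worker has to open the ZIP and reject anything that is not a usable export (AC-2). The sketch below shows one way to do that with the standard library. It assumes the ZIP bytes have already been downloaded from storage, locates `saved_posts.json` by filename suffix because the path inside the export is not specified here, and uses a hypothetical `InvalidExportError` that the API layer would translate into the 422.

```python
import io
import json
import zipfile
from typing import Any


class InvalidExportError(ValueError):
    """Hypothetical error type; the API layer would map it to a 422."""


def load_saved_posts(zip_bytes: bytes) -> Any:
    """Return the parsed saved_posts.json payload or raise InvalidExportError."""
    try:
        with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
            # Assumption: match by suffix, since the folder layout is not given.
            names = [n for n in archive.namelist() if n.endswith("saved_posts.json")]
            if not names:
                raise InvalidExportError("Invalid Instagram export format")
            return json.loads(archive.read(names[0]))
    except (zipfile.BadZipFile, json.JSONDecodeError, UnicodeDecodeError) as exc:
        raise InvalidExportError("Invalid Instagram export format") from exc
```

Keeping this in one function fits the isolation goal for the parser: a format change breaks only this step.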

8a. Reel → multi-place extraction (extension)

A pasted reel/video URL (Instagram / TikTok / YouTube) is not classified as a single reel. Instead it runs the same rich extraction as trip imports — yt-dlp download → Gemini watches the video → ItineraryDraft with many activities → the region-anchored geocoder — and fans out one saved_item per place (café, beach, sight…), each with its own location_lat/lng/address. The source reel is also kept as one reel item. Every produced row carries source_url = the reel permalink (items are flat in the library, linked only by that URL). Reuses the global reel_extractions cache, so a reel already extracted for a trip is free here.

Because a Gemini video call is slow, this path is async via a job row the client polls. Plain (non-video) URLs keep the synchronous import-url OG-scrape path.

New columns on saved_items: location_lat, location_lng, location_address, source_url. New table saved_item_jobs(id, user_id, source_url, status [queued|extracting|ready|failed], items_created, error, …) (RLS owner-scoped, Realtime-published) — migration 0032.

POST /v1/saved-items/import-reel
body: { url } // 400 if not an IG/TikTok/YouTube URL
→ { job_id: string, status: 'queued' }
GET /v1/saved-items/jobs/:id
→ { id, status, items_created, error } // client polls until ready|failed
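
A client-side polling loop for the job endpoint might look like the sketch below. The HTTP call is injected as `fetch_job` so the loop stays transport-agnostic; the interval and timeout values are assumptions, not targets from this spec.

```python
import time
from typing import Any, Callable, Dict

TERMINAL = {"ready", "failed"}


def wait_for_job(
    fetch_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_s: float = 2.0,     # assumed polling interval
    timeout_s: float = 120.0,    # assumed upper bound for a video extraction
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll GET /v1/saved-items/jobs/:id (via fetch_job) until ready or failed."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout_s}s")
        sleep(interval_s)
```

The caller inspects `status` on the returned job: `ready` carries `items_created`, `failed` carries `error`.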

Worker (internal):

POST /ai/saved-items/import-reel
body: { url, user_id, job_id }
→ 202 Accepted; runs extraction in the background, writes saved_items +
updates saved_item_jobs (service-role).

Kind mapping for fanned-out activities: skip transport and non-stops; run the keyword sub-classifier on the title for a fine-grained place kind, else map food→restaurant, lodging→accommodation, sight/freeform→attraction.


| Aspect | Target |
|---|---|
| Import latency | ≤ 45s for 100 posts (synchronous; keyword path fast, LLM batched) |
| LLM cost | ≤ $0.05 per import of 100 posts (keyword handles ≥80%; LLM only for ambiguous) |
| Classification accuracy | ≥ 80% correct kind without user override (measured post-ship on sample) |
| Offline access | Saved library loads from local cache when offline; add-to-trip requires connection |
| Privacy | ZIP is stored in user-private path, deleted after classification |
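
The LLM cost target implies a per-post budget that is easy to check. As a rough worked example (the function name is hypothetical, and it assumes cost scales linearly with the number of ambiguous posts): with 100 posts and the keyword tier handling 80%, about 20 posts reach the LLM, so each may cost roughly $0.0025 on average.

```python
def max_cost_per_ambiguous_post(budget_usd: float, posts: int, keyword_share: float) -> float:
    """Average LLM spend allowed per post that falls through to Tier 2."""
    llm_posts = posts * (1.0 - keyword_share)
    if llm_posts <= 0:
        raise ValueError("no posts reach the LLM tier")
    return budget_usd / llm_posts


# Targets from the table: $0.05 per 100 posts, keyword tier handles at least 80%.
per_post = max_cost_per_ambiguous_post(budget_usd=0.05, posts=100, keyword_share=0.80)
```

If the keyword share drops to 50%, the same budget leaves only about $0.001 per ambiguous post, which is the cost pressure the caption-only risk describes.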

  • R1 — Caption-only posts (no URL, no POI name) may all fall to LLM, raising cost. Mitigation: Default ambiguous posts to place (most common save type) rather than calling LLM; accuracy will be lower but cost is bounded.

  • R2 — Instagram export format changes without notice (Meta has done this). Mitigation: ig_parser.py is isolated; format changes only break the parser, not the rest of the pipeline. Monitor for errors in saved_item_import_jobs.

  • Q1 — Should a link_article with no URL just store the caption? Or is an empty URL a validation error? Resolved: store caption as notes; URL is optional.

  • Q2 — What happens if the user imports twice (duplicate posts)? Deferred to v2 via source_date + title dedup. For now, duplicate items are allowed.


  1. Worker + classification ships first (no UI dependency).
  2. Backend API ships with worker (can be verified with curl/Postman).
  3. Web library screen — second slice (shows library, filter, add-to-trip).
  4. Mobile library screen — third slice (after web is stable).
  5. No feature flag needed — new endpoints/screen; no impact on existing flows.

12. Extension — reel → multi-place extraction

The base flow assigns ONE kind per saved post from its caption/OG text. This extension runs the same rich extraction the trip importer uses (yt-dlp downloads the video → Gemini watches it → ItineraryDraft with many activities → region-anchored geocode) and fans out one saved_item per location, plus keeps the source reel itself. Items are flat in the library; each carries a source_url reference back to the reel.

Data model (migration 0032_saved_items_locations_jobs.sql):

  • saved_items gains location_lat, location_lng, location_address, source_url.
  • saved_item_jobs (id, user_id, source_url, status, items_created, error, …) tracks the async run: queued → extracting → ready | failed. Realtime-published so clients can watch it; clients may also poll.

Flow:

POST /v1/saved-items/import-reel { url } ← only IG/TikTok/YouTube hosts
→ insert saved_item_jobs (queued)
→ worker POST /ai/saved-items/import-reel { url, user_id, job_id } (202, bg task)
extract_reel_draft(url) ← shared cache-aware helper (ai/reel_extract.py)
cache HIT → reuse reel_extractions payload (no fetch/LLM)
cache MISS → yt-dlp fetch → Gemini → persist extraction + asset
DEDUP: if user already has reel_extract rows for this source_url → stop (no dupes)
geocode (region-anchored) + media (per-place photos), best-effort
fan out: 1 reel item + 1 item per qualifying activity → bulk insert
job → ready (items_created = N)
GET /v1/saved-items/jobs/:id ← client polls until ready/failed
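
The first step of the flow accepts only Instagram, TikTok, and YouTube hosts. A sketch of that gate is shown below. The exact hostname allowlist is an assumption (the spec names the platforms, not the domains), and the function name is hypothetical; a `False` result would map to the 400 response.

```python
from urllib.parse import urlsplit

# Assumed allowlist; the spec names the platforms, not the exact hostnames.
ALLOWED_HOSTS = {"instagram.com", "tiktok.com", "youtube.com", "youtu.be"}


def is_supported_reel_url(url: str) -> bool:
    """True when the URL is http(s) and its host is an allowed domain or subdomain."""
    try:
        parts = urlsplit(url)
        host = parts.hostname  # already lowercased by urlsplit
    except ValueError:  # e.g. malformed bracketed host
        return False
    if parts.scheme not in ("http", "https") or not host:
        return False
    return any(host == allowed or host.endswith("." + allowed) for allowed in ALLOWED_HOSTS)
```

Matching on the parsed host rather than a substring avoids accepting URLs that merely contain a platform name in the path.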

Kind mapping (_activity_to_saved_kind): skip transport and is_stop=false; keyword_classify the title for a fine-grained place sub-type; else fall back food→restaurant, lodging→accommodation, sight/freeform→attraction.
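
The kind mapping above can be sketched as a small pure function. This is an illustration, not the actual `_activity_to_saved_kind`: the activity field names (`kind`, `is_stop`, `title`) are inferred from the description, and the keyword sub-classifier is injected as a callable because its vocabulary is not given here.

```python
from typing import Any, Callable, Mapping, Optional

FALLBACK_KIND = {
    "food": "restaurant",
    "lodging": "accommodation",
    "sight": "attraction",
    "freeform": "attraction",
}


def activity_to_saved_kind(
    activity: Mapping[str, Any],
    keyword_classify: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the saved-item kind for an extracted activity, or None to skip it."""
    # Skip transport legs and anything flagged as not a stop.
    if activity.get("kind") == "transport" or activity.get("is_stop") is False:
        return None
    # Prefer the fine-grained sub-type from the title when the keywords hit.
    sub_kind = keyword_classify(activity.get("title") or "")
    if sub_kind:
        return sub_kind
    # Otherwise fall back on the activity kind; unknown kinds are skipped.
    return FALLBACK_KIND.get(activity.get("kind"))
```

Returning `None` for skipped activities lets the fan-out step filter with a single comprehension before the bulk insert.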

Caching & dedup:

  • Extraction is cached globally in reel_extractions (keyed by canonical URL hash, shared with trip imports) — re-paste never re-runs Gemini.
  • Fan-out is deduped per (user_id, source='reel_extract', source_url) — a re-paste finishes ready with the existing count and inserts nothing.
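
The two layers above answer different questions: the global cache decides whether Gemini runs at all, and the per-user key decides whether rows are inserted. The sketch below models both with in-memory sets. SHA-256 as the hash and the helper names are assumptions, and `source_url` is taken to be already canonicalized, since the canonicalization rules are not described here.

```python
import hashlib
from typing import NamedTuple, Set, Tuple


class ImportPlan(NamedTuple):
    run_extraction: bool  # False when reel_extractions already has the payload
    insert_items: bool    # False when this user already has rows for the reel


def extraction_cache_key(canonical_url: str) -> str:
    """Assumed key: SHA-256 hex digest of the canonical URL."""
    return hashlib.sha256(canonical_url.encode("utf-8")).hexdigest()


def plan_import(
    user_id: str,
    source_url: str,
    cached_keys: Set[str],
    fanned_out: Set[Tuple[str, str, str]],
) -> ImportPlan:
    """Decide which steps a paste needs, given the global cache and per-user rows."""
    run_extraction = extraction_cache_key(source_url) not in cached_keys
    insert_items = (user_id, "reel_extract", source_url) not in fanned_out
    return ImportPlan(run_extraction, insert_items)
```

A second user pasting the same reel gets `ImportPlan(False, True)`: no Gemini call, but their own rows. The same user pasting again gets `ImportPlan(False, False)` and the job finishes ready with the existing count.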

Provider note: video uses the Gemini Files API directly (GeminiClient), NOT LiteLLM/OpenRouter — LiteLLM’s chat-completions interface can’t do the upload-and-poll Files API flow for a ≥20MB reel. Requires GEMINI_API_KEY (AI Studio) or the Vertex backend; MODEL_VIDEO=gemini-2.5-flash keeps cost low. Instagram fetch reliability from a datacenter IP needs REEL_COOKIES_FILE.

  • Q2 (revisited) — duplicate imports of the same reel are now deduped by source_url. Cross-source duplicates (same place from two different reels) are still allowed by design.