Instagram Saved Posts Export Feature
Implementation Complete ✅
The Instagram saved posts export feature has been implemented to allow users to extract their saved Instagram content and import it into treeper trips.
What Was Implemented
Backend (TypeScript)
- apps/backend/src/imports/dto/create-import.dto.ts
  - Added 'instagram' to the ImportSourceType enum
  - Updated validators to accept Instagram exports
Workers (Python)
- apps/workers/src/treeper_workers/ai/ig_parser.py (NEW)
  - Parses Instagram export ZIPs containing saved_posts.json
  - Converts saved posts into activities with:
    - Title from caption (truncated to 200 chars)
    - Notes with full caption
    - Photos from static posts or carousels
    - Video URLs from reels
    - Dates extracted from post timestamps
  - Handles edge cases: missing captions, corrupt files, empty exports
- apps/workers/src/treeper_workers/ai/pipeline.py
  - Integrated Instagram parser into source dispatcher
  - Maps source_type='instagram' to ZIP parsing
- apps/workers/tests/test_ig_parser.py (NEW)
  - 16 unit tests covering:
    - Empty and single posts
    - Photos, carousels, and reels
    - Date parsing (multiple ISO formats)
    - Caption truncation and snippet generation
    - Error handling (corrupt ZIPs, missing files)
    - Edge cases (no captions, non-dict entries)
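The ZIP-reading step described for ig_parser.py could look roughly like the sketch below. It is not the actual parser: the names load_saved_posts and InstagramExportError are hypothetical, and it assumes saved_posts.json may sit in any subfolder of the archive and holds a JSON list. The error messages mirror the ones listed under Edge Cases Handled.

```python
import io
import json
import zipfile


class InstagramExportError(ValueError):
    """Raised when the export cannot be read (hypothetical error type)."""


def load_saved_posts(zip_bytes: bytes) -> list[dict]:
    """Return the dict entries of saved_posts.json found in the ZIP."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(zip_bytes))
    except zipfile.BadZipFile as exc:
        raise InstagramExportError("Invalid ZIP file") from exc

    with archive:
        # Assumption: the file may live in a subfolder, so match on the basename.
        names = [
            name for name in archive.namelist()
            if name.rsplit("/", 1)[-1] == "saved_posts.json"
        ]
        if not names:
            raise InstagramExportError("saved_posts.json not found in export")
        try:
            data = json.loads(archive.read(names[0]).decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError) as exc:
            raise InstagramExportError(f"JSON parse failure: {exc}") from exc

    if not isinstance(data, list):
        raise InstagramExportError("saved_posts.json must contain a list")
    # Skip non-dict entries instead of failing the whole import.
    return [entry for entry in data if isinstance(entry, dict)]
```

An empty list in saved_posts.json yields an empty result here, which lines up with the "returns empty draft" behavior for empty exports.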
How It Works
User Flow
- User exports Instagram data: Settings → Security → Download Data
- Instagram emails the ZIP after 24-48 hours
- User uploads the ZIP via the treeper web UI:
  - POST /v1/trips/:tripId/imports/sign-upload → get signed URL
  - Upload ZIP to Supabase storage
- Backend creates the import: POST /v1/trips/:tripId/imports
  - source_type: 'instagram'
  - source_uri: 's3://bucket/trip-attachments/_instagram/:tripId/:importId.zip'
- Worker parses the ZIP → creates an ItineraryDraft
- User reviews the draft and selects activities: POST /v1/imports/:id/commit
- Activities are stored in the trip with dates, captions, and media URLs
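The review-and-commit step can be pictured as filtering the draft's activities by position. The sketch below is an illustration only: select_activities is a hypothetical helper, the draft shape ({"destinations": [], "activities": [...]}) and the index list follow the API Contract section, and the handling of duplicate or out-of-range indexes is an assumption rather than documented backend behavior.

```python
def select_activities(draft: dict, selected_indexes: list[int]) -> list[dict]:
    """Pick the activities a user ticked in the review step, in request order."""
    activities = draft.get("activities", [])
    chosen = []
    seen = set()
    for index in selected_indexes:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(index, bool) or not isinstance(index, int):
            raise TypeError(f"activity index must be an integer, got {index!r}")
        if not 0 <= index < len(activities):
            raise IndexError(f"activity index {index} is out of range")
        if index in seen:
            continue  # Assumption: a repeated index is committed only once.
        seen.add(index)
        chosen.append(activities[index])
    return chosen
```

Rejecting out-of-range indexes up front keeps a stale client (for example, one holding an older draft) from silently committing the wrong posts.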
Data Extraction
Instagram Export Format:

    [
      {
        "caption": "Beautiful sunset...",
        "taken_at": "2026-07-15T06:00:00",
        "photo": [{"uri": "https://instagram.com/photo.jpg"}],
        "carousel": [...],
        "video_list": [{"uri": "https://video.mp4"}]
      }
    ]

Parsed Activity:
- title: Caption (max 200 chars) or “Saved Post #N”
- kind: 'sight' (can be reclassified later)
- date: Extracted from taken_at timestamp
- notes: Full caption text
- image_urls: Photos from post
- video_url: Reel video link
- location: null (user can add later)
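The mapping from one export entry to one activity might be sketched as follows. This is a minimal illustration, not the code in ig_parser.py: post_to_activity, parse_taken_at, and _uris are hypothetical names, the carousel is assumed to use the same {"uri": ...} shape as photo, and the date is assumed to be stored as a YYYY-MM-DD string.

```python
from datetime import datetime

MAX_TITLE_CHARS = 200  # Title limit stated in the field list above.


def parse_taken_at(value):
    """Return the calendar date as YYYY-MM-DD, or None if it cannot be parsed."""
    if not isinstance(value, str) or not value:
        return None
    try:
        # fromisoformat rejects a trailing "Z" before Python 3.11, so normalise it.
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return None
    return parsed.date().isoformat()


def _uris(items):
    """Collect "uri" strings from a list of {"uri": ...} dicts."""
    if not isinstance(items, list):
        return []
    return [
        item["uri"] for item in items
        if isinstance(item, dict) and isinstance(item.get("uri"), str)
    ]


def post_to_activity(post: dict, position: int) -> dict:
    """Map one saved post to an activity; position is the 1-based post number."""
    caption = post.get("caption")
    caption = caption.strip() if isinstance(caption, str) else ""
    videos = _uris(post.get("video_list"))
    return {
        "title": caption[:MAX_TITLE_CHARS] if caption else f"Saved Post #{position}",
        "kind": "sight",
        "date": parse_taken_at(post.get("taken_at")),
        "notes": caption or None,
        "image_urls": _uris(post.get("photo")) + _uris(post.get("carousel")),
        "video_url": videos[0] if videos else None,
        "location": None,
    }
```

Returning None for an unparseable timestamp, rather than raising, keeps a single odd entry from failing the whole import.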
Enrichment & Commit
Activities are created with:
- ✅ Titles and notes from captions
- ✅ Media URLs from Instagram export
- ❌ Coordinates (null — user adds locations later or uses reenrich)
- ❌ Media enrichment (already have URLs from export)
User can later:
- Edit titles, add locations
- Trigger POST /v1/trips/:tripId/reenrich to geocode locations
- Adjust activity kinds via the UI
API Contract
    // Create Instagram Import
    POST /v1/trips/:tripId/imports
    {
      "sources": [{
        "source_type": "instagram",
        "source_uri": "s3://bucket/trip-attachments/_instagram/:tripId/:importId.zip",
        "source_filename": "instagram_export.zip",
        "source_bytes": 15000000,
        "source_hash": "sha256..."
      }],
      "user_context": "My saved posts for Japan trip"
    }

    // Response: Import queued, worker parses ZIP → returns ItineraryDraft
    GET /v1/imports/:id
    → status: 'parsing' | 'ready' | 'failed'
    → payload: { destinations: [], activities: [...] }

    // Commit: User selects activities
    POST /v1/imports/:id/commit
    {
      "selected_activity_indexes": [0, 1, 3]
    }
    → Activities created in trip with dates, notes, links

File Structure
    apps/
    ├── backend/src/imports/dto/
    │   └── create-import.dto.ts      ← Updated with 'instagram'
    └── workers/
        ├── src/treeper_workers/ai/
        │   ├── ig_parser.py          ← NEW
        │   └── pipeline.py           ← Updated
        └── tests/
            └── test_ig_parser.py     ← NEW (16 tests, all passing)

Verification
All 173 worker tests pass, including:
- 16 new Instagram parser unit tests
- All existing import pipeline tests
- Full integration with enrichment system
Backend TypeScript compiles without errors.
Edge Cases Handled
| Case | Behavior |
|---|---|
| Empty saved posts | Returns empty draft (user can discard import) |
| Missing caption | Default title: “Saved Post #1”, #2, etc. |
| Corrupt ZIP | Error: “Invalid ZIP file” |
| Invalid JSON | Error: JSON parse failure |
| No coordinates | Activities created with location_lat=null |
| Very large ZIP | Supported (Supabase limit 50GB) |
| Duplicate imports | Creates separate sets (user can delete via UI) |
| Enrichment failures | Graceful degradation (activities created without geo) |
Future Enhancements
- Incremental imports: Track already-imported posts to avoid duplicates
- Geocoding from captions: Auto-extract location hints from caption text
- Kind classification: Use LLM to guess activity types from captions
- Realtime sync: Webhook for new saved posts (would need API approval)
Testing Instructions
Run Instagram parser tests:

    cd apps/workers
    uv run pytest tests/test_ig_parser.py -v

Run full test suite:

    uv run pytest tests/ -q

Integration test (local):
- Prepare test ZIP:

      mkdir -p ig_export && cat > ig_export/saved_posts.json <<'EOF'
      [{"caption": "Test post https://example.com", "taken_at": "2026-07-15T06:00:00"}]
      EOF
      zip -r test.zip ig_export

- Upload via local API:

      # Sign upload URL and capture it (replace :tripId with a real trip ID)
      SIGNED_URL=$(curl -s -X POST http://localhost:3000/v1/trips/:tripId/imports/sign-upload \
        -H "Authorization: Bearer $JWT" \
        -H "Content-Type: application/json" \
        -d '{"filename":"ig_export.zip","bytes":1000}' | jq -r .url)
      # Upload ZIP
      curl -X PUT "$SIGNED_URL" --data-binary @test.zip
      # Create import
      curl -X POST http://localhost:3000/v1/trips/:tripId/imports \
        -H "Authorization: Bearer $JWT" \
        -H "Content-Type: application/json" \
        -d '{"sources":[{"source_type":"instagram","source_uri":"...zip"}]}'

- Verify in Supabase:

      SELECT status, error FROM imports WHERE id = '...';
      SELECT payload FROM import_drafts WHERE import_id = '...';
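The API contract also lists source_filename, source_bytes, and source_hash for each source, which the curl example leaves out. A small helper along these lines could fill them in from the local ZIP. This is a sketch under assumptions: build_import_body is a hypothetical name, and source_hash is assumed to be a plain hex SHA-256 digest, since the contract only shows "sha256...".

```python
import hashlib
from pathlib import Path


def build_import_body(zip_path: str, source_uri: str, user_context: str = "") -> dict:
    """Build the JSON body for POST /v1/trips/:tripId/imports from a local ZIP."""
    path = Path(zip_path)
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in 1 MiB chunks so large exports are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    body = {
        "sources": [{
            "source_type": "instagram",
            "source_uri": source_uri,
            "source_filename": path.name,
            "source_bytes": path.stat().st_size,
            # Assumption: the backend expects a bare hex digest.
            "source_hash": digest.hexdigest(),
        }]
    }
    if user_context:
        body["user_context"] = user_context
    return body
```

Serializing the result with json.dumps gives a string that can be passed to the -d flag of the create-import request, with source_uri set to whatever storage path the sign-upload step produced.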