Skip to content

Instagram Saved Posts Export Feature

The Instagram saved posts export feature has been implemented to allow users to extract their saved Instagram content and import it into treeper trips.

  • apps/backend/src/imports/dto/create-import.dto.ts
    • Added 'instagram' to ImportSourceType enum
    • Updated validators to accept Instagram exports
  • apps/workers/src/treeper_workers/ai/ig_parser.py (NEW)

    • Parses Instagram export ZIPs containing saved_posts.json
    • Converts saved posts into activities with:
      • Title from caption (truncated to 200 chars)
      • Notes with full caption
      • Photos from static posts or carousels
      • Video URLs from reels
      • Dates extracted from post timestamps
    • Handles edge cases: missing captions, corrupt files, empty exports
  • apps/workers/src/treeper_workers/ai/pipeline.py

    • Integrated Instagram parser into source dispatcher
    • Maps source_type='instagram' to ZIP parsing
  • apps/workers/tests/test_ig_parser.py (NEW)
    • 16 unit tests covering:
      • Empty and single posts
      • Photos, carousels, and reels
      • Date parsing (multiple ISO formats)
      • Caption truncation and snippet generation
      • Error handling (corrupt ZIPs, missing files)
      • Edge cases (no captions, non-dict entries)
  1. User exports Instagram data: Settings → Security → Download Data
  2. Instagram emails ZIP after 24-48 hours
  3. User uploads ZIP via treeper web UI:
    • POST /v1/trips/:tripId/imports/sign-upload → get signed URL
    • Upload ZIP to Supabase storage
  4. Backend creates import: POST /v1/trips/:tripId/imports
    • source_type: 'instagram'
    • source_uri: 's3://bucket/trip-attachments/_instagram/:tripId/:importId.zip'
  5. Worker parses ZIP → creates ItineraryDraft
  6. User reviews draft, selects activities: POST /imports/:id/commit
  7. Activities stored in trip with dates, captions, and media URLs

Instagram Export Format:

[
{
"caption": "Beautiful sunset...",
"taken_at": "2026-07-15T06:00:00",
"photo": [{"uri": "https://instagram.com/photo.jpg"}],
"carousel": [...],
"video_list": [{"uri": "https://video.mp4"}]
}
]

Parsed Activity:

  • title: Caption (max 200 chars) or “Saved Post #N”
  • kind: ‘sight’ (can be reclassified later)
  • date: Extracted from taken_at timestamp
  • notes: Full caption text
  • image_urls: Photos from post
  • video_url: Reel video link
  • location: null (user can add later)

Activities are created with:

  • ✅ Titles and notes from captions
  • ✅ Media URLs from Instagram export
  • ❌ Coordinates (null — user adds locations later or uses reenrich)
  • ❌ Media enrichment (already have URLs from export)

User can later:

  • Edit titles, add locations
  • Trigger POST /v1/trips/:tripId/reenrich to geocode locations
  • Adjust activity kinds via the UI
// Create Instagram Import
POST /v1/trips/:tripId/imports
{
"sources": [{
"source_type": "instagram",
"source_uri": "s3://bucket/trip-attachments/_instagram/:tripId/:importId.zip",
"source_filename": "instagram_export.zip",
"source_bytes": 15000000,
"source_hash": "sha256..."
}],
"user_context": "My saved posts for Japan trip"
}
// Response: Import queued, worker parses ZIP → returns ItineraryDraft
GET /v1/imports/:id
→ status: 'parsing' | 'ready' | 'failed'
→ payload: { destinations: [], activities: [...] }
// Commit: User selects activities
POST /v1/imports/:id/commit
{
"selected_activity_indexes": [0, 1, 3]
}
→ Activities created in trip with dates, notes, links
apps/
├── backend/src/imports/dto/
│ └── create-import.dto.ts ← Updated with 'instagram'
└── workers/src/treeper_workers/
├── ai/
│ ├── ig_parser.py ← NEW
│ └── pipeline.py ← Updated
└── tests/
└── test_ig_parser.py ← NEW (16 tests, all passing)

All 173 worker tests pass including:

  • 16 new Instagram parser unit tests
  • All existing import pipeline tests
  • Full integration with enrichment system

Backend TypeScript compiles without errors.

CaseBehavior
Empty saved postsReturns empty draft (user can discard import)
Missing captionDefault title: “Saved Post #1”, #2, etc.
Corrupt ZIPError: “Invalid ZIP file”
Invalid JSONError: JSON parse failure
No coordinatesActivities created with location_lat=null
Very large ZIPSupported (Supabase limit 50GB)
Duplicate importsCreates separate sets (user can delete via UI)
Enrichment failuresGraceful degradation (activities created without geo)
  • Incremental imports: Track already-imported posts to avoid duplicates
  • Geocoding from captions: Auto-extract location hints from caption text
  • Kind classification: Use LLM to guess activity types from captions
  • Realtime sync: Webhook for new saved posts (would need API approval)

Run Instagram parser tests:

Terminal window
cd apps/workers
uv run pytest tests/test_ig_parser.py -v

Run full test suite:

Terminal window
uv run pytest tests/ -q

Integration test (local):

  1. Prepare test ZIP:

    Terminal window
    mkdir -p ig_export && cat > ig_export/saved_posts.json <<'EOF'
    [{"caption": "Test post https://example.com", "taken_at": "2026-07-15T06:00:00"}]
    EOF
    zip -r test.zip ig_export
  2. Upload via local API:

    Terminal window
    # Sign upload URL
    curl -X POST http://localhost:3000/v1/trips/:tripId/imports/sign-upload \
    -H "Authorization: Bearer $JWT" \
    -d '{"filename":"ig_export.zip","bytes":1000}' | jq .url
    # Upload ZIP
    curl -X PUT "$SIGNED_URL" --data-binary @test.zip
    # Create import
    curl -X POST http://localhost:3000/v1/trips/:tripId/imports \
    -H "Authorization: Bearer $JWT" \
    -d '{"sources":[{"source_type":"instagram","source_uri":"...zip"}]}'
  3. Verify in Supabase:

    SELECT status, error FROM imports WHERE id = '...';
    SELECT payload FROM import_drafts WHERE import_id = '...';