clawdbites

Safe
Shopping & E-commerce

Extract recipes from Instagram reels.

SKILL.md

# Instagram Recipe Extractor Extract recipes from Instagram reels using a multi-layered approach: 1. **Caption parsing** β€” Instant, check description first 2. **Audio transcription** β€” Whisper (local, no API key) 3. **Frame analysis** β€” Vision model for on-screen text No Instagram login required. Works on public reels. ## When to Use - User sends an Instagram reel link - User mentions "recipe from Instagram" or "save this reel" - User wants to extract recipe details from a video post ## How It Works (MANDATORY FLOW) **ALWAYS follow this complete flow β€” do not stop after caption if instructions are missing:** 1. User sends Instagram reel URL 2. Extract metadata using yt-dlp (`--dump-json`) 3. Parse the caption for recipe details 4. **Check completeness:** Does caption have BOTH ingredients AND instructions? - βœ… **YES:** Present the recipe - ❌ **NO (missing instructions or incomplete):** **Automatically proceed to audio transcription** β€” do NOT stop or ask the user 5. If audio transcription needed: - Download video: `yt-dlp -o "/tmp/reel.mp4" "URL"` - Extract audio: `ffmpeg -y -i /tmp/reel.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 /tmp/reel.wav` - Transcribe: `whisper /tmp/reel.wav --model base --output_format txt --output_dir /tmp` - Merge caption ingredients with audio instructions 6. Present clean, formatted recipe (combining caption + audio as needed) 7. User decides what to do (save to notes, add to wishlist, etc.) **Completeness check heuristics:** - Has ingredients = contains 3+ quantity+item patterns (e.g., "1 cup flour", "2 lbs chicken") - Has instructions = contains action verbs (blend, cook, bake, mix, pour, add) + sequence OR numbered steps ## Extraction Command ```bash yt-dlp --dump-json "https://www.instagram.com/reel/SHORTCODE/" 2>/dev/null ``` **Key fields from JSON output:** - `description` β€” The caption containing the recipe - `uploader` β€” Creator's name - `channel` β€” Creator's handle - `webpage_url` β€” Original URL - `like_count` β€” Popularity indicator ## Recipe Parsing Look for these patterns in the caption: **Macros:** - "X Calories | Xg P | Xg C | Xg F" - "Macros per serving" - "Cal/Protein/Carbs/Fat" **Ingredients:** - Lines starting with quantities (1 cup, 2 tbsp, 24oz) - Lines with measurement units - Emoji bullet points (πŸ₯© 🌽 πŸ§€ etc.) **Sections:** - "For the [component]:" - "Ingredients:" - "Instructions:" - "Directions:" ## Output Format Present extracted recipe cleanly: ``` ## [Recipe Name] *From @[handle]* **Macros (per serving):** X cal | Xg P | Xg C | Xg F ### Ingredients - [ingredient 1] - [ingredient 2] ... ### Instructions 1. [step 1] 2. [step 2] ... --- Source: [original URL] ``` ## User Actions After Extraction Let the user decide what to do: - "Save to my recipes" β†’ Save to Apple Notes (if meal-planner skill available) - "Add to wishlist" β†’ Save to `memory/recipe-wishlist.json` - "Just show me" β†’ Display only, no save - "Plan this for next week" β†’ Hand off to meal-planner skill ## Wishlist Storage Optional storage for recipes user wants to try later: **memory/recipe-wishlist.json:** ```json { "recipes": [ { "name": "Recipe Name", "source": "instagram", "sourceUrl": "https://instagram.com/reel/...", "handle": "@creator", "addedDate": "2026-01-26", "tried": false, "macros": { "calories": 585, "protein": 56, "carbs": 25, "fat": 28, "servings": 3 }, "ingredients": [...], "instructions": [...] } ] } ``` ## Error Handling **If yt-dlp fails:** - Check if URL is valid Instagram reel format - May be a private account β€” inform user - Suggest user paste caption text manually as fallback **If no recipe found in caption (IMPORTANT):** After extracting, scan the caption for recipe indicators: - Ingredient quantities (numbers + units like oz, cups, tbsp, lbs) - Recipe sections ("For the...", "Ingredients:", "Instructions:") - Cooking verbs (bake, cook, sautΓ©, mix, combine) - Macro information (calories, protein, carbs, fat) **If none found, tell the user clearly:** > "I pulled the caption but it doesn't look like the recipe is there β€” it might just be a teaser or the recipe is only shown in the video itself. Here's what the caption says: > > [show caption] > > A few options: > 1. Check the comments β€” sometimes creators post recipes there > 2. Check their bio link β€” might lead to the full recipe > 3. Describe what you saw in the video and I can help find a similar recipe" **Recipe detection heuristics:** ``` HAS_RECIPE if caption contains: - 3+ ingredient-like patterns (quantity + food item) - OR "recipe" + ingredient list - OR macro breakdown + ingredients - OR numbered/bulleted instructions NO_RECIPE if caption is: - Mostly hashtags - Just a description/teaser - Under 100 characters - No quantities or measurements ``` ## Integration with meal-planner The meal-planner skill can reference this skill: - When planning meals, check wishlist for untried recipes - Suggest wishlist recipes that match pantry items - Mark recipes as "tried" after they're used in a meal plan ## Audio Transcription (V2) β€” MANDATORY FALLBACK **When caption is missing instructions, ALWAYS transcribe the audio automatically.** Do not stop and ask the user β€” just do it. This is the most common case since creators often put ingredients in captions but speak the instructions. **Step 1: Download video** ```bash yt-dlp -o "/tmp/reel.mp4" "https://instagram.com/reel/XXX" ``` **Step 2: Extract audio** ```bash ffmpeg -i /tmp/reel.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 /tmp/reel.wav ``` **Step 3: Transcribe with Whisper** ```bash /Users/kylekirkland/Library/Python/3.14/bin/whisper /tmp/reel.wav --model base --output_format txt --output_dir /tmp ``` **Step 4: Parse transcript for recipe** Look for cooking instructions, ingredients mentioned verbally. ## Inference for Missing Measurements **ALWAYS infer quantities when not provided.** Never present a recipe without amounts β€” estimate based on context and standard package sizes. ### Vague Language β†’ Specific Amounts | What they say | Infer | |--------------|-------| | "some chicken" | ~1 lb | | "a bit of garlic" | 2-3 cloves | | "handful of spinach" | ~2 cups | | "drizzle of oil" | 1-2 tbsp | | "season to taste" | Β½ tsp salt, ΒΌ tsp pepper | | "splash of soy sauce" | 1-2 tbsp | | "a few tablespoons" | 2-3 tbsp | | "some rice" | 1 cup dry | | "cheese on top" | Β½ - 1 cup shredded | | "diced onion" | 1 medium onion | | "bell peppers" | 2 peppers | ### Standard Package Sizes (when item mentioned without amount) | Ingredient | Standard Package | Infer | |------------|------------------|-------| | Puff pastry | 17oz sheet | 1 sheet | | Ground beef/turkey | 1 lb pack | 1 lb | | Chicken breast | ~1.5 lb pack | 1.5 lbs | | Sausage links | 14oz / 4-5 links | 1 package | | Bacon | 12oz / 12 slices | Β½ package (6 slices) | | Shredded cheese | 8oz bag | 1-2 cups | | Tortillas | 8-10 count | 1 package | | Canned beans | 15oz can | 1 can | | Broth/stock | 32oz carton | 1-2 cups | | Pasta | 16oz box | 8oz (half box) | | Rice | 2 lb bag | 1-2 cups dry | ### Context-Aware Scaling **By recipe type:** - Stir fry for 2 β†’ 1 lb protein, 4 cups veggies - Soup/stew β†’ 1.5-2 lbs protein, 4 cups broth - Sheet pan meal β†’ 1.5 lbs protein, 3-4 cups veggies - Appetizers β†’ smaller portions, estimate ~12-15 pieces per batch **By servings mentioned:** - "Serves 4" β†’ Scale standard amounts for 4 - "Meal prep for the week" β†’ Assume 5-8 servings - No servings mentioned β†’ Default to 4 servings **By protein target (if user has macro goals):** - 40-50g protein per serving β†’ ~6-8oz cooked meat per portion - Scale recipe protein accordingly ### Output Format Always present inferred amounts clearly: ``` ### Ingredients - 1 lb ground turkey *(estimated)* - 1 medium onion, diced *(estimated)* - 2 cups broth *(estimated based on typical soup)* ``` Mark inferred quantities with *(estimated)* so user knows what came from the source vs inference. ## Combined Extraction Flow ``` 1. TRY CAPTION (instant) └── yt-dlp --dump-json β†’ parse description └── Recipe found? β†’ DONE βœ… └── Check for "pinned" / "in comments" / "check comments" β†’ FLAG 2. IF FLAGGED: CHECK FOR CREATOR COMMENT └── Look through comments for creator's username └── If creator comment found with recipe β†’ DONE βœ… └── If not found β†’ continue + notify user 3. TRY AUDIO (30-60 sec) └── Download video └── Extract audio with ffmpeg └── Transcribe with Whisper (base model) └── Parse transcript for recipe └── Infer missing measurements └── Recipe found? β†’ DONE βœ… 4. PRESENT RESULTS + PROMPT IF NEEDED └── Show what was extracted from audio └── If "pinned" was flagged, tell user: "The creator mentioned the full recipe is pinned in the comments. I extracted what I could from the audio, but if you want the exact measurements, paste the pinned comment here and I'll merge it with what I found." 5. TRY FRAME ANALYSIS (if audio incomplete) └── Extract 5-8 key frames with ffmpeg └── Send to Claude vision └── Ask: "Extract any recipe text, ingredients, or measurements shown" └── Merge findings with audio transcript 6. FALLBACK (nothing found) └── Inform user: "Recipe wasn't in caption or audio/video" └── Offer: search for similar recipe based on video title/description ``` ## Frame Analysis Extract key frames and analyze with vision model. **Extract frames:** ```bash # Extract 1 frame every 5 seconds ffmpeg -i /tmp/reel.mp4 -vf "fps=1/5" /tmp/frame_%02d.jpg # Or extract specific number of frames evenly distributed ffmpeg -i /tmp/reel.mp4 -vf "select='not(mod(n,30))'" -vsync vfr /tmp/frame_%02d.jpg ``` **Send to vision model:** Use Claude's image analysis to read each frame: - Recipe cards / title screens - Ingredient lists shown on screen - Measurements in text overlays - Step-by-step instructions displayed **Vision prompt:** ``` Analyze this frame from a cooking video. Extract any: - Recipe name or title - Ingredients with quantities - Cooking instructions - Nutritional information / macros - Any other recipe-related text shown If no recipe text is visible, respond with "No recipe text found." ``` **Merge strategy:** - Audio transcript = primary source (spoken instructions) - Frame analysis = supplement (exact measurements, recipe cards) - Combine both, prefer specific measurements from visual over inferred from audio ## Pinned Comment Detection Scan caption for these phrases (case-insensitive): - "recipe pinned" - "pinned in comments" - "check comments" - "in the comments" - "comment below" - "recipe below" - "full recipe in comments" If detected, flag and notify user after extraction: > "Heads up β€” the creator said the recipe is pinned in the comments. > I got what I could from the audio, but yt-dlp can't access pinned comments > without login. If you want the exact recipe, copy the pinned comment and > send it to me β€” I'll format it properly." ## Requirements - `yt-dlp` β€” `brew install yt-dlp` - `ffmpeg` β€” `brew install ffmpeg` - `whisper` β€” `pip3 install openai-whisper` (runs locally, no API key) - No Instagram login required for public reels

More in Shopping & E-commerce