--- Frontmatter fields above are primarily for Claude Code / OpenClaw.
Use when user provides a topic and wants an automated video podcast created, OR when user wants to learn/analyze video design patterns from reference videos β handles research, script writing, TTS audio synthesis, Remotion video creation, and final M
Free to install β no account needed
Copy the command below and paste into your agent.
Instant access β’ No coding needed β’ No account needed
What you get in 5 minutes
- Full skill code ready to install
- Works with 3 AI agents
- Lifetime updates included
Description
--- name: video-podcast-maker description: Use when user provides a topic and wants an automated video podcast created, OR when user wants to learn/analyze video design patterns from reference videos β handles research, script writing, TTS audio synthesis, Remotion video creation, and final MP4 output with background music. Also supports design learning from reference videos (learn command), style profile management, and design reference library. Supports Bilibili, YouTube, Xiaohongshu, Douyin, and WeChat Channels platforms with independent language configuration (zh-CN, en-US). argument-hint: "[topic]" effort: high allowed-tools: Bash, Read, Write, Edit, Glob, Grep, WebFetch, WebSearch, Agent # --- Frontmatter fields above are primarily for Claude Code / OpenClaw. # Other agents such as Codex should ignore unknown fields and follow the workflow below. --- author: Agents365-ai category: Content Creation version: 2.0.0 created: 2025-01-27 updated: 2026-04-03 bilibili: https://space.bilibili.com/441831884 github: https://github.com/Agents365-ai/video-podcast-maker dependencies: - remotion-best-practices metadata: openclaw: requires: bins: - python3 - ffmpeg - node - npx primaryEnv: AZURE_SPEECH_KEY emoji: "π¬" homepage: https://github.com/Agents365-ai/video-podcast-maker os: ["macos", "linux"] install: - kind: brew formula: ffmpeg bins: [ffmpeg] - kind: uv package: edge-tts bins: [edge-tts] --- > **REQUIRED: Load Remotion Best Practices First** > > This skill depends on `remotion-best-practices`. **You MUST invoke it before proceeding:** > ``` > Invoke the skill/tool named: remotion-best-practices > ``` # Video Podcast Maker ## Quick Start Open your coding agent and say: **"Make a video podcast about $ARGUMENTS"** Or invoke directly: `/video-podcast-maker AI Agent tutorial` --- ## Design Learning Extract visual design patterns from reference videos or images, store them in a searchable library, and apply them to new video compositions. ### Commands ```bash # Learn from images (use your agent's image analysis capability to analyze design patterns) python3 learn_design.py ./screenshot1.png ./screenshot2.png # Learn from a local video (ffmpeg extracts frames automatically) python3 learn_design.py ./reference.mp4 # Learn from a URL (Playwright captures screenshots β experimental) python3 learn_design.py https://www.bilibili.com/video/BV1xx411c7mD # Save with a named profile and tags python3 learn_design.py ./reference.mp4 --profile "tech-minimal" --tags "tech,minimal,dark" ``` ### Reference Library Commands ``` references list # List all stored references (auto-cleans orphaned entries) references show <id> # Show full design report for a reference references delete <id> # Delete a reference and its files ``` ### Style Profile Commands ``` profiles list # List all saved style profiles profiles show <name> # Show profile props_override profiles delete <name> # Delete a style profile profiles create <name> # Create a new style profile interactively ``` ### Pre-Workflow Usage When the user provides a reference video or image alongside a video creation request, extract design patterns before Step 1 and apply them as session overrides. See `references/workflow-steps.md` β Pre-workflow section for the full extraction flow. ### Step 9 Integration Before choosing visual design in Step 9, check for matching style profiles or reference library entries. Apply the best match as a starting point for Remotion composition props. See `references/workflow-steps.md` β Step 9 Style Profile Integration for the priority chain. --- ## Auto Update Check **Agent behavior:** Check for updates at most once per day (throttled by timestamp file). Before any shell command that reads files from this skill, resolve `SKILL_DIR` to the directory containing `SKILL.md`. If your agent exposes a built-in skill directory variable such as `${CLAUDE_SKILL_DIR}`, you may map it to `SKILL_DIR`. ```bash SKILL_DIR="${SKILL_DIR:-${CLAUDE_SKILL_DIR}}" STAMP="${SKILL_DIR}/.last_update_check" NOW=$(date +%s) LAST=$(cat "$STAMP" 2>/dev/null || echo 0) if [ $((NOW - LAST)) -gt 86400 ]; then timeout 5 git -C "${SKILL_DIR}" fetch --quiet 2>/dev/null || true LOCAL=$(git -C "${SKILL_DIR}" rev-parse HEAD 2>/dev/null) REMOTE=$(git -C "${SKILL_DIR}" rev-parse origin/main 2>/dev/null) echo "$NOW" > "$STAMP" if [ -n "$LOCAL" ] && [ -n "$REMOTE" ] && [ "$LOCAL" != "$REMOTE" ]; then echo "UPDATE_AVAILABLE" else echo "UP_TO_DATE" fi else echo "SKIPPED_RECENT_CHECK" fi ``` - **Update available**: Ask the user whether to pull updates. Yes β `git -C "${SKILL_DIR}" pull`. No β continue. - **Up to date / Skipped**: Continue silently. --- ## Prerequisites Check !`( missing=""; node -v >/dev/null 2>&1 || missing="$missing node"; python3 --version >/dev/null 2>&1 || missing="$missing python3"; ffmpeg -version >/dev/null 2>&1 || missing="$missing ffmpeg"; [ -n "$AZURE_SPEECH_KEY" ] || missing="$missing AZURE_SPEECH_KEY"; if [ -n "$missing" ]; then echo "MISSING:$missing"; else echo "ALL_OK"; fi )` **If MISSING reported above**, see README.md for full setup instructions (install commands, API key setup, Remotion project init). --- ## Overview Automated pipeline for professional **Bilibili horizontal knowledge videos** from a topic. > **Target: Bilibili horizontal video (16:9)** > - Resolution: 3840Γ2160 (4K) or 1920Γ1080 (1080p) > - Style: Clean white (default) **Tech stack:** Coding agent + TTS backend + Remotion + FFmpeg ### Output Specs | Parameter | Horizontal (16:9) | Vertical (9:16) | |-----------|-------------------|-----------------| | **Resolution** | 3840Γ2160 (4K) | 2160Γ3840 (4K) | | **Frame rate** | 30 fps | 30 fps | | **Encoding** | H.264, 16Mbps | H.264, 16Mbps | | **Audio** | AAC, 192kbps | AAC, 192kbps | | **Duration** | 1-15 min | 60-90s (highlight) | --- ## Execution Modes **Agent behavior:** Detect user intent at workflow start: - "Make a video about..." / no special instructions β **Auto Mode** - "I want to control each step" / mentions interactive β **Interactive Mode** - Default: **Auto Mode** ### Auto Mode (Default) Full pipeline with sensible defaults. **Mandatory stop at Step 9:** 1. **Step 9**: Launch Remotion Studio β user reviews in real-time, requests changes until satisfied 2. **Step 10**: Only triggered when user explicitly says "render 4K" / "render final version" | Step | Decision | Auto Default | |------|----------|-------------| | 3 | Title position | top-center | | 5 | Media assets | Skip (text-only animations) | | 7 | Thumbnail method | Remotion-generated (16:9 + 4:3) | | 9 | Outro animation | Pre-made MP4 (white/black by theme) | | 9 | Preview method | Remotion Studio (mandatory) | | 12 | Subtitles | Skip | | 14 | Cleanup | Auto-clean temp files | Users can override any default in their initial request: - "make a video about AI, burn subtitles" β auto + subtitles on - "use dark theme, AI thumbnails" β auto + dark + imagen - "need screenshots" β auto + media collection enabled ### Interactive Mode Prompts at each decision point. Activated by: - "interactive mode" / "I want to choose each option" - User explicitly requests control --- ## Workflow State & Resume > **Planned feature (not yet implemented).** Currently, workflow progress is tracked via the agent's conversation context. If a session is interrupted, re-invoke the skill and inspect existing files in `videos/{name}/` to determine where to resume. --- ## Technical Rules Hard constraints for video production. Visual design remains the agent's creative freedom within these rules: | Rule | Requirement | |------|-------------| | **Single Project** | All videos under `videos/{name}/` in user's Remotion project. NEVER create a new project per video. | | **4K Output** | 3840Γ2160, use `scale(2)` wrapper over 1920Γ1080 design space | | **Content Width** | β₯85% of screen width | | **Bottom Safe Zone** | Bottom 100px reserved for subtitles | | **Audio Sync** | All animations driven by `timing.json` timestamps | | **Thumbnail** | MUST generate 16:9 (1920Γ1080) AND 4:3 (1200Γ900). Centered layout, title β₯120px, icons β₯120px, fill most of canvas. See design-guide.md. | | **Font** | PingFang SC / Noto Sans SC for Chinese text | | **Studio Before Render** | MUST launch `remotion studio` for user review. NEVER render 4K until user explicitly confirms ("render 4K", "render final"). | --- ## Additional Resources Load these files on demand β **do NOT load all at once**: - **[references/workflow-steps.md](references/workflow-steps.md)**: Detailed step-by-step instructions (Steps 1-14). Load at workflow start. - **[references/design-guide.md](references/design-guide.md)**: Visual minimums, typography, layout patterns, checklists. **MUST load before Step 9.** - **[references/troubleshooting.md](references/troubleshooting.md)**: Error fixes, BGM options, preference commands, preference learning. Load on error or user request. - **[examples/](examples/)**: Real production video projects. The agent may reference these for composition structure and `timing.json` format. --- ## Directory Structure ``` project-root/ # Remotion project root βββ src/remotion/ # Remotion source β βββ compositions/ # Video composition definitions β βββ Root.tsx # Remotion entry β βββ index.ts # Exports β βββ public/ # Remotion default (unused β use --public-dir videos/{name}/) β βββ videos/{video-name}/ # Video project assets β βββ workflow_state.json # Workflow progress β βββ topic_definition.md # Step 1 β βββ topic_research.md # Step 2 β βββ podcast.txt # Step 4: narration script β βββ podcast_audio.wav # Step 8: TTS audio β βββ podcast_audio.srt # Step 8: subtitles β βββ timing.json # Step 8: timeline β βββ thumbnail_*.png # Step 7 β βββ output.mp4 # Step 10 β βββ video_with_bgm.mp4 # Step 11 β βββ final_video.mp4 # Step 12: final output β βββ bgm.mp3 # Background music β βββ remotion.config.ts ``` > **Important**: Always use `--public-dir` and full output path for Remotion render: > ```bash > npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4 --public-dir videos/{name}/ > ``` ### Naming Rules **Video name `{video-name}`**: lowercase English, hyphen-separated (e.g., `reference-manager-comparison`) **Section name `{section}`**: lowercase English, underscore-separated, matches `[SECTION:xxx]` **Thumbnail naming** (16:9 AND 4:3 both required): | Type | 16:9 | 4:3 | |------|------|-----| | Remotion | `thumbnail_remotion_16x9.png` | `thumbnail_remotion_4x3.png` | | AI | `thumbnail_ai_16x9.png` | `thumbnail_ai_4x3.png` | ### Public Directory Use `--public-dir videos/{name}/` for all Remotion commands. Each video's assets (timing.json, podcast_audio.wav, bgm.mp3) stay in its own directory β no copying to `public/` needed. This enables parallel renders of different videos. ```bash # All render/studio/still commands use --public-dir npx remotion studio src/remotion/index.ts --public-dir videos/{name}/ npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4 --public-dir videos/{name}/ --video-bitrate 16M npx remotion still src/remotion/index.ts Thumbnail16x9 videos/{name}/thumbnail.png --public-dir videos/{name}/ ``` --- ## Workflow ### Progress Tracking At Step 1 start: 1. Create `videos/{name}/workflow_state.json` 2. Use `TaskCreate` to create tasks per step. Mark `in_progress` on start, `completed` on finish. 3. Each step updates BOTH `workflow_state.json` AND TaskUpdate. ``` 1. Define topic direction β topic_definition.md 2. Research topic β topic_research.md 3. Design video sections (5-7 chapters) 4. Write narration script β podcast.txt 5. Collect media assets β media_manifest.json 6. Generate publish info (Part 1) β publish_info.md 7. Generate thumbnails (16:9 + 4:3) β thumbnail_*.png 8. Generate TTS audio β podcast_audio.wav, timing.json 9. Create Remotion composition + Studio preview (mandatory stop) 10. Render 4K video (only on user request) β output.mp4 11. Mix background music β video_with_bgm.mp4 12. Add subtitles (optional) β final_video.mp4 13. Complete publish info (Part 2) β chapter timestamps 14. Verify output & cleanup 15. Generate vertical shorts (optional) β shorts/ ``` ### Validation Checkpoints **After Step 8 (TTS)**: - [ ] `podcast_audio.wav` exists and plays correctly - [ ] `timing.json` has all sections with correct timestamps - [ ] `podcast_audio.srt` encoding is UTF-8 **After Step 10 (Render)**: - [ ] `output.mp4` resolution is 3840x2160 - [ ] Audio-video sync verified - [ ] No black frames --- ## Key Commands Reference See **CLAUDE.md** for the full command reference (TTS, Remotion, FFmpeg, shorts generation). --- ## User Preference System Skill learns and applies preferences automatically. See [references/troubleshooting.md](references/troubleshooting.md) for commands and learning details. ### Storage Files | File | Purpose | |------|---------| | `user_prefs.json` | Learned preferences (auto-created from template) | | `user_prefs.template.json` | Default values | | `prefs_schema.json` | JSON schema definition | ### Priority ``` Final = merge(Root.tsx defaults < global < topic_patterns[type] < current instructions) ``` ### User Commands | Command | Effect | |---------|--------| | "show preferences" | Show current preferences | | "reset preferences" | Reset to defaults | | "save as X default" | Save to topic_patterns | --- ## Troubleshooting & Preferences > **Full reference:** Read [references/troubleshooting.md](references/troubleshooting.md) on errors, preference questions, or BGM options.
Security Status
Unvetted
Not yet security scanned
Related AI Tools
More Grow Business tools you might like
Clawra Selfie
FreeEdit Clawra's reference image with Grok Imagine (xAI Aurora) and send selfies to messaging channels via OpenClaw
Agent Skills for Context Engineering
FreeA comprehensive collection of Agent Skills for context engineering, multi-agent architectures, and production agent systems. Use when building, optimizing, or debugging agent systems that require effective context management.
Terraform Skill for Claude
FreeUse when working with Terraform or OpenTofu - creating modules, writing tests (native test framework, Terratest), setting up CI/CD pipelines, reviewing configurations, choosing between testing approaches, debugging state issues, implementing security
NotebookLM Research Assistant Skill
FreeUse this skill to query your Google NotebookLM notebooks directly from Claude Code for source-grounded, citation-backed answers from Gemini. Browser automation, library management, persistent auth. Drastically reduced hallucinations through document-
Engineering Advanced Skills (POWERFUL Tier)
Free"25 advanced engineering agent skills and plugins for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw. Agent design, RAG, MCP servers, CI/CD, database design, observability, security auditing, release management, platform ops."
Clawra Selfie
FreeEdit Clawra's reference image with Grok Imagine (xAI Aurora) and send selfies to messaging channels via OpenClaw