Video Knowledge Ingest
Use this skill as the default cross-platform video → transcript → summary → local-knowledge workflow.
Quick start
- Run the bundled entrypoint:
skills/video-knowledge-ingest/scripts/video-ingest.sh "<url-or-local-file>"
- Read the JSON stdout for paths.
- Send the summary back to the user from summary.md.
- Keep the stored files in the local knowledge base; do not move them unless asked.
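A minimal sketch of the "read the JSON stdout for paths" step, in Python. It assumes the entrypoint prints a single JSON document to stdout. The key names of that document are not listed here, so the helper searches by file-name suffix instead of guessing keys. `run_ingest` and `find_paths` are hypothetical helper names, not part of the skill.

```python
import json
import subprocess
from typing import Any, List

# Path of the bundled entrypoint, as given in the quick start.
SCRIPT = "skills/video-knowledge-ingest/scripts/video-ingest.sh"


def run_ingest(source: str, script: str = SCRIPT) -> Any:
    """Run the entrypoint and parse its stdout as JSON (assumed: one JSON document)."""
    result = subprocess.run(
        [script, source], capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)


def find_paths(payload: Any, suffix: str) -> List[str]:
    """Recursively collect every string value that ends with `suffix`."""
    found: List[str] = []
    if isinstance(payload, dict):
        for value in payload.values():
            found.extend(find_paths(value, suffix))
    elif isinstance(payload, list):
        for value in payload:
            found.extend(find_paths(value, suffix))
    elif isinstance(payload, str) and payload.endswith(suffix):
        found.append(payload)
    return found
```

For example, `find_paths(run_ingest(url), "summary.md")` would return the summary path(s) without depending on a particular JSON layout.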
Default knowledge-base root:
/home/jason/.openclaw/workspace/knowledge/video-notes/
What this skill invokes
Core tools in the normal path:
- yt-dlp: resolve metadata, fetch subtitles, or download media
- ffmpeg / ffprobe: normalize audio before transcription
- bundled scripts/whisper-gpu.sh: local Whisper transcription using the workspace GPU venv
- summarize --cli codex: generate the final written summary
- local filesystem: persist transcript, summary, metadata, and index entries
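Since the normal path depends on several external executables, a quick preflight check can save a failed run. The sketch below only tests whether each tool name from the list above resolves on PATH; it does not verify versions or the codex login state. `missing_tools` is a hypothetical helper name.

```python
import shutil
from typing import Iterable, List

# Executables named in the tool list above.
REQUIRED_TOOLS = ("yt-dlp", "ffmpeg", "ffprobe", "summarize", "codex")


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tool names that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found on PATH.")
```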
Platform-specific notes:
- YouTube: prefer subtitles when available; fall back to media download + Whisper
- Bilibili: often falls back to Whisper; the script auto-normalizes bilibili.com/... to www.bilibili.com/... and strips spm_ tracking params
- Xiaohongshu: usually no subtitles; expect media download + Whisper
- Local subtitle/text files: skip download and summarize directly
- Local media files: skip yt-dlp; go straight to Whisper
Workflow
1. Normalize the source
- If the input is a URL, use the bundled normalizer.
- Keep YouTube timing and playlist parameters (t, start, list, index) but drop common tracking params.
- For Bilibili, force www.bilibili.com and remove spm_* query params.
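The normalization rules above can be sketched in Python as follows. This is an illustration, not the bundled normalizer: the set of "common tracking params" (`utm_*`, `si`, `feature`, `fbclid`) is an assumption, and `normalize_url` is a hypothetical name. The video ID used in the usage note is made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameters; the bundled normalizer may use a different list.
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"si", "feature", "fbclid"}


def normalize_url(url: str) -> str:
    """Drop tracking params; for Bilibili, force www and remove spm_* params."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == "bilibili.com":
        host = "www.bilibili.com"
    is_bilibili = host.endswith("bilibili.com")

    kept = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key.startswith(TRACKING_PREFIXES) or key in TRACKING_NAMES:
            continue  # generic tracking parameter
        if is_bilibili and key.startswith("spm_"):
            continue  # Bilibili share-link tracking
        kept.append((key, value))  # t, start, list, index, v, p, ... survive

    return urlunsplit(
        (parts.scheme, host, parts.path, urlencode(kept), parts.fragment)
    )
```

With these assumptions, `normalize_url("https://bilibili.com/video/BV1xx?spm_id_from=333.1&p=2")` yields `https://www.bilibili.com/video/BV1xx?p=2`, and a YouTube URL keeps its `t` value while losing `utm_source`.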
2. Try subtitles first
- Run yt-dlp in subtitle-only mode.
- Prefer zh.* and en.* subtitles.
- Treat subtitle download as best effort.
- If any usable .srt / .vtt file lands, continue with that file even if another subtitle variant returned a non-zero exit code.
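A rough Python sketch of the best-effort subtitle step. It assumes the script's subtitle-only mode is roughly equivalent to the yt-dlp flags shown (the exact flags the bundled script passes are not stated here), that "usable" means a non-empty file, and that yt-dlp's usual `<name>.<lang>.<ext>` naming applies. The function names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def subtitle_command(url: str, out_dir: str) -> List[str]:
    """Build a subtitle-only yt-dlp invocation (assumed flags, no media download)."""
    return [
        "yt-dlp", "--skip-download",
        "--write-subs", "--write-auto-subs",
        "--sub-langs", "zh.*,en.*",
        "-P", out_dir,
        url,
    ]


def language_of(path: Path) -> str:
    """Extract the language tag from a name like 'title.zh-Hans.vtt'."""
    suffixes = path.suffixes
    return suffixes[-2].lstrip(".").lower() if len(suffixes) >= 2 else ""


def pick_subtitle(out_dir: str) -> Optional[Path]:
    """Return the preferred non-empty .srt/.vtt file, or None if none landed."""
    root = Path(out_dir)
    if not root.is_dir():
        return None
    candidates = [
        p for p in root.iterdir()
        if p.suffix in (".srt", ".vtt") and p.stat().st_size > 0
    ]

    def rank(p: Path) -> int:
        lang = language_of(p)
        return 0 if lang.startswith("zh") else 1 if lang.startswith("en") else 2

    candidates.sort(key=lambda p: (rank(p), p.name))
    return candidates[0] if candidates else None


def fetch_subtitles(url: str, out_dir: str) -> Optional[Path]:
    """Best effort: ignore the exit code and look at what actually landed."""
    subprocess.run(subtitle_command(url, out_dir), check=False)
    return pick_subtitle(out_dir)
```

The design point is `check=False`: a failed variant (for example a 429 on one language) does not discard a subtitle file that was written successfully.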
3. Fall back to media + Whisper
If no usable subtitles land:
- Download best audio/media with yt-dlp
- Transcribe with bundled scripts/whisper-gpu.sh
- If GPU transcription fails, the script falls back to CPU automatically
4. Summarize
- Summarize the resulting transcript with summarize --cli codex --force-summary
- Expect codex to be installed and logged in, or configure the summarize backend another way before use
5. Persist results
For each ingested item, keep these files:
- source.url or source.path
- source.info.json
- downloads/ (when remote media/subtitles are fetched)
- whisper/ (when Whisper was used)
- transcript.txt
- summary.md
- record.json
- global append-only index: knowledge/video-notes/index.jsonl
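An append-only JSONL index holds one JSON object per line, and new entries are only ever added at the end. The Python sketch below shows that pattern in isolation; the record fields in the usage note are hypothetical, since the actual contents of record.json and index.jsonl are not specified here.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def append_index(index_path: str, record: Dict[str, Any]) -> None:
    """Append one record as a single JSON line; existing lines are never rewritten."""
    path = Path(index_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_index(index_path: str) -> List[Dict[str, Any]]:
    """Read all records in insertion order, skipping blank lines."""
    path = Path(index_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

For instance, `append_index("knowledge/video-notes/index.jsonl", {"title": "...", "dir": "..."})` adds one line, and `read_index(...)` returns the entries oldest first.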
Common commands
Remote URL:
skills/video-knowledge-ingest/scripts/video-ingest.sh "https://www.youtube.com/watch?v=..."
skills/video-knowledge-ingest/scripts/video-ingest.sh "https://bilibili.com/video/BV..."
skills/video-knowledge-ingest/scripts/video-ingest.sh "https://www.xiaohongshu.com/explore/..."
Local files:
skills/video-knowledge-ingest/scripts/video-ingest.sh /path/to/file.srt
skills/video-knowledge-ingest/scripts/video-ingest.sh /path/to/file.mp4
Custom output root:
skills/video-knowledge-ingest/scripts/video-ingest.sh "<source>" --kb-root /some/other/root
When to read bundled references
Read references/toolchain.md when you need:
- dependency details
- exact file layout
- how each tool is used in the pipeline
Read references/troubleshooting.md when you hit:
- YouTube anti-bot / cookies issues
- Bilibili 403 on shared links
- subtitle 429 / partial subtitle failures
- Xiaohongshu subtitle absence
- summarize / codex auth failures
- Whisper venv, CUDA, ffmpeg, or yt-dlp problems
Operating rules
- Prefer the bundled scripts/video-ingest.sh entrypoint over re-implementing the workflow.
- Do not skip the local knowledge-base write unless explicitly asked.
- When a run fails, inspect the generated directory before declaring total failure; partial artifacts often explain the real issue.
- If a platform provides subtitles, prefer them over Whisper.
- If subtitles are absent or unusable, fall back to media + Whisper automatically.
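To make "inspect the generated directory" concrete, here is a small Python sketch that reports which of the persisted files exist and are non-empty, then guesses the stage that likely failed. The mapping from missing files to stages is an assumption drawn from the workflow order, and both function names are hypothetical.

```python
from pathlib import Path
from typing import Dict, Optional

# Per-item files listed under "Persist results".
EXPECTED = ("source.info.json", "transcript.txt", "summary.md", "record.json")


def artifact_status(item_dir: str) -> Dict[str, bool]:
    """Map each expected file name to True if it exists and is non-empty."""
    root = Path(item_dir)
    return {
        name: (root / name).is_file() and (root / name).stat().st_size > 0
        for name in EXPECTED
    }


def likely_failed_stage(status: Dict[str, bool]) -> Optional[str]:
    """Guess the first stage whose output is missing (assumed ordering)."""
    if not status["transcript.txt"]:
        return "subtitles, download, or transcription"
    if not status["summary.md"]:
        return "summarize"
    if not status["record.json"]:
        return "persist"
    return None
```

A directory that holds transcript.txt but no summary.md, for example, points at the summarize step (often a codex auth problem) and not at the download.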