A tool for understanding podcasts and long-form videos.
Podcast-Agent turns YouTube podcasts and long-form videos into structured reports for faster understanding and analysis.
Podcast-Agent is useful when you want to:
- Quickly understand what a podcast or long-form video is about.
- Generate shareable reports in multiple formats for easier distribution.
- Save intermediate artifacts for review, debugging, or downstream analysis.
Current input support is centered on YouTube videos.
Podcast-Agent is organized around four core layers:
- Content Ingestion: Capture essential podcast and video elements, including metadata, transcripts, and contextual signals.
- Semantic Extraction: Analyze raw content around the user's question to identify relevant evidence, key moments, and meaningful context.
- Insight Structuring: Organize extracted information into core viewpoints, logical relationships, and a coherent analytical framework.
- Report Generation: Assemble metadata, evidence, viewpoints, and summaries into a polished structured report for fast understanding.
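The flow through the four layers can be pictured with a toy sketch. Everything in it is hypothetical and for illustration only: `PipelineContext`, the stage functions, and the hard-coded transcript are not Podcast-Agent's actual API, and the keyword match merely stands in for the real semantic extraction.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineContext:
    """Hypothetical container handed from one layer to the next."""
    url: str
    question: str
    transcript: list[str] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)
    viewpoints: list[str] = field(default_factory=list)
    report: str = ""


def _words(text):
    # Lowercase the text and strip basic punctuation from each word.
    return {w.lower().strip("?.,") for w in text.split()}


def ingest(ctx):
    # Content Ingestion stand-in: a real run would fetch metadata and a transcript.
    ctx.transcript = [
        "Hosts discuss pricing.",
        "Hosts discuss hiring.",
        "Pricing changed in Q2.",
    ]
    return ctx


def extract(ctx):
    # Semantic Extraction stand-in: keep lines sharing a word with the question.
    question_words = _words(ctx.question)
    ctx.evidence = [line for line in ctx.transcript if question_words & _words(line)]
    return ctx


def structure(ctx):
    # Insight Structuring stand-in: number each piece of evidence as a viewpoint.
    ctx.viewpoints = [f"Point {i}: {line}" for i, line in enumerate(ctx.evidence, 1)]
    return ctx


def render(ctx):
    # Report Generation stand-in: assemble a small Markdown report.
    lines = [f"# Report for {ctx.url}", f"Question: {ctx.question}", *ctx.viewpoints]
    ctx.report = "\n".join(lines)
    return ctx


def run_pipeline(url, question):
    ctx = PipelineContext(url=url, question=question)
    for stage in (ingest, extract, structure, render):
        ctx = stage(ctx)
    return ctx
```

Passing a single context object through every stage mirrors the idea of keeping intermediate artifacts available for review, since each layer's output stays attached to the context after the run.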
src/podcast_agent/
├── sources/ Source detection and source adapters
├── elements/ Metadata, transcript fetching, and formatting
├── transcribers/ Audio transcription fallback
├── insights/ Evidence, outline, viewpoint, and summary generation
├── pipeline/ Pipeline orchestration, context, and artifact handling
├── reports/ Markdown, HTML, PDF, and Xiaohongshu report rendering
└── cli/ Command-line entry points
Podcast-Agent requires Python 3.10 or later.
python -m venv .venv
.venv/bin/pip install -e ".[dev]"
Create a local environment file:
cp .env.example .env
Then fill in the configuration needed for your runtime, such as LLM credentials, YouTube cookies, Aliyun ASR, OSS, and the ffmpeg binary path.
Run the bundled batch script with the default example cases:
scripts/run-full-batch.sh
To use a custom cases file, output directory, or concurrency level:
CASES_PATH=examples/full-report-cases.json \
OUTPUT_ROOT=output \
MAX_JOBS=3 \
scripts/run-full-batch.sh
The final reports are generated at:
output/<case-id>/reports/report.md
output/<case-id>/reports/report.html
output/<case-id>/reports/report.pdf
output/<case-id>/reports/xhs/images/
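After a batch run it can be handy to check which reports exist for a case. The sketch below builds the four paths listed above and reports the missing ones. The helper names `expected_report_paths` and `missing_reports` are made up for this example and are not part of Podcast-Agent.

```python
from pathlib import Path


def expected_report_paths(output_root, case_id):
    """Return the report locations listed above for one case."""
    reports = Path(output_root) / case_id / "reports"
    return {
        "markdown": reports / "report.md",
        "html": reports / "report.html",
        "pdf": reports / "report.pdf",
        "xhs_images": reports / "xhs" / "images",
    }


def missing_reports(output_root, case_id):
    """Return the sorted names of expected outputs that are not on disk."""
    paths = expected_report_paths(output_root, case_id)
    return sorted(name for name, path in paths.items() if not path.exists())
```

For example, `missing_reports("output", "<case-id>")` returns an empty list once all four outputs are present.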
Run the full pipeline from the command line:
.venv/bin/podcast-agent full \
--url "https://www.youtube.com/watch?v=<video-id>" \
--question "Your question about the video" \
--output-dir output/my-report
For a complete command reference, see CLI Usage Guide.
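If you prefer to drive the CLI from a Python script, one option is to wrap the command above with `subprocess`. This is a minimal sketch: it uses only the `full` subcommand and the three flags shown above, and the helpers `build_full_command` and `run_full` are hypothetical names, not something the package provides.

```python
import subprocess


def build_full_command(url, question, output_dir,
                       executable=".venv/bin/podcast-agent"):
    """Build the argument list for the `full` command shown above."""
    return [
        executable, "full",
        "--url", url,
        "--question", question,
        "--output-dir", output_dir,
    ]


def run_full(url, question, output_dir):
    """Run the pipeline; raises CalledProcessError on a non-zero exit code."""
    command = build_full_command(url, question, output_dir)
    return subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the question contains spaces or quotes.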