Turn Podcasts Into Shorts With AI
A 90-minute podcast does not turn into Shorts on its own. AI Podcast Clipper finds the moments that hold up in a 60-second window and ships them captioned and vertical.
Manual clipping fails on long-form conversation
Podcast hosts move between setup, joke, and payoff. Picking a clip that still lands without its surrounding context is a skill of its own, and it does not scale across an entire show.
- Most highlight tools target keynote talks, not back-and-forth dialogue.
- Hand-editing a single Short can take 20-30 minutes once you include cropping and captioning.
- Publishing in a second language doubles the manual cost without adding new highlights.
From upload to published-ready in one pass
- 1. Drop the long episode in
Upload a podcast .mp4 of up to 900 MB. There is no need to pre-edit or trim - the AI handles the cut.
- 2. AI scores Q&A density
Gemini 2.5 reads the transcript and ranks 40-60 second segments in which a question gets a clear answer.
- 3. Captions and 9:16 framing run together
WhisperX timing and Columbia ASD face tracking happen in the same pass, not as separate exports.
- 4. Review and download
Each clip renders to S3 and is downloadable through a presigned URL inside the dashboard.
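The ranking step above can be pictured with a small sketch. It is not the product's actual code: the `Segment` type, the `score` field (standing in for whatever Q&A score the model assigns), and the greedy non-overlap rule are all assumptions made for illustration. Only the 40-60 second window comes from the description above.

```python
from dataclasses import dataclass

MIN_LEN_S = 40.0  # lower bound of the clip window described above
MAX_LEN_S = 60.0  # upper bound of the clip window described above


@dataclass(frozen=True)
class Segment:
    start: float  # seconds from the start of the episode
    end: float    # seconds from the start of the episode
    score: float  # hypothetical Q&A score; higher means a better highlight

    @property
    def duration(self) -> float:
        return self.end - self.start


def overlaps(a: Segment, b: Segment) -> bool:
    """True when the two segments share any stretch of time."""
    return a.start < b.end and b.start < a.end


def pick_top_segments(candidates: list[Segment], limit: int) -> list[Segment]:
    """Keep 40-60 s candidates, then greedily take the best non-overlapping ones."""
    eligible = [s for s in candidates if MIN_LEN_S <= s.duration <= MAX_LEN_S]
    chosen: list[Segment] = []
    for seg in sorted(eligible, key=lambda s: s.score, reverse=True):
        if len(chosen) == limit:
            break
        # Skip a candidate that would repeat footage from a higher-scored clip.
        if not any(overlaps(seg, kept) for kept in chosen):
            chosen.append(seg)
    return chosen
```

Taking candidates in score order and rejecting overlaps keeps two clips from covering the same exchange; a real pipeline might instead merge or re-score overlapping candidates.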
One export, every short-form surface
Vertical mp4 with burned-in captions ready for the YouTube Shorts shelf.
The same export feeds Instagram Reels, with no extra crop or caption pass needed.
Drop the file into TikTok directly. Captions are already burned in for sound-off viewers.
Use the highlight clips as the cold open of a long-form video on any platform.
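For readers curious what vertical framing involves, here is a minimal sketch of the crop arithmetic, assuming a landscape source frame and a face center supplied by some tracker. The function name `crop_window_9x16` and its parameters are hypothetical, and the page does not describe how the real pipeline computes its crop.

```python
def crop_window_9x16(frame_w: int, frame_h: int, face_center_x: float) -> tuple[int, int, int, int]:
    """Return (x, y, width, height) of a full-height 9:16 crop centered on the face."""
    crop_h = frame_h
    # Round the width down to an even number, which many video encoders require.
    crop_w = (frame_h * 9 // 16) // 2 * 2
    if crop_w > frame_w:
        raise ValueError("source frame is narrower than 9:16")
    # Center the window on the face, then clamp it so it stays inside the frame.
    x = round(face_center_x - crop_w / 2)
    x = max(0, min(x, frame_w - crop_w))
    return x, 0, crop_w, crop_h
```

For a 1920x1080 source with the speaker centered at x = 960, this yields a 606x1080 window starting at x = 657. Scaling that window up to the final vertical resolution would be a separate step.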
Need a YouTube-specific workflow?
If YouTube Shorts is the primary channel, use the YouTube Shorts generator page for Shorts-specific requirements and review steps.
Frequently asked questions
- Does this only work for YouTube Shorts?
- No. The output is a 1080x1920 vertical mp4 with burned-in captions, which is the same format Instagram Reels and TikTok expect. One run gives you a clip you can publish on all three platforms.
- Will the AI cut clips at the wrong place?
- Highlights are scored on Q&A boundaries, not arbitrary timestamps. The pipeline preserves full sentence boundaries so the clip starts and ends on a natural beat.
- What happens to original audio quality?
- Audio is carried over from the source mp4. Only captions and vertical framing are added on top, and the audio is not re-encoded beyond what the export step requires.
- Can I generate clips in Korean?
- Yes. Choose Korean as the caption language for that processing run. The output is a Korean-captioned vertical mp4 styled with Noto Sans KR.
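The sentence-boundary behavior mentioned in the clipping answer above can be illustrated with a short sketch. It assumes the transcript provides sorted lists of sentence start and end times in seconds. The function name `snap_to_sentences` and the widen-outward rule are illustrative assumptions, not the pipeline's documented method.

```python
from bisect import bisect_left, bisect_right


def snap_to_sentences(
    start: float,
    end: float,
    sentence_starts: list[float],
    sentence_ends: list[float],
) -> tuple[float, float]:
    """Widen [start, end] so it opens on a sentence start and closes on a sentence end.

    Both timestamp lists must be sorted in ascending order.
    """
    # Latest sentence start at or before the proposed clip start.
    i = bisect_right(sentence_starts, start) - 1
    snapped_start = sentence_starts[i] if i >= 0 else start
    # Earliest sentence end at or after the proposed clip end.
    j = bisect_left(sentence_ends, end)
    snapped_end = sentence_ends[j] if j < len(sentence_ends) else end
    return snapped_start, snapped_end
```

With sentence starts `[0.0, 4.2, 9.8, 15.0]` and ends `[4.0, 9.5, 14.6, 21.3]`, a proposed clip of 5.0-13.0 s widens to 4.2-14.6 s. Because snapping only ever widens the clip, a real implementation would also need to re-check the result against the maximum clip length.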