AI Podcast Clipper
Features

Podcast Clipper Features Built for Short-Form Video Workflows

Highlight detection, word-level captions, vertical framing, selectable caption language, and a single dashboard to review every result.

Capabilities

Six pieces that replace a five-tab workflow

Each feature is automated end-to-end so you never need to leave the app for a separate transcription or cropping tool.

LLM planning
AI Q&A Clipping
Gemini 2.5 reads word-level transcripts and plans 40 to 60 second question-and-answer clips that start and end on full sentence boundaries.
  • 1 to 4 clips per upload, controlled at submit time.
  • Highlights are scored on conversational tension, not pure keyword density.
  • Sentence boundaries respected so playback never feels abrupt.
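As a rough illustration of the sentence-boundary rule, the sketch below extends a clip from a chosen start word to the last sentence end that falls inside the 40 to 60 second window. It is a minimal sketch, not the app's planner: the `Word` type, the punctuation-based sentence test, and both function names are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds

SENTENCE_END = (".", "?", "!")  # assumption: sentences end on these marks

def validate_clip_count(n):
    """Accept only the 1 to 4 clips per upload that the page describes."""
    if n not in (1, 2, 3, 4):
        raise ValueError("clip count must be between 1 and 4")
    return n

def snap_to_sentences(words, start_idx, min_len=40.0, max_len=60.0):
    """Return (start, end) ending on the last sentence end within the window.

    Returns None when no sentence ends between min_len and max_len seconds
    after the first word of the clip.
    """
    clip_start = words[start_idx].start
    best_end = None
    for word in words[start_idx:]:
        duration = word.end - clip_start
        if duration > max_len:
            break
        if duration >= min_len and word.text.endswith(SENTENCE_END):
            best_end = word.end
    return None if best_end is None else (clip_start, best_end)
```

Preferring the latest sentence end in the window keeps the answer as complete as possible; a planner could just as well pick the earliest one for tighter clips.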
Word-level
WhisperX Word Subtitles
WhisperX large-v2 transcribes English audio and aligns every word to precise start and end timings.
  • Word JSON makes downstream recuts and syncing painless.
  • Caption timing matches actual speech, not paragraph guesses.
  • Foundation for English captions or Korean translation, depending on the selected run language.
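To show why word-level timing helps downstream, here is a small sketch that groups timed words into short caption lines. It assumes each entry is a dict with `word`, `start`, and `end` keys (the shape is an assumption for this example, as are the function names and the default limits), and it skips nothing, so entries without timings would need filtering first.

```python
def _flush(chunk):
    """Merge a run of timed words into one caption entry."""
    return {
        "text": " ".join(w["word"] for w in chunk),
        "start": chunk[0]["start"],
        "end": chunk[-1]["end"],
    }

def group_captions(words, max_words=4, max_gap=0.6):
    """Split words into captions by word count and by pauses in speech.

    A new caption starts when the current one already holds max_words,
    or when the silence before the next word exceeds max_gap seconds.
    """
    captions = []
    current = []
    for w in words:
        if current:
            gap = w["start"] - current[-1]["end"]
            if len(current) >= max_words or gap > max_gap:
                captions.append(_flush(current))
                current = []
        current.append(w)
    if current:
        captions.append(_flush(current))
    return captions
```

Because every caption carries the start of its first word and the end of its last, a recut only has to shift timestamps rather than re-transcribe.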
Face-aware
Auto Vertical Framing
Face tracks from Columbia active speaker detection (ASD) steer each 1080x1920 crop, with a blurred-background layout as the alternative. Clips are rendered with NVENC at 25 fps.
  • Active speaker detection per frame so the camera follows the right person.
  • Falls back to blurred backdrop when the face track is uncertain.
  • Output is publish-ready for YouTube Shorts, Reels, and TikTok.
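The crop-or-blur decision can be pictured with a little arithmetic. The sketch below computes a 9:16 crop window centred on the tracked face and signals the blurred-backdrop fallback when the track is uncertain. The function name, the confidence threshold, and the even-width rounding are assumptions for illustration, not details of the actual renderer.

```python
def vertical_crop(src_w, src_h, face_cx, confidence, min_conf=0.5):
    """Return (x, y, width, height) of a 9:16 crop, or None for the blur fallback.

    face_cx is the horizontal centre of the active speaker's face in pixels.
    None means: do not crop, use the blurred backdrop layout instead.
    """
    if face_cx is None or confidence < min_conf:
        return None
    # Full-height crop at a 9:16 aspect ratio, capped at the source width.
    crop_w = min(src_w, (src_h * 9) // 16)
    crop_w -= crop_w % 2  # keep the width even, which video encoders often expect
    # Centre on the face, then clamp so the window stays inside the frame.
    x = int(round(face_cx - crop_w / 2))
    x = max(0, min(x, src_w - crop_w))
    return (x, 0, crop_w, src_h)
```

For a 1920x1080 source this yields a 606x1080 window, which would then be scaled up to 1080x1920 for the final clip.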
Caption language
English or Korean Captions
Each processing run uses one selected caption language. English captions are sourced from WhisperX; Korean captions come from Gemini translation.
  • Anton style for English emphasis lines.
  • Noto Sans KR style for Korean lines.
  • Choose English or Korean before starting the run.
Signed URLs
Secure S3 Storage
Originals and clips live in a dedicated S3 bucket. The app fetches them only through AWS presigned URLs.
  • Per-user prefixes keep uploads isolated.
  • Presigned URLs expire in 1 hour by default.
  • Cleanup routines remove abandoned drafts.
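For readers curious how per-user prefixes and one-hour links fit together, here is a minimal sketch built on boto3's `generate_presigned_url`. The `user_id/filename` key layout, the helper names, and the bucket name in the usage note are assumptions; the page does not describe the real key scheme.

```python
def user_key(user_id, filename):
    """Build an object key under a per-user prefix (layout is an assumption)."""
    if not user_id or "/" in user_id:
        raise ValueError("user_id must be a non-empty string without slashes")
    return f"{user_id}/{filename}"

def presign_download(s3_client, bucket, user_id, filename, expires_in=3600):
    """Return a time-limited GET URL for one object; 3600 s matches the 1 hour default."""
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": user_key(user_id, filename)},
        ExpiresIn=expires_in,
    )
```

With boto3 installed and credentials configured, a call might look like `presign_download(boto3.client("s3"), "example-bucket", "user-123", "clip-1.mp4")`. Passing the client in keeps the helper easy to test with a stub.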
Dashboard
Dashboard Review Loop
Upload, request processing, review the clip list, play, download, and delete clips from a single view.
  • Status moves from queued to processing to processed without page reloads.
  • Per-clip download and delete actions.
  • Recoverable upload drafts in case the tab closes mid-flow.
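The status flow above can be modelled as a tiny one-way state machine. This is only a sketch of the three states the page names; the `Status` enum and `advance` helper are hypothetical, and any additional states the app may track are not shown.

```python
from enum import Enum

class Status(str, Enum):
    QUEUED = "queued"
    PROCESSING = "processing"
    PROCESSED = "processed"

# Each status has at most one successor; PROCESSED is terminal.
NEXT_STATUS = {
    Status.QUEUED: Status.PROCESSING,
    Status.PROCESSING: Status.PROCESSED,
}

def advance(status):
    """Return the next status, or raise if the run is already finished."""
    if status not in NEXT_STATUS:
        raise ValueError(f"{status.value} is a terminal status")
    return NEXT_STATUS[status]
```

A dashboard that polls or subscribes for updates can render whichever status it receives without reloading the page.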
Manual vs automated

Where the time actually goes

Manual short-form workflows fan out into multiple tools. The AI pipeline collapses them into one upload.

Capability | Manual workflow | AI Podcast Clipper
Find highlight moments | Scrub through hours of audio and timestamp by hand. | Gemini 2.5 picks 1-4 Q&A moments per upload.
Add word-level captions | Hand-time captions or use a generic auto-captioner. | WhisperX word timings burned into the clip automatically.
Convert horizontal to vertical | Manually crop and reposition every cut. | Face-aware Columbia ASD crop with blurred backdrop fallback.
Choose caption language | Re-cut or re-caption manually when changing language. | English or Korean captions are selected per processing run.

Frequently asked questions

How many clips does each run produce?
You choose 1, 2, 3, or 4 clips per upload. The AI selects the strongest Q&A moments and produces that many vertical clips.
Is Korean captioning the same quality as English?
English captions come directly from WhisperX with word-level timing. Korean captions are produced by Gemini translation and styled with Noto Sans KR. Both are usable for publishing, but English captions will track speech more tightly.
Where are uploads and clips stored?
All originals and generated clips live in a dedicated AWS S3 bucket under per-user prefixes. The app only ever exposes them through short-lived presigned URLs.
What is the file size limit?
Uploads are capped at 900 MB per .mp4. Long episodes still work, but very large files should be exported at a moderate bitrate before upload.
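To make "moderate bitrate" concrete, the sketch below checks a file against the 900 MB cap and estimates the highest average bitrate that keeps an episode of a given length under it. It assumes 1 MB means 1,000,000 bytes and that only .mp4 files are accepted; both are assumptions, as are the function names.

```python
MAX_UPLOAD_BYTES = 900 * 1_000_000  # assumption: decimal megabytes

def check_upload(filename, size_bytes):
    """Return None when the upload looks acceptable, else a short error message."""
    if not filename.lower().endswith(".mp4"):
        return "Only .mp4 uploads are supported."
    if size_bytes > MAX_UPLOAD_BYTES:
        return "File exceeds the 900 MB upload limit."
    return None

def max_avg_bitrate_bps(duration_seconds):
    """Highest average total bitrate (bits per second) that fits under the cap."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return MAX_UPLOAD_BYTES * 8 / duration_seconds
```

Under these assumptions, a two-hour episode (7,200 seconds) fits at roughly 1 Mbps of combined video and audio, while a one-hour episode has about twice that budget.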