
Supported input formats

The clipping studio accepts most common video and audio files. This page focuses on what works well in practice, which is more useful than the full list of what's technically supported.

File types

Type    Extensions                 Notes
Video   .mp4, .mov, .webm, .mkv    MP4 (H.264) is the safest choice
Audio   .mp3, .wav, .m4a           Audio-only sources render as a waveform clip
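
If you want to screen files before uploading, the table above can be expressed as a small lookup. This is a minimal sketch, not part of the product: the function name classify_source is hypothetical, and it checks only the file extension, so it can't confirm the codec inside the container.

```python
from pathlib import Path

# Extensions taken from the table above.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm", ".mkv"}
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a"}


def classify_source(filename):
    """Return "video", "audio", or None for an unlisted extension.

    Hypothetical helper: it looks at the extension only, not the codec.
    """
    ext = Path(filename).suffix.lower()
    if ext in VIDEO_EXTENSIONS:
        return "video"
    if ext in AUDIO_EXTENSIONS:
        return "audio"
    return None


for name in ["episode.MP4", "interview.m4a", "stream.m3u8"]:
    print(name, classify_source(name))
```

An unlisted extension returns None, which is a cue to convert the file (MP4 with H.264 is the safest target) before uploading.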

Length

There's no hard upper limit, but two things scale with source length:

  • Transcription cost — Deepgram bills per minute of source.
  • Time-to-first-clip — a 60-minute episode usually finishes transcribing in 2–4 minutes; a 3-hour episode can take 10+ minutes.
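
Because billing is per minute of source, cost grows linearly with length. The sketch below only illustrates that arithmetic. The rate is a placeholder, not a real Deepgram price, and the function name is hypothetical.

```python
from decimal import Decimal


def transcription_cost(source_minutes, rate_per_minute):
    """Per-minute billing: cost is minutes multiplied by the rate."""
    return Decimal(source_minutes) * Decimal(rate_per_minute)


# Placeholder rate for illustration only; substitute your actual rate.
PLACEHOLDER_RATE = "0.01"

pilot = transcription_cost(30, PLACEHOLDER_RATE)          # 30-minute pilot
long_episode = transcription_cost(180, PLACEHOLDER_RATE)  # 3-hour episode

# Whatever the rate, the 3-hour file costs six times the 30-minute pilot.
print(long_episode / pilot)
```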

For pilots, start with a 20–45 minute source so you can see how the highlight selector performs on your content before committing credits to a long file.

Resolution and aspect ratio

We render output clips at 1080×1920 (9:16) regardless of the source, so the source resolution doesn't need to match. The renderer crops to a vertical frame, following face tracking when it's enabled and otherwise centering the crop on the frame.
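
To make the crop geometry concrete, here is a rough sketch of how a 9:16 window can be placed inside a source frame. It is an illustration under stated assumptions, not the renderer's actual algorithm. The names vertical_crop and upscale_factor are hypothetical, and center_x stands in for whatever horizontal position face tracking would supply.

```python
def vertical_crop(src_w, src_h, center_x=None, aspect_w=9, aspect_h=16):
    """Return (x, y, w, h) of the largest 9:16 window inside the source.

    center_x is the desired horizontal center of the crop (for example a
    tracked face position). When omitted, the crop is centered on the frame.
    """
    if src_w * aspect_h >= src_h * aspect_w:
        # Source is at least as wide as 9:16: keep full height, trim sides.
        h = src_h
        w = src_h * aspect_w // aspect_h
    else:
        # Source is narrower than 9:16: keep full width, trim top and bottom.
        w = src_w
        h = src_w * aspect_h // aspect_w
    if center_x is None:
        center_x = src_w / 2
    # Clamp so the window never leaves the frame.
    x = max(0, min(int(center_x - w / 2), src_w - w))
    y = (src_h - h) // 2
    return x, y, w, h


def upscale_factor(crop_h, out_h=1920):
    """How much the cropped window is enlarged to reach the output height."""
    return out_h / crop_h


print(vertical_crop(1920, 1080))                 # centered crop of a 1080p source
print(vertical_crop(1920, 1080, center_x=1900))  # subject near the right edge
print(round(upscale_factor(1080), 2), round(upscale_factor(720), 2))
```

A 16:9 source leaves a wide margin on either side of the window, which is the room face tracking has to move in. The upscale factor also grows as the source height shrinks, so lower-resolution sources are stretched further to fill the 1080×1920 output.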

For best results:

  • Source video should be at least 720p. Lower resolutions upscale badly.
  • Horizontal (16:9) sources are ideal — there's room to crop.
  • Already-vertical sources work, but you'll get less benefit from cropping.

Audio quality

The highlight selector is only as good as the transcript. If your audio is noisy or has long stretches of music, expect weaker clip selections. Podcast-quality audio (clean speech, minimal background) gives the best results.

What we don't support

  • Live streams or HLS playlists — upload the finished file.
  • DRM-protected files.
  • Files with multiple audio tracks — only the default track is transcribed.
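
If ffprobe (part of FFmpeg) is installed, you can check a file against the resolution guidance and the multi-track caveat before uploading. This is a sketch under assumptions: the function names are hypothetical, and treating "720p" as "shorter side of at least 720 pixels" is our interpretation, not a documented rule.

```python
import json
import subprocess


def probe_streams(path):
    """Run ffprobe and return its JSON stream listing as a dict."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def preflight_warnings(probe):
    """Return a list of human-readable warnings for a probed file."""
    streams = probe.get("streams", [])
    audio = [s for s in streams if s.get("codec_type") == "audio"]
    # Skip embedded cover art, which ffprobe reports as a video stream.
    video = [
        s for s in streams
        if s.get("codec_type") == "video"
        and s.get("disposition", {}).get("attached_pic") != 1
    ]
    warnings = []
    if len(audio) > 1:
        warnings.append(
            f"{len(audio)} audio tracks: only the default track is transcribed"
        )
    if video:
        # Assumption: "720p" means the shorter side is at least 720 pixels.
        short_side = min(video[0]["width"], video[0]["height"])
        if short_side < 720:
            warnings.append(f"video is below 720p (shorter side {short_side}px)")
    return warnings


# Usage, assuming ffprobe is on PATH and episode.mp4 exists:
#   for warning in preflight_warnings(probe_streams("episode.mp4")):
#       print(warning)
```

Keeping the checks in a function that takes the parsed JSON, separate from the ffprobe call, makes them easy to test without a media file.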
