The production pipeline, end to end

How a publish-ready video gets produced: five phases, a review gate after each.

By StudioCut Team · Updated 2026-05-16

StudioCut is a production pipeline, not a clip generator. Every video flows through the same five phases — and you sign off at the review gate after each one. The output is a finished, on-brand video you can publish, not raw clips to assemble yourself. Stop at any phase, edit, regenerate a single scene, or hand off to Auto-Approve and let the rest finish overnight.


Phase 1 · Review gate

Planning — the blueprint

Before a single image is rendered, the pipeline writes the full plan and shows it to you. Nothing advances until you approve it.

What gets produced

A written script — narrative arc with hook, body, and CTA
A per-scene breakdown (visual prompt, duration, on-screen text)
A voice-over script with pacing markers
Animation and transition specs
Suggested thumbnail concepts

These are the same artifacts a production team produces (a script and a scene plan), handed to you before any generation cost is incurred.

What you can do at this gate

Edit any scene's prompt or script line
Reorder scenes, split, merge, delete
Change overall tone or length
Lock specific scenes from regeneration
Sign off and proceed to Phase 2
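
As a rough illustration of what the plan could look like as data, here is a minimal sketch. The Scene and Plan classes and their field names are hypothetical, not StudioCut's actual schema; they only mirror the per-scene items listed above (visual prompt, duration, on-screen text) and the lock and reorder actions available at this gate.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    # Hypothetical fields mirroring the per-scene breakdown.
    visual_prompt: str
    duration_s: float
    on_screen_text: str = ""
    locked: bool = False  # locked scenes are skipped on regeneration


@dataclass
class Plan:
    scenes: list[Scene] = field(default_factory=list)

    def total_duration(self) -> float:
        return sum(s.duration_s for s in self.scenes)

    def regenerable(self) -> list[int]:
        # Indices of scenes that a regeneration pass may touch.
        return [i for i, s in enumerate(self.scenes) if not s.locked]

    def move(self, src: int, dst: int) -> None:
        # Reorder: take the scene at src and re-insert it at dst.
        self.scenes.insert(dst, self.scenes.pop(src))


plan = Plan([
    Scene("sunrise over a city skyline", 4.0, "Meet StudioCut"),
    Scene("close-up of a laptop timeline", 6.5),
    Scene("logo on brand background", 3.0, "Try it free", locked=True),
])
plan.move(1, 0)  # put the laptop scene first
```

In this sketch, locking is just a flag that regeneration respects, so an approved scene survives later edits to the rest of the plan.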

Phase 2 · Review gate

Asset Creation — parallel generation

Images, voice-over, music, and thumbnails are generated in parallel workflows. We handle the AI for you: no accounts, no keys, no model juggling. Review every asset before the pipeline moves on.
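
To make the parallel step concrete, here is a minimal sketch using Python's asyncio. The generate function is a stand-in for a provider call and the delays are arbitrary; how StudioCut actually schedules these jobs is not described here, so treat this as one possible shape rather than the implementation.

```python
import asyncio


async def generate(kind: str, delay: float) -> str:
    # Stand-in for a call to an image, voice, or music provider.
    await asyncio.sleep(delay)
    return f"{kind}: ready"


async def create_assets() -> list[str]:
    # gather() runs the four jobs concurrently and returns results
    # in the order the jobs were passed in, not completion order.
    return await asyncio.gather(
        generate("images", 0.03),
        generate("voice-over", 0.02),
        generate("music", 0.01),
        generate("thumbnails", 0.01),
    )


assets = asyncio.run(create_assets())
```

Because the jobs overlap, the total wait is close to the slowest job rather than the sum of all four.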

Images

One per scene, in your chosen style. Regenerate any single image without re-doing the rest.

Voice-over

A real voice-over across every scene. Pick gender, age, accent, and pacing, or use your own voice. Per-scene SSML markup controls emphasis.
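
For readers new to SSML, here is a small sketch of what per-scene markup might look like. The speak, emphasis, and break elements come from the W3C SSML standard; the scene_ssml helper is hypothetical, and which tags StudioCut's voices accept is an assumption.

```python
from xml.sax.saxutils import escape


def scene_ssml(line: str, stress: str, pause_ms: int = 0) -> str:
    # Wrap the first occurrence of `stress` in <emphasis> and
    # optionally add a trailing pause. Text is XML-escaped first.
    text = escape(line)
    word = escape(stress)
    text = text.replace(word, f'<emphasis level="strong">{word}</emphasis>', 1)
    pause = f'<break time="{pause_ms}ms"/>' if pause_ms else ""
    return f"<speak>{text}{pause}</speak>"


ssml = scene_ssml("Render once & publish everywhere.", "once", pause_ms=300)
```

Escaping the script line first keeps characters such as & from breaking the markup.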

Music

Royalty-free track matched to mood and length. Auto-ducked under the voice-over.

Phase 3 · Review gate

Visual Direction — scene-by-scene direction

This is the scene-by-scene direction a production team produces: every scene composed with brand colours, typography, and platform-aware safe zones. Review the direction before it goes to render.

Layouts

Title cards, lower thirds, full-bleed visuals, split screens, picture-in-picture — directed per scene.

Motion

Entrance, emphasis, and exit animations. Subtle Ken Burns on stills. Transitions that match the style.

Phase 4 · Review gate

Rendering — a finished video file

The dedicated renderer service composites every approved asset into a finished MP4 — a publish-ready video, not raw clips. Watch it back before you sign off.

Three aspect ratios

Render 16:9, 9:16, and 1:1 from the same source — layouts adapt automatically.
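
As a worked example of the three ratios, the sketch below computes frame sizes from one shared short edge. The 1080-pixel short side is an assumption for illustration; the actual output resolutions are not stated here.

```python
from fractions import Fraction

RATIOS = {
    "16:9": Fraction(16, 9),
    "9:16": Fraction(9, 16),
    "1:1": Fraction(1, 1),
}


def frame_size(ratio: str, short_side: int = 1080) -> tuple[int, int]:
    # Return (width, height) with the shorter edge fixed at short_side.
    r = RATIOS[ratio]
    if r >= 1:
        return int(short_side * r), short_side
    return short_side, int(short_side / r)


sizes = {name: frame_size(name) for name in RATIOS}
```

Under that assumption, landscape and portrait share the same pixel count and only swap width and height, which is why layouts need to adapt rather than simply scale.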

Audio sync

Voice-over locked to scene timing, music ducked, sound effects layered.

Resilient queue

If a render fails, retry from the failed segment, not from scratch.
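
The idea of resuming from the failed segment can be sketched as follows. The render_all function and segment ids are hypothetical, and flaky_render simulates a renderer that fails once; the real queue is not documented here.

```python
def render_all(segments, render, done=None):
    # `done` maps segment id -> output from an earlier attempt.
    done = dict(done or {})
    for seg in segments:
        if seg in done:
            continue  # already rendered, skip it
        try:
            done[seg] = render(seg)
        except RuntimeError:
            return done, seg  # stop and report the failed segment
    return done, None


calls = []
fail_once = {"s3"}


def flaky_render(seg):
    # Simulated renderer: fails the first time it sees "s3".
    calls.append(seg)
    if seg in fail_once:
        fail_once.discard(seg)
        raise RuntimeError(f"render failed: {seg}")
    return f"{seg}.mp4"


segments = ["s1", "s2", "s3", "s4"]
done, failed = render_all(segments, flaky_render)
done, failed_again = render_all(segments, flaky_render, done)
```

On the second pass only s3 and s4 are rendered, because the outputs of s1 and s2 were kept from the first attempt.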

Phase 5 · Review gate

Publishing Prep — every platform

Metadata, thumbnails, and descriptions tuned per target — the last piece of a publish-ready video. Approve it and the video is ready to post.

Per-platform metadata

YouTube: chapter markers + tags. Shorts/Reels: hashtag pack. TikTok: caption with CTA. LinkedIn: text-first description.

Thumbnails

Three thumbnail variants generated automatically. Pick one, override the text, or upload your own.

Hands-free mode

Auto-Approve — the overnight setting

Toggle once per project. The pipeline then runs end to end without stopping at the review gates, unless a safety check flags something for you.

When to use it

Repeat content (daily news, weekly product highlights), high-volume agency runs, batch translation.

Safety rails

Per-phase confidence checks. Low-confidence scenes are flagged for human review even in Auto mode.
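
A minimal sketch of how such a check could gate a phase is shown below. The 0.8 threshold, the score format, and the gate function are all assumptions for illustration; StudioCut's actual scoring is not described here.

```python
THRESHOLD = 0.8  # hypothetical cut-off, not a documented StudioCut value


def gate(scores: dict[str, float], auto_approve: bool) -> tuple[bool, list[str]]:
    # Returns (advance, flagged). Without Auto-Approve the gate
    # always waits for a human, whatever the scores are.
    flagged = sorted(s for s, c in scores.items() if c < THRESHOLD)
    advance = auto_approve and not flagged
    return advance, flagged


scores = {"scene-1": 0.93, "scene-2": 0.61, "scene-3": 0.88}
advance, flagged = gate(scores, auto_approve=True)
```

Here scene-2 falls below the threshold, so the phase does not advance on its own even though Auto-Approve is on.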

Notifications

Email + in-app when the video is ready, or if a phase needs your eye.

Multi-language fan-out

One source. Twenty-four locales.

Approve the master once. Translations, voice swaps, and burned-in subtitles are then produced in parallel.

Source-of-truth

The English (or any) master is canonical. Localised variants link back so corrections cascade.

Voice matching

Per-locale voice presets. Gender and tone are preserved across languages.

Cost preview

Token and render costs are shown per locale before you commit to the batch.
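
The arithmetic behind a per-locale preview might look like the sketch below. Both rates and the token and minute estimates are placeholders invented for the example; they are not StudioCut pricing.

```python
from decimal import Decimal

# Placeholder rates for illustration only; not StudioCut pricing.
PRICE_PER_1K_TOKENS = Decimal("0.02")
PRICE_PER_RENDER_MIN = Decimal("0.50")


def locale_cost(tokens: int, render_minutes: Decimal) -> Decimal:
    # Cost = token usage at the per-1k rate + render time at the per-minute rate.
    token_cost = Decimal(tokens) / 1000 * PRICE_PER_1K_TOKENS
    return token_cost + render_minutes * PRICE_PER_RENDER_MIN


# Hypothetical estimates: (tokens, render minutes) per locale.
estimates = {
    "de-DE": (12_000, Decimal("2.0")),
    "ja-JP": (15_000, Decimal("2.5")),
}
preview = {loc: locale_cost(t, m) for loc, (t, m) in estimates.items()}
batch_total = sum(preview.values(), Decimal("0"))
```

Decimal is used instead of float so that the per-locale figures add up exactly to the batch total.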

Architecture

Modular pipeline: orchestration + renderer + AI providers

Self-host the full stack or run alongside your existing systems. Every asset the pipeline produces (script, voice-over, scenes, the finished video) is yours. No vendor lock-in.

Core platform: models, controllers, UI, billing, audit log.
Workflow orchestration: coordinates Phase 2 asset generation across AI providers with retry and fan-out.
Renderer service: standalone compositing service — runs on your hardware or ours.
Object storage: any S3-compatible store. Mode-switched via settings.
AI providers: bring your own keys for any supported AI provider. Vendor failover built-in.
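
Retry with vendor failover, as named in the list above, can be sketched like this. The with_failover function, the provider names, and the choice of ConnectionError as the retryable error are assumptions; the orchestration layer's real interface is not shown here.

```python
def with_failover(providers, prompt, retries=2):
    # Try each provider in order; retry a provider up to `retries`
    # times before moving on to the next one.
    errors = []
    for name, call in providers:
        for attempt in range(1, retries + 1):
            try:
                return name, call(prompt)
            except ConnectionError as exc:
                errors.append(f"{name} attempt {attempt}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def primary(prompt):
    # Simulated provider that is always unreachable.
    raise ConnectionError("timeout")


def backup(prompt):
    return f"image for {prompt!r}"


used, result = with_failover(
    [("primary", primary), ("backup", backup)], "red bicycle"
)
```

The caller gets back which provider served the request, which is useful for an audit log.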

How the AI video pipeline works — questions

What are the five phases of AI video generation?

The five phases are Planning, Asset Creation, Visual Direction, Rendering, and Publishing Prep. Each phase has a human review gate, so you can approve or edit the output before the next phase starts.

Can I edit the AI video before it renders?

Yes. At every review gate you can edit the script, regenerate a single scene, swap an image, or adjust visual direction — without re-running the phases you already approved.

What is Auto-Approve mode?

Auto-Approve runs the pipeline hands-free, advancing through every phase without waiting for manual sign-off. Per-phase confidence checks still flag low-confidence scenes for human review, so you can intervene if needed.

What happens if a render fails partway through?

You resume from the failed segment, not from scratch. Completed phases and rendered scenes are preserved, so a failure costs you one segment rather than the whole video.

Ready to produce a video?

Start with a free account and run your first script, voice-over, and scene plan through the pipeline. You approve every phase.

Get Started Free