How Pitchstage works
A step-by-step guide · For the machine-readable API spec see the headless API reference.
Pitchstage turns your product into a launch kit. Point it at your app — paste a URL or drop a handful of screenshots — add one paragraph about what you shipped, and an AI director studies it, writes an editable outline in your voice, binds camera moves to every line, and renders a narrated demo video, a LinkedIn carousel, and ready-to-post launch copy. Use it in the app (no setup) or drive it from an agent, a terminal, or CI.
1. The mental model
project → outline → scenes → artifacts
- Project — your launch: the product input (URL or screenshots) + a goal.
- Outline — the director’s draft script: a hook, beats, and a call-to-action. Fully editable — it’s the single source the artifacts render from.
- Scenes — each screenshot becomes a scene with narration + a camera move.
- Artifacts — what ships: a demo video (16:9 / 9:16 / 1:1), a carousel, and copy.
Everything downstream comes from the outline, so editing the outline (or a scene) re-shapes the video and carousel together.
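To make the chain concrete, here is a minimal sketch of the four concepts as plain data. The class and field names are illustrative assumptions, not Pitchstage's actual schema; only the concepts themselves (project, outline, scenes, artifacts) come from this guide.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Scene:
    """One screenshot with its narration and camera move."""
    screenshot: str
    narration: str
    motion: str = "kenburns"  # listed in this guide as the default for a calm screen


@dataclass
class Outline:
    """The director's draft script: a hook, beats, and a call-to-action."""
    hook: str
    beats: list[str]
    call_to_action: str


@dataclass
class Project:
    """A launch: the product input (URL or screenshots) plus a goal."""
    goal: str
    product_input: str
    outline: Optional[Outline] = None
    scenes: list[Scene] = field(default_factory=list)
    artifacts: list[str] = field(default_factory=list)  # hypothetical artifact labels


project = Project(goal="Announce the new inbox", product_input="https://example.com/app")
project.outline = Outline(
    hook="Triage your inbox in one pass.",
    beats=["Inbox", "Settings"],
    call_to_action="Try it today.",
)
project.scenes.append(Scene(screenshot="Inbox/a.png", narration="Here is the new inbox."))
project.artifacts = ["demo_video", "carousel", "launch_copy"]
```

Because the scenes and artifacts hang off the same project as the outline, a change to the outline is the natural place for everything downstream to pick up its new shape.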
2. Make your first launch (in the app)
No install, no API key. Roughly two minutes of input, then a render.
- New project → pick a launch type. Feature drop, major release, funding milestone, Product Hunt launch, changelog, or custom — this sets the narrative shape.
- Add your product. Paste your app URL and the director drives it like a person, capturing each screen; or drop 3–20 screenshots (PNG/JPG/HEIC, ≤10 MB each). Behind a login? See “behind a login” below.
- Describe what you shipped — one paragraph. This steers the outline; you can edit or re-draft it.
- Review the outline. The director drafts the hook, the beats (one per feature), and the CTA. Edit any line — this is your script.
- Brand + voice (optional). A brand kit (colors, font) and a voice profile (paste a few of your posts; it learns your tone) make the output look and sound like you. There’s always a sensible default, so you can skip this.
- Pick your artifacts and per-artifact options (aspect ratio, captions, music, hook and CTA cards, smart-zoom, transitions).
- Refine scenes (optional). Per scene you can tune narration, duration, importance, the camera move, an on-screen pull-quote, and the layout.
- Render. Watch progress live (~30–90s for a real flow); download the video, carousel, and copy when it’s done.
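If you collect screenshots with a script, a pre-flight check can catch problems before you upload. The sketch below applies the limits from the "Add your product" step (3–20 files, PNG/JPG/HEIC, at most 10 MB each). It is a local helper, not part of Pitchstage; treating `.jpeg` as JPG and reading "10 MB" as 10 MiB are assumptions.

```python
from pathlib import Path

ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".heic"}  # assumption: .jpeg counts as JPG
MAX_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB
MIN_COUNT, MAX_COUNT = 3, 20


def check_screenshots(paths: list[Path]) -> list[str]:
    """Return human-readable problems; an empty list means the set looks uploadable."""
    problems = []
    if not MIN_COUNT <= len(paths) <= MAX_COUNT:
        problems.append(f"expected {MIN_COUNT}-{MAX_COUNT} screenshots, got {len(paths)}")
    for path in paths:
        if path.suffix.lower() not in ALLOWED_SUFFIXES:
            problems.append(f"{path.name}: unsupported format {path.suffix or '(none)'}")
        elif path.stat().st_size > MAX_BYTES:
            problems.append(f"{path.name}: larger than 10 MB")
    return problems
```

Call it with the files you plan to drop, for example `check_screenshots(sorted(Path("screenshots").glob("*")))`, and fix anything it reports first.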
3. What the director can do
The director isn’t a template filler — it makes editorial decisions and reconciles incoherent input (and tells you what it changed). Its toolkit:
Narrative templates
feature_drop · major_release · funding_milestone · product_hunt_launch · changelog · custom — each sets a different pace, opening, and CTA.
Adaptive openings
The director picks how to open for your audience: end_state_first (lead with the outcome), cold_open_problem (lead with the pain), straight_to_product (jump in — good for developer audiences), or brief_intro.
Motion (per scene)
| Motion | When the director uses it |
|---|---|
| kenburns | gentle eased zoom; the default for a calm screen |
| punch_in | hard cut to tighter framing, for clicks / state changes |
| spotlight | pushes into an AI-detected focus region |
| cursor | follows a cursor movement through the UI |
| callouts | annotations / highlights on the screen |
| slow_pan | eased pan across a wide screen or long list |
| static_hold | no motion, for text-heavy screens |
| none | a plain still |
And
- Hero emphasis — your one differentiating feature gets extra screen-time, words, and motion.
- Smart zoom — pushes into the key element as the narration reaches it.
- Transitions — hard cuts or crossfades.
- Kinetic text — a ≤6-word pull-quote on screen, separate from the spoken narration.
- Portrait layout (9:16) — a phone-frame split or a blurred backdrop.
- Music — none, a library track, or your own upload, auto-ducked under narration.
- Captions: karaoke (word-by-word), plain, sparse, or off.
- Your voice + brand: narration and copy in your tone; your colors and font on every frame.
4. What you can configure (the plan)
Every render is described by one resolved plan. It’s tiered: stick to intent for the 80% case; reach into director and scenes for fine control. The app edits these for you; agents send them as JSON.
Tier 1 — intent (the 80%)
| Field | Meaning |
|---|---|
| artifactType | demo_video_landscape · linkedin_carousel · launch_copy |
| goal | what this launch should achieve (free text) |
| audience | founders · developers · enterprise · consumer · existing_users · waitlist |
| tone | warm · direct · playful · technical · urgent |
| aspect | 16:9 · 9:16 · 1:1 |
| targetLengthS | desired length in seconds (10–600, or auto) |
| captionStyle | karaoke · plain · sparse · off |
| brandKitId / voiceProfileId | use a saved brand kit / voice |
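As an illustration, a Tier 1 intent could be assembled and sanity-checked locally before an agent sends it. The allowed values below are copied from the table; the flat object shape, the `"auto"` string encoding, and treating every field as optional are assumptions, so check the headless API reference for the real contract.

```python
import json

# Allowed values copied from the Tier 1 table.
ENUMS = {
    "artifactType": {"demo_video_landscape", "linkedin_carousel", "launch_copy"},
    "audience": {"founders", "developers", "enterprise", "consumer", "existing_users", "waitlist"},
    "tone": {"warm", "direct", "playful", "technical", "urgent"},
    "aspect": {"16:9", "9:16", "1:1"},
    "captionStyle": {"karaoke", "plain", "sparse", "off"},
}


def validate_intent(intent: dict) -> list[str]:
    """Return a list of problems with a Tier 1 intent; empty means it looks valid."""
    errors = []
    for name, allowed in ENUMS.items():
        value = intent.get(name)
        if value is not None and value not in allowed:
            errors.append(f"{name}: {value!r} is not one of {sorted(allowed)}")
    length = intent.get("targetLengthS")
    if length is not None and length != "auto":  # assumption: auto is sent as a string
        if not isinstance(length, int) or not 10 <= length <= 600:
            errors.append("targetLengthS: expected 10-600 or 'auto'")
    return errors


intent = {
    "artifactType": "demo_video_landscape",
    "goal": "30s launch teaser for developers",
    "audience": "developers",
    "tone": "direct",
    "aspect": "16:9",
    "targetLengthS": 30,
    "captionStyle": "karaoke",
}
print(json.dumps(intent, indent=2))
```

Checking locally is only a convenience: the director still reconciles whatever it receives.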
Tier 2 — director (stable)
| Field | Meaning |
|---|---|
| outline | hook, problem, solution, beats[], callToAction, close |
| hook / cta | enable + text for the opening/closing cards |
| music | source (none/library/upload), track, gain, ducking |
| smartZoom | push into the key element as narration reaches it |
| transitions | none · crossfade |
| portraitLayout | split · blur (9:16 only) |
| narrativeTemplate | one of the six launch types |
| openingStyle | the adaptive opening (director-decided) |
| heroBeatIdx | which beat gets the hero treatment |
| brand / voice | resolved colors/font + voice attributes & style rules |
Tier 3 — scenes (advanced, no stability promise)
Per-scene overrides: durationS, narrationText, emotion, importance, kineticText, emphasisPhrase, motionPlan, layout (full_bleed / device_frame / split / collage / slide), and more.
The director reconciles — it never silently obeys
Send something incoherent and it fixes it, returning a warnings[] note so you know:
| Warning | What it means |
|---|---|
| TARGET_LENGTH_CLAMPED | your length was raised to the floor for the scene count |
| LAYOUT_IGNORED_LANDSCAPE | a portrait layout was ignored on a non-9:16 aspect |
| MUSIC_DROPPED_NO_URL | an upload track with no file → music dropped to none |
| HERO_BEAT_CLAMPED | the hero beat index was clamped into range |
| KINETIC_TRUNCATED | a pull-quote over 6 words was trimmed |
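The reconciliation rules above can be pictured as a small pure function. This is a sketch of the documented behavior, not Pitchstage's implementation: the plan layout (a top-level `scenes` list, `outline.beats`), the `music.url` field, and the per-scene length floor are all hypothetical.

```python
from copy import deepcopy

MAX_KINETIC_WORDS = 6      # from this guide: pull-quotes are at most 6 words
MIN_SECONDS_PER_SCENE = 2  # hypothetical floor; the real formula is not documented here


def reconcile(plan: dict) -> tuple[dict, list[str]]:
    """Return a coherent copy of `plan` plus the warning codes that were triggered."""
    plan = deepcopy(plan)
    warnings = []
    scenes = plan.get("scenes", [])

    # Raise a too-short target length to the floor for the scene count.
    floor = MIN_SECONDS_PER_SCENE * len(scenes)
    length = plan.get("targetLengthS")
    if isinstance(length, int) and length < floor:
        plan["targetLengthS"] = floor
        warnings.append("TARGET_LENGTH_CLAMPED")

    # Portrait layouts only apply to the 9:16 aspect.
    if plan.get("portraitLayout") and plan.get("aspect") != "9:16":
        plan.pop("portraitLayout")
        warnings.append("LAYOUT_IGNORED_LANDSCAPE")

    # An upload track without a file cannot be played (`url` is a hypothetical field).
    music = plan.get("music") or {}
    if music.get("source") == "upload" and not music.get("url"):
        plan["music"] = {"source": "none"}
        warnings.append("MUSIC_DROPPED_NO_URL")

    # Clamp the hero beat index into the range of available beats.
    beats = plan.get("outline", {}).get("beats", [])
    hero = plan.get("heroBeatIdx")
    if hero is not None and beats:
        clamped = min(max(hero, 0), len(beats) - 1)
        if clamped != hero:
            plan["heroBeatIdx"] = clamped
            warnings.append("HERO_BEAT_CLAMPED")

    # Trim pull-quotes longer than six words; report the warning once.
    for scene in scenes:
        words = (scene.get("kineticText") or "").split()
        if len(words) > MAX_KINETIC_WORDS:
            scene["kineticText"] = " ".join(words[:MAX_KINETIC_WORDS])
            if "KINETIC_TRUNCATED" not in warnings:
                warnings.append("KINETIC_TRUNCATED")

    return plan, warnings
```

The point of the design is that the caller always gets back a usable plan and a list of what changed, so an agent can read `warnings[]` and decide whether to adjust its request.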
5. Drive it with agents, the CLI, or the API
The same engine is fully headless — point an AI agent, a terminal, or CI at it. It’s a two-phase loop: a cheap plan (resolved config + preview frames + a fit-score, no render cost) that you iterate on, then one billable render.
First: mint an API key
In the app go to Settings → API keys → Create key. Choose scopes — plan (read-only, free) and/or render (billable) — and a monthly spend cap. The raw key pk_live_… is shown once. Defaults per key: 60 requests/min and a $5/mo spend cap.
A. From an AI agent (MCP — Claude, Cursor)
- Mint a render-scope key (above).
- Add the Pitchstage MCP server to your client config and paste your key:
    {
      "mcpServers": {
        "pitchstage": {
          "command": "npx",
          "args": ["-y", "@pitchstage/mcp"],
          "env": { "PITCHSTAGE_API_KEY": "pk_live_…" }
        }
      }
    }
Restart the client. The agent now has five tools:
| Tool | What it does |
|---|---|
| pitchstage_create_project | create a project from a goal → presigned upload URLs |
| pitchstage_plan | run the director without rendering → resolved plan + fit-score + preview frames (as inline images the agent can SEE) |
| pitchstage_render | commit a plan → fork a variant + enqueue the render |
| pitchstage_get_render | inspect a render: status, outputs, transcript, poster, plan |
| pitchstage_list_variants | list a project’s variants |
A natural agent loop: create_project → upload screenshots → plan → look at the preview frames + fit-score, tweak the plan, plan again → render → get_render. Because plan returns the frames as images, the agent can judge and improve the result before spending a render. (Optional base URL override: PITCHSTAGE_API_URL, default https://pitchstage.ai/api.)
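The loop above can be sketched as a small driver that iterates on the cheap plan step and commits exactly one render. The three callables stand in for whatever wraps the MCP tools or HTTP calls; the `fitScore` and `plan` keys, the 0.8 threshold, and the round limit are hypothetical, since this guide does not state the fit-score scale or the response shape.

```python
from typing import Callable

FIT_THRESHOLD = 0.8  # hypothetical: the fit-score scale is not documented here
MAX_PLAN_ROUNDS = 5  # hypothetical cap so the loop always terminates


def plan_then_render(
    plan_fn: Callable[[dict], dict],
    tweak_fn: Callable[[dict, dict], dict],
    render_fn: Callable[[dict], str],
    config: dict,
) -> str:
    """Iterate on the plan step, then commit exactly one billable render."""
    result = plan_fn(config)
    for _ in range(MAX_PLAN_ROUNDS - 1):
        if result["fitScore"] >= FIT_THRESHOLD:
            break
        # Let the agent adjust the config based on the last plan result.
        config = tweak_fn(config, result)
        result = plan_fn(config)
    return render_fn(result["plan"])
```

Keeping `render_fn` outside the loop is the whole design: however many times the agent re-plans, it spends at most one render.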
B. From the terminal / CI (CLI)
    npm i -g pitchstage   # or: pnpm add -g pitchstage
    export PITCHSTAGE_API_KEY=pk_live_…
Point it at a folder of screenshots. Each subfolder is a feature (one outline beat); loose images become an “Overview”:
    screenshots/
      Inbox/      a.png  b.png   → feature "Inbox"
      Settings/   c.jpg          → feature "Settings"
      cover.png                  → "Overview"
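To preview how a folder would be grouped before running the CLI, the convention can be mimicked locally. This is an approximation under stated assumptions, not the CLI's own logic: it only looks one level deep, and it assumes the CLI accepts the same image formats as app uploads.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".heic"}  # assumption: same formats as app uploads


def group_features(root: Path) -> dict[str, list[str]]:
    """Map each feature name to its image files, mirroring the folder convention."""
    features: dict[str, list[str]] = {}
    for entry in sorted(root.iterdir()):
        if entry.is_dir():
            # Each subfolder is one feature, which becomes one outline beat.
            images = sorted(p.name for p in entry.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)
            if images:
                features[entry.name] = images
        elif entry.suffix.lower() in IMAGE_SUFFIXES:
            # Loose images at the top level are grouped under "Overview".
            features.setdefault("Overview", []).append(entry.name)
    return features
```

For the layout shown above, `group_features(Path("screenshots"))` would yield three entries: "Inbox", "Settings", and "Overview".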
| Command | What it does |
|---|---|
| pitchstage plan <dir> --goal "…" | ingest the folder, run a plan, print the resolved plan + fit-score |
| pitchstage render <dir> --goal "…" | plan then commit a render; polls to completion |
| pitchstage render … --watch | re-plan on file changes; keeps a demo in sync in CI |
| pitchstage get <renderId> | print a render’s machine-readable result |
| pitchstage keys | show the configured key + base URL |
    pitchstage render ./screenshots --goal "30s launch teaser for developers"
C. Straight HTTP (the API)
Prefer raw requests? Every endpoint, the resolved-plan contract, error codes, and limits are in the headless API reference (the machine-readable spec is at /api/v1/openapi.json).
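If you do call the API directly, keep in mind that keys are rate-limited (60 requests/min by default) and that rate-limited responses include a Retry-After header. The helper below is a transport-agnostic sketch of honoring that header. It assumes rate limiting is signaled with HTTP 429 and that Retry-After carries a number of seconds; neither detail is confirmed in this guide, and the `send` callable is a placeholder for your own request code.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it is not rate-limited, waiting as Retry-After instructs."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        # Assumption: 429 means rate-limited. Give up after the last attempt.
        if status != 429 or attempt == max_attempts - 1:
            break
        # Only the delay-in-seconds form is handled; fall back to 1 second otherwise.
        raw = headers.get("Retry-After", "")
        sleep(float(raw) if raw.isdigit() else 1.0)
    return status, headers, body
```

Passing `sleep` as a parameter keeps the helper easy to exercise in CI without real waiting.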
Q&A
Do I have to write a script?
My app is behind a login — how do I capture it?
Will it sound like me?
How long does a render take?
Is planning free?
The plan phase is read-only and cheap: it runs the director and returns preview frames + a fit-score with no render. You only pay when you render. That’s what keeps iterating (especially for agents) inexpensive.
Can an AI agent run the whole thing end to end?
Yes. An agent can plan, judge the result, refine the plan, and only then render.
How do I keep a demo video in sync with my product?
Run pitchstage render ./screenshots --goal "…" --watch in CI; it re-plans when the screenshots change.
Can I tweak one part without redoing everything?
What formats do I get?
A demo video in 16:9, 9:16, or 1:1; a LinkedIn carousel; and ready-to-post launch copy (LinkedIn, X, email).
Will it post something wrong or off-brand under my name?
What are the limits / costs for the API?
Each key defaults to 60 requests/min and a $5/mo spend cap. A plan-scope key can’t trigger a billable render. Rate-limited responses include a Retry-After header.
Is my uploaded data / login session stored?
Still stuck? Open the API reference or head to your dashboard and start a project.