
How Pitchstage works

A step-by-step guide · For the machine-readable API spec see the headless API reference.

Pitchstage turns your product into a launch kit. Point it at your app — paste a URL or drop a handful of screenshots — add one paragraph about what you shipped, and an AI director studies it, writes an editable outline in your voice, binds camera moves to every line, and renders a narrated demo video, a LinkedIn carousel, and ready-to-post launch copy. Use it in the app (no setup) or drive it from an agent, a terminal, or CI.

1. The mental model

project  →  outline  →  scenes  →  artifacts

Everything downstream comes from the outline, so editing the outline (or a scene) re-shapes the video and carousel together.
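
To make the chain concrete, here is a minimal Python sketch of the idea that every artifact is derived from the same outline. The names (Outline, build_scenes, build_artifacts) and the "one scene per outline line" rule are assumptions made for illustration; they are not Pitchstage's internal types.

```python
from dataclasses import dataclass, field


@dataclass
class Outline:
    # Hypothetical shape: a hook, one beat per feature, and a call to action.
    hook: str
    beats: list[str] = field(default_factory=list)
    call_to_action: str = ""


def build_scenes(outline: Outline) -> list[str]:
    """Derive one scene per outline line: the hook, each beat, then the CTA."""
    return [outline.hook, *outline.beats, outline.call_to_action]


def build_artifacts(outline: Outline) -> dict[str, list[str]]:
    """Both artifacts are built from the same scenes, so they stay in step."""
    scenes = build_scenes(outline)
    return {
        "demo_video": [f"narrate: {s}" for s in scenes],
        "carousel": [f"slide: {s}" for s in scenes],
    }


outline = Outline(
    hook="Inbox zero in one click",
    beats=["Smart filters"],
    call_to_action="Try it today",
)
before = build_artifacts(outline)

# Editing the outline changes the video and the carousel together.
outline.beats.append("Bulk archive")
after = build_artifacts(outline)
print(len(before["demo_video"]), len(after["demo_video"]), len(after["carousel"]))
```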

2. Make your first launch (in the app)

No install, no API key. Roughly two minutes of input, then a render.

  1. New project → pick a launch type. Feature drop, major release, funding milestone, Product Hunt launch, changelog, or custom — this sets the narrative shape.
  2. Add your product. Paste your app URL and the director drives it like a person and captures each screen, or drop 3–20 screenshots (PNG/JPG/HEIC, ≤10 MB each). Behind a login? See “behind a login” below.
  3. Describe what you shipped — one paragraph. This steers the outline; you can edit or re-draft it.
  4. Review the outline. The director drafts the hook, the beats (one per feature), and the CTA. Edit any line — this is your script.
  5. Brand + voice (optional). A brand kit (colors, font) and a voice profile (paste a few of your posts; it learns your tone) make the output look and sound like you. There’s always a sensible default, so you can skip this.
  6. Pick your artifacts and per-artifact options (aspect ratio, captions, music, hook and CTA cards, smart-zoom, transitions).
  7. Refine scenes (optional). Per scene you can tune narration, duration, importance, the camera move, an on-screen pull-quote, and the layout.
  8. Render. Watch progress live (~30–90s for a real flow); download the video, carousel, and copy when it’s done.

3. What the director can do

The director isn’t a template filler — it makes editorial decisions and reconciles incoherent input (and tells you what it changed). Its toolkit:

Narrative templates

feature_drop · major_release · funding_milestone · product_hunt_launch · changelog · custom — each sets a different pace, opening, and CTA.

Adaptive openings

The director picks how to open for your audience: end_state_first (lead with the outcome), cold_open_problem (lead with the pain), straight_to_product (jump in — good for developer audiences), or brief_intro.

Motion (per scene)

Motion       When the director uses it
kenburns     gentle eased zoom; the default for a calm screen
punch_in     hard cut to tighter framing, for clicks / state changes
spotlight    pushes into an AI-detected focus region
cursor       follows a cursor movement through the UI
callouts     annotations / highlights on the screen
slow_pan     eased pan across a wide screen or long list
static_hold  no motion, for text-heavy screens
none         a plain still


4. What you can configure (the plan)

Every render is described by one resolved plan. It’s tiered: stick to intent for the 80% case; reach into director and scenes for fine control. The app edits these for you; agents send them as JSON.

Tier 1 — intent (the 80%)

Field                        Meaning
artifactType                 demo_video_landscape · linkedin_carousel · launch_copy
goal                         what this launch should achieve (free text)
audience                     founders · developers · enterprise · consumer · existing_users · waitlist
tone                         warm · direct · playful · technical · urgent
aspect                       16:9 · 9:16 · 1:1
targetLengthS                desired length in seconds (10–600, or auto)
captionStyle                 karaoke · plain · sparse · off
brandKitId / voiceProfileId  use a saved brand kit / voice
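
Because agents send the plan as JSON, it can help to check an intent object locally before calling plan. The sketch below uses only the field names and allowed values from the table above. The validate_intent helper is hypothetical and not part of any Pitchstage SDK, the literal string "auto" for targetLengthS is an assumption, and how the intent is nested inside the full plan payload is defined by the API reference rather than by this example.

```python
import json

# Allowed values copied from the Tier 1 table.
ALLOWED = {
    "artifactType": {"demo_video_landscape", "linkedin_carousel", "launch_copy"},
    "audience": {"founders", "developers", "enterprise", "consumer", "existing_users", "waitlist"},
    "tone": {"warm", "direct", "playful", "technical", "urgent"},
    "aspect": {"16:9", "9:16", "1:1"},
    "captionStyle": {"karaoke", "plain", "sparse", "off"},
}


def validate_intent(intent: dict) -> list[str]:
    """Return a list of problems; an empty list means the intent looks valid."""
    problems = []
    for name, allowed in ALLOWED.items():
        if name in intent and intent[name] not in allowed:
            problems.append(f"{name}: {intent[name]!r} is not one of {sorted(allowed)}")
    length = intent.get("targetLengthS", "auto")  # assumption: "auto" is sent as a string
    if length != "auto":
        is_whole_number = isinstance(length, int) and not isinstance(length, bool)
        if not is_whole_number or not 10 <= length <= 600:
            problems.append("targetLengthS: expected 10-600 seconds or 'auto'")
    return problems


intent = {
    "artifactType": "demo_video_landscape",
    "goal": "30s launch teaser for developers",
    "audience": "developers",
    "tone": "direct",
    "aspect": "16:9",
    "targetLengthS": 30,
    "captionStyle": "karaoke",
}
good = validate_intent(intent)
bad = validate_intent({"tone": "angry", "targetLengthS": 5})

print(json.dumps(intent, indent=2))  # the JSON form of the intent tier
print(good, bad)
```

A check like this only catches typos early; the director still reconciles whatever it receives.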

Tier 2 — director (stable)

Field              Meaning
outline            hook, problem, solution, beats[], callToAction, close
hook / cta         enable + text for the opening/closing cards
music              source (none/library/upload), track, gain, ducking
smartZoom          push into the key element as narration reaches it
transitions        none · crossfade
portraitLayout     split · blur (9:16 only)
narrativeTemplate  one of the six launch types
openingStyle       the adaptive opening (director-decided)
heroBeatIdx        which beat gets the hero treatment
brand / voice      resolved colors/font + voice attributes & style rules

Tier 3 — scenes (advanced, no stability promise)

Per-scene overrides: durationS, narrationText, emotion, importance, kineticText, emphasisPhrase, motionPlan, layout (full_bleed / device_frame / split / collage / slide), and more.

The director reconciles — it never silently obeys

Send something incoherent and it fixes it, returning a warnings[] note so you know:

Warning                   What it means
TARGET_LENGTH_CLAMPED     your length was raised to the floor for the scene count
LAYOUT_IGNORED_LANDSCAPE  a portrait layout was ignored on a non-9:16 aspect
MUSIC_DROPPED_NO_URL      an upload track with no file → music dropped to none
HERO_BEAT_CLAMPED         the hero beat index was clamped into range
KINETIC_TRUNCATED         a pull-quote over 6 words was trimmed
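
The reconcile rules in the table can be read as a small pure function: take a plan, return a corrected copy plus the warning codes. The Python below is an illustrative sketch of that behaviour, not the director's actual implementation. The flat plan shape, the music "url" key, and the 3-seconds-per-scene floor are assumptions; only the warning codes, the field names, and the 6-word limit come from this guide.

```python
MIN_SECONDS_PER_SCENE = 3  # assumption: the real floor is not documented here
MAX_KINETIC_WORDS = 6      # from the KINETIC_TRUNCATED row


def reconcile(plan: dict) -> tuple[dict, list[str]]:
    """Return a corrected copy of the plan plus the warning codes raised."""
    fixed = dict(plan)
    fixed["scenes"] = [dict(s) for s in plan.get("scenes", [])]
    warnings = []

    # Raise a too-short target length to the floor for this scene count.
    floor = MIN_SECONDS_PER_SCENE * len(fixed["scenes"])
    length = fixed.get("targetLengthS")
    if isinstance(length, int) and length < floor:
        fixed["targetLengthS"] = floor
        warnings.append("TARGET_LENGTH_CLAMPED")

    # Portrait layouts only apply to 9:16.
    if fixed.get("portraitLayout") and fixed.get("aspect") != "9:16":
        fixed.pop("portraitLayout")
        warnings.append("LAYOUT_IGNORED_LANDSCAPE")

    # An upload track with no file cannot be played.
    music = fixed.get("music") or {}
    if music.get("source") == "upload" and not music.get("url"):
        fixed["music"] = {"source": "none"}
        warnings.append("MUSIC_DROPPED_NO_URL")

    # Clamp the hero beat index into the valid range of beats.
    beats = fixed.get("beats", [])
    idx = fixed.get("heroBeatIdx")
    if beats and isinstance(idx, int) and not 0 <= idx < len(beats):
        fixed["heroBeatIdx"] = min(max(idx, 0), len(beats) - 1)
        warnings.append("HERO_BEAT_CLAMPED")

    # Trim pull-quotes longer than six words.
    for scene in fixed["scenes"]:
        words = scene.get("kineticText", "").split()
        if len(words) > MAX_KINETIC_WORDS:
            scene["kineticText"] = " ".join(words[:MAX_KINETIC_WORDS])
            if "KINETIC_TRUNCATED" not in warnings:
                warnings.append("KINETIC_TRUNCATED")

    return fixed, warnings


plan = {
    "aspect": "16:9",
    "targetLengthS": 10,
    "portraitLayout": "split",
    "music": {"source": "upload"},
    "beats": ["Inbox", "Settings"],
    "heroBeatIdx": 5,
    "scenes": [
        {"kineticText": "one two three four five six seven eight"},
        {"kineticText": "short quote"},
        {},
        {},
    ],
}
fixed, warnings = reconcile(plan)
print(warnings)
```

The input plan is left untouched, which mirrors the guide's point that you are told what changed rather than having it changed silently.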

5. Drive it with agents, the CLI, or the API

The same engine is fully headless — point an AI agent, a terminal, or CI at it. It’s a two-phase loop: a cheap plan (resolved config + preview frames + a fit-score, no render cost) that you iterate on, then one billable render.

First: mint an API key

In the app, go to Settings → API keys → Create key. Choose scopes: plan (read-only, free) and/or render (billable), plus a monthly spend cap. The raw key pk_live_… is shown once. Defaults per key: 60 requests/min and a $5/mo spend cap.

A. From an AI agent (MCP — Claude, Cursor)

  1. Mint a render-scope key (above).
  2. Add the Pitchstage MCP server to your client config and paste your key:
{
  "mcpServers": {
    "pitchstage": {
      "command": "npx",
      "args": ["-y", "@pitchstage/mcp"],
      "env": { "PITCHSTAGE_API_KEY": "pk_live_…" }
    }
  }
}

Restart the client. The agent now has five tools:

Tool                       What it does
pitchstage_create_project  create a project from a goal → presigned upload URLs
pitchstage_plan            run the director without rendering → resolved plan + fit-score + preview frames (as inline images the agent can SEE)
pitchstage_render          commit a plan → fork a variant + enqueue the render
pitchstage_get_render      inspect a render: status, outputs, transcript, poster, plan
pitchstage_list_variants   list a project’s variants

A natural agent loop: create_project → upload screenshots → plan → look at the preview frames + fit-score, tweak the plan, plan again → render → get_render. Because plan returns the frames as images, the agent can judge and improve the result before spending a render. (Optional base URL override: PITCHSTAGE_API_URL, default https://pitchstage.ai/api.)
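
The two-phase loop can be sketched as a small driver that iterates on plan and then commits exactly one render. Everything named here is an assumption for illustration: the run_plan, tweak, and render callables stand in for whatever wraps the MCP tools or HTTP calls, and the "fitScore" key, its 0 to 1 scale, and the 0.8 threshold are hypothetical.

```python
from typing import Callable


def plan_until_good(
    plan: dict,
    run_plan: Callable[[dict], dict],
    tweak: Callable[[dict, dict], dict],
    render: Callable[[dict], str],
    min_fit: float = 0.8,
    max_rounds: int = 5,
) -> str:
    """Iterate on the plan phase, then commit one render.

    Renders the latest plan even if the threshold is never reached,
    so the number of plan calls is bounded by max_rounds.
    """
    result = run_plan(plan)
    for _ in range(max_rounds - 1):
        if result["fitScore"] >= min_fit:
            break
        plan = tweak(plan, result)
        result = run_plan(plan)
    return render(plan)


# Stubs standing in for the real plan / render calls.
calls = {"plan": 0, "render": 0}


def fake_plan(p: dict) -> dict:
    calls["plan"] += 1
    # Pretend shorter videos fit this goal better.
    return {"fitScore": 0.9 if p["targetLengthS"] <= 30 else 0.5}


def fake_tweak(p: dict, result: dict) -> dict:
    return {**p, "targetLengthS": p["targetLengthS"] - 15}


def fake_render(p: dict) -> str:
    calls["render"] += 1
    return f"render-for-{p['targetLengthS']}s"


render_id = plan_until_good({"targetLengthS": 60}, fake_plan, fake_tweak, fake_render)
print(render_id, calls)
```

Passing the calls in as functions keeps the loop testable without a key, and makes it clear that only the final step is billable.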

B. From the terminal / CI (CLI)

npm i -g pitchstage          # or: pnpm add -g pitchstage
export PITCHSTAGE_API_KEY=pk_live_…

Point it at a folder of screenshots. Each subfolder is a feature (one outline beat); loose images become an “Overview”:

screenshots/
  Inbox/      a.png  b.png    →  feature "Inbox"
  Settings/   c.jpg           →  feature "Settings"
  cover.png                   →  "Overview"

Command                             What it does
pitchstage plan <dir> --goal "…"    ingest the folder, run a plan, print the resolved plan + fit-score
pitchstage render <dir> --goal "…"  plan then commit a render; polls to completion
pitchstage render … --watch         re-plan on file changes; keeps a demo in sync in CI
pitchstage get <renderId>           print a render’s machine-readable result
pitchstage keys                     show the configured key + base URL

pitchstage render ./screenshots --goal "30s launch teaser for developers"
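
To preview how a folder will be split into features before running the CLI, the subfolder convention described above can be approximated in a few lines of Python. This is a sketch of the documented rule, not the CLI's own code: the accepted suffixes are assumed from the upload formats listed for the app, and folding nested subfolders into their top-level folder is a guess.

```python
import tempfile
from pathlib import Path

# Assumption: the same image types the app accepts for uploads.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".heic"}


def group_screenshots(root: Path) -> dict[str, list[str]]:
    """Map each subfolder to a feature; loose images go under "Overview"."""
    features: dict[str, list[str]] = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        rel = path.relative_to(root)
        # A top-level image has a single path part; anything deeper
        # belongs to its first folder.
        feature = rel.parts[0] if len(rel.parts) > 1 else "Overview"
        features.setdefault(feature, []).append(rel.as_posix())
    return features


# Rebuild the example layout in a temporary directory and group it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ["Inbox/a.png", "Inbox/b.png", "Settings/c.jpg", "cover.png", "notes.txt"]:
        target = root / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    features = group_screenshots(root)

print(features)
```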

C. Straight HTTP (the API)

Prefer raw requests? Every endpoint, the resolved-plan contract, error codes, and limits are in the headless API reference (the machine-readable spec is at /api/v1/openapi.json).

Q&A

Do I have to write a script?

No. The director drafts the narration and outline from your screenshots + paragraph. You edit what you want and leave the rest.

My app is behind a login — how do I capture it?

Easiest is to drop screenshots of the signed-in screens (you’re already looking at them). To have the director drive your live app, use the in-app “Log in here” hosted browser (nothing to install — Google/SSO and 2FA work, and your session is used once and never stored). A one-click browser extension is coming to the Chrome Web Store.

Will it sound like me?

Set up a voice profile (paste a few of your posts) and the narration + copy are written in your tone. Without one, a clean default voice is used.

How long does a render take?

Usually ~30–90 seconds for a real flow. You can leave the page and come back; agents/CLI poll for you.

Is planning free?

The plan phase is read-only and cheap — it runs the director and returns preview frames + a fit-score with no render. You only pay when you render. That’s what keeps iterating (especially for agents) inexpensive.

Can an AI agent run the whole thing end to end?

Yes — that’s what the MCP server is for. The agent can see the preview frames from plan, judge the result, refine the plan, and only then render.

How do I keep a demo video in sync with my product?

Run pitchstage render ./screenshots --goal "…" --watch in CI — it re-plans when the screenshots change.

Can I tweak one part without redoing everything?

Yes — regenerate is scoped (whole / a single scene / audio / video / caption), so you can refresh just the music or one scene’s narration without a full re-render.

What formats do I get?

A narrated demo video in 16:9, 9:16, or 1:1; a LinkedIn carousel; and ready-to-post launch copy (LinkedIn, X, email).

Will it post something wrong or off-brand under my name?

Nothing publishes automatically — you review and download. The director reconciles incoherent settings and tells you what it changed (see the warnings above), and everything renders in your brand + voice.

What are the limits / costs for the API?

Per key: 60 requests/min and a $5/mo spend cap by default (you set the cap when you create the key). A plan-scope key can’t trigger a billable render. Rate-limited responses include a Retry-After header.
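
If you call the API from your own scripts, a small retry wrapper that honours Retry-After keeps you inside the rate limit. The sketch below is transport-agnostic: send is a hypothetical callable returning (status, headers, body), and treating HTTP 429 as the rate-limited status is an assumption based on the usual convention, not something this guide states.

```python
import time
from typing import Callable

Response = tuple[int, dict[str, str], str]  # (status, headers, body)


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(); on a rate-limited response wait for Retry-After, then retry."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        # Assumption: rate-limited responses use status 429.
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        try:
            delay = float(headers.get("Retry-After", "1"))
        except ValueError:
            # Retry-After may also be an HTTP date; fall back to a short pause.
            delay = 1.0
        sleep(max(delay, 0.0))
    raise ValueError("max_attempts must be at least 1")


# Simulated transport: one rate-limited response, then success.
responses = iter([(429, {"Retry-After": "2"}, ""), (200, {}, "ok")])
waits: list[float] = []
result = call_with_retry(lambda: next(responses), sleep=waits.append)
print(result[0], waits)
```

Injecting sleep makes the wrapper easy to test and lets CI jobs cap the total wait.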

Is my uploaded data / login session stored?

Screenshots and outputs live in your workspace. A login session you provide so the director can capture an app behind a login is used once by the worker and never stored.

Still stuck? Open the API reference or head to your dashboard and start a project.