Documentation
Complete configuration reference for DemoDSL v2.7.0
Overview
DemoDSL is a DSL-driven automated product demo video generator. You describe your demo in a single YAML or JSON configuration file covering browser automation, voice narration, visual effects, video editing, audio mixing, and multi-format export. DemoDSL then orchestrates the full pipeline to produce a polished video.
A configuration file has 10 top-level sections. Only metadata is required; every other section is optional and has sensible defaults.
metadata: # REQUIRED: title, description, author, version
voice: # TTS engine configuration
audio: # Background music, voice processing, effects
device_rendering: # 3D device mockup settings
video: # Intro, outro, transitions, watermark
subtitle: # Subtitle overlay styles and timing
scenarios: # Browser automation steps
pipeline: # Post-processing chain
output: # Export filenames, formats, social presets
analytics: # Engagement tracking
Config Format
DemoDSL accepts both YAML (.yaml / .yml) and JSON (.json) configuration files. The format is auto-detected from the file extension.
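As a rough illustration of extension-based detection, the sketch below maps a file path to a format name. The function name detect_config_format is hypothetical and not part of DemoDSL's API; only the three extensions come from the text above.

```python
from pathlib import Path

# Extensions named in this section; the mapping itself is an illustrative assumption.
_FORMATS = {".yaml": "yaml", ".yml": "yaml", ".json": "json"}


def detect_config_format(path: str) -> str:
    """Return "yaml" or "json" based on the file extension (case-insensitive)."""
    suffix = Path(path).suffix.lower()
    try:
        return _FORMATS[suffix]
    except KeyError:
        # Anything else is rejected rather than guessed.
        raise ValueError(f"unsupported config extension: {suffix!r}") from None
```

Rejecting unknown extensions up front keeps a mistyped filename from being parsed with the wrong loader.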
metadata:
  title: "My Demo"
scenarios:
  - name: "Tour"
    url: "https://example.com"
    steps:
      - action: "navigate"
        url: "https://example.com"

{
  "metadata": {
    "title": "My Demo"
  },
  "scenarios": [{
    "name": "Tour",
    "url": "https://example.com",
    "steps": [{
      "action": "navigate",
      "url": "https://example.com"
    }]
  }]
}
Run demodsl init to generate a YAML template, or demodsl init -o demo.json for JSON.
scenarios:
  - name: "Tab Switching"
    url: "https://fran-cois.github.io/demodsl/"
    browser: "webkit"
    viewport: { width: 1280, height: 720 }
    steps:
      - action: "scroll"
        direction: "down"
        pixels: 1800
        narration: "Scroll to the code example section."
        wait: 2.0
      - action: "click"
        locator:
          type: "text"
          value: "JSON"
        narration: "Click JSON tab to see JSON format."
        wait: 2.5
      - action: "click"
        locator:
          type: "text"
          value: "YAML"
        narration: "Switch back to YAML."
        wait: 2.0
metadata
The only required top-level section. Provides descriptive information about the demo.
| Property | Type | Default | Description |
|---|---|---|---|
| title | string | - | Required. The demo title used in logs and output metadata. |
| description | string \| null | null | Optional description for documentation. |
| author | string \| null | null | Author name. |
| version | string \| null | null | Version string (e.g. "2.0.0"). |
metadata:
  title: "My Demo"
title is the only truly required field in the entire config. Every other section and property has defaults or is optional.
voice
Configures the Text-to-Speech engine used to generate narration audio from the narration field in steps.
| Property | Type | Default | Description |
|---|---|---|---|
| engine | "elevenlabs" \| "google" \| "azure" \| "aws_polly" \| "openai" \| "custom" | "elevenlabs" | TTS provider to use. |
| voice_id | string | "josh" | Voice identifier. Provider-specific. |
| speed | float | 1.0 | Playback speed multiplier (0.5 = half speed, 2.0 = double). |
| pitch | int | 0 | Pitch adjustment in semitones. |
| reference_audio | string | null | Path to a .wav/.mp3 sample of your voice for voice cloning. Supported by: elevenlabs, coqui, cosyvoice, custom. |
voice:
  engine: "elevenlabs"
  voice_id: "josh"
  speed: 1.0
  pitch: 0
Supported Engines
| Engine | Description |
|---|---|
| elevenlabs | High-quality neural TTS. Requires ELEVENLABS_API_KEY. |
| openai | OpenAI TTS (tts-1-hd). Voices: alloy, echo, fable, onyx, nova, shimmer. Requires OPENAI_API_KEY. |
| google | Google Cloud TTS (Wavenet). Requires GOOGLE_APPLICATION_CREDENTIALS (service account JSON path). |
| azure | Azure Cognitive Services Speech (Neural). Requires AZURE_SPEECH_KEY + AZURE_SPEECH_REGION. |
| aws_polly | Amazon Polly (Neural). Requires AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY. |
| cosyvoice | CosyVoice (Alibaba/Qwen). Local server. Set COSYVOICE_API_URL (default localhost:50000). |
| coqui | Coqui XTTS v2. Local inference via the TTS library. Set COQUI_MODEL to override the model. |
| piper | Piper TTS. Fast offline TTS via CLI. Requires PIPER_MODEL (path to .onnx). |
| local_openai | Any OpenAI-compatible local server (vLLM, LocalAI, AllTalk…). Set LOCAL_TTS_URL. |
| espeak | eSpeak-NG: robotic vintage voice. Zero-dependency debug TTS. Set ESPEAK_BIN to override the binary. |
| gtts | Google Translate TTS (gTTS): free, no API key. Install with pip install gtts. |
Voice IDs by Engine
Each engine uses its own voice naming convention. Set voice_id to a valid identifier for your chosen engine:
| Engine | Property | Example value | Description |
|---|---|---|---|
| elevenlabs | voice_id | "josh" | ElevenLabs voice ID. Find IDs at elevenlabs.io/voices. |
| openai | voice_id | "alloy" | One of: alloy, echo, fable, onyx, nova, shimmer. |
| google | voice_id | "en-US-Wavenet-D" | Full voice name (e.g. "en-US-Wavenet-D", "fr-FR-Wavenet-A"). |
| azure | voice_id | "en-US-JennyNeural" | Full voice name. Must contain "Neural" for neural voices. |
| aws_polly | voice_id | "Matthew" | Polly voice name (capitalized). E.g. "Joanna", "Matthew", "Léa". |
| cosyvoice | voice_id | "中文女" | Speaker name supported by your CosyVoice model. |
| coqui | voice_id | "speaker.wav" | Path to a reference .wav for voice cloning, or a built-in speaker name. |
| piper | voice_id | "en_US-lessac-medium.onnx" | .onnx model path, or same as PIPER_MODEL. |
| local_openai | voice_id | "alloy" | Voice name supported by your local server. |
| espeak | voice_id | "en" | eSpeak voice/language code. E.g. "en", "fr", "de", "en+whisper". |
| gtts | voice_id | "en" | Language code (ISO 639-1). E.g. "en", "fr", "es", "ja". |
| custom | voice_id | "default" | Any string. Passed as-is in the JSON body to your endpoint. |
voice:
  engine: "gtts"
  voice_id: "en"
  speed: 1.0
scenarios:
  - name: "Narrated Tour"
    steps:
      - action: "navigate"
        url: "https://fran-cois.github.io/demodsl/"
        narration: >
          Welcome to DemoDSL. Every step can include
          a narration field converted to speech.
        wait: 3.0
      - action: "scroll"
        direction: "down"
        pixels: 600
        narration: >
          DemoDSL supports twelve voice engines,
          from ElevenLabs to local Piper and eSpeak.
        wait: 3.0

voice:
  engine: "custom"
  voice_id: "my-voice"
  speed: 1.0
# Environment variables:
# CUSTOM_TTS_URL=https://my-tts-server.com/synthesize
# CUSTOM_TTS_API_KEY=sk-... (optional)
# CUSTOM_TTS_RESPONSE_FORMAT=mp3 (mp3 or wav)
The custom engine POSTs a JSON body {text, voice_id, speed, pitch} to your endpoint and expects raw audio bytes in the response. This lets you integrate any TTS service with a simple HTTP wrapper.
Voice Cloning (reference_audio)
Set reference_audio to a path to your own voice recording (.wav or .mp3) and DemoDSL will clone your voice on engines that support it. This way, the narration uses your voice instead of a stock voice.
| Engine | Supported | Method | Notes |
|---|---|---|---|
| elevenlabs | Yes | Instant Voice Cloning | Uploads your sample via the Add Voice API. The cloned voice is cached for the session. |
| coqui | Yes | XTTS v2 speaker_wav | Passes reference audio directly to tts_to_file(speaker_wav=...). Zero-shot cloning. |
| cosyvoice | Yes | Zero-shot mode | Sends base64-encoded reference audio with mode="zero_shot" in the API payload. |
| custom | Yes | Forwarded in JSON | Adds a base64-encoded reference_audio field to the JSON payload for your endpoint. |
| openai | No | Not supported | OpenAI TTS does not support voice cloning. |
| google | No | Not supported | Google Cloud TTS does not support voice cloning. |
| azure | No | Not supported | Azure TTS does not support voice cloning. |
| aws_polly | No | Not supported | Amazon Polly does not support voice cloning. |
| piper | No | Not supported | Piper uses pre-trained .onnx models. |
| espeak | No | Not supported | eSpeak is a formant synthesizer. |
| gtts | No | Not supported | gTTS uses Google Translate voices. |
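For the custom engine, the request body can be pictured as the {text, voice_id, speed, pitch} object described earlier, plus a base64-encoded reference_audio field when cloning is requested. The sketch below is an illustration under that assumption: build_custom_tts_payload is a hypothetical helper, and the exact serialization DemoDSL uses is not documented here.

```python
import base64
import json
from typing import Optional


def build_custom_tts_payload(
    text: str,
    voice_id: str = "default",
    speed: float = 1.0,
    pitch: int = 0,
    reference_audio: Optional[bytes] = None,
) -> str:
    """Serialize the JSON body a custom TTS endpoint might receive (assumed layout)."""
    body = {"text": text, "voice_id": voice_id, "speed": speed, "pitch": pitch}
    if reference_audio is not None:
        # Raw audio bytes are not valid JSON, so they travel as base64 text.
        body["reference_audio"] = base64.b64encode(reference_audio).decode("ascii")
    return json.dumps(body)
```

An endpoint written against this shape would decode reference_audio with base64 before passing it to its own cloning routine, then return raw audio bytes.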
voice:
  engine: "coqui"
  voice_id: "default"
  reference_audio: "samples/my_voice.wav"
  speed: 1.0

voice:
  engine: "elevenlabs"
  voice_id: "josh" # fallback if cloning fails
  reference_audio: "samples/my_voice.wav"
  speed: 1.0
If reference_audio is set on an unsupported engine, a warning is logged and the field is ignored. The narration is still generated using the standard voice_id.
audio
Controls background music, voice processing, and audio effects applied during the mix_audio pipeline stage.
audio.background_music
| Property | Type | Default | Description |
|---|---|---|---|
| file | string | - | Required. Path to the audio file (MP3, WAV, OGG). |
| volume | float | 0.3 | Base volume (0.0 to 1.0). Converted to dB internally. |
| ducking_mode | "none" \| "light" \| "moderate" \| "heavy" | "moderate" | Volume reduction during narration. |
| loop | bool | true | Loop the music to cover the entire video duration. |
Ducking modes control how much the background music volume drops when narration is playing:
| Mode | Attenuation | Description |
|---|---|---|
| none | 0 dB | No ducking; music stays at full volume. |
| light | -6 dB | Subtle reduction. Music still audible. |
| moderate | -12 dB | Balanced. Default for most demos. |
| heavy | -20 dB | Near-silent music during speech. |
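To make the numbers concrete, here is a minimal sketch of how a linear volume and a ducking mode could combine into a music level in dB. The attenuation values come from the table above; the 20·log10 amplitude conversion and the function names are assumptions, since the internal conversion is not specified.

```python
import math

# Attenuation per ducking mode, taken from the table above (in dB).
DUCKING_DB = {"none": 0.0, "light": -6.0, "moderate": -12.0, "heavy": -20.0}


def volume_to_db(volume: float) -> float:
    """Convert a linear amplitude (0.0 to 1.0) to dB, assuming 20 * log10(volume)."""
    if volume <= 0.0:
        return float("-inf")  # silence
    return 20.0 * math.log10(volume)


def music_level_db(volume: float, ducking_mode: str, narrating: bool) -> float:
    """Music level in dB: base volume, lowered by the ducking amount while narrating."""
    base = volume_to_db(volume)
    return base + DUCKING_DB[ducking_mode] if narrating else base
```

Under these assumptions the default volume of 0.3 sits near -10.5 dB, and moderate ducking lowers it by a further 12 dB while narration plays.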
audio.voice_processing
| Property | Type | Default | Description |
|---|---|---|---|
| normalize | bool | true | Normalize audio loudness. |
| target_dbfs | int | -20 | Target loudness in dBFS (decibels relative to full scale). |
| remove_silence | bool | true | Strip leading/trailing silence from clips. |
| silence_threshold | int | -40 | dBFS below which audio is considered silence. |
| enhance_clarity | bool | false | Apply EQ boost to voice presence frequencies. |
| enhance_warmth | bool | false | Apply low-end EQ warmth to voice. |
| noise_reduction | bool | false | Remove background noise from recordings. |
audio.effects
| Property | Type | Default | Description |
|---|---|---|---|
| eq_preset | string \| null | null | EQ preset name (e.g. "podcast", "broadcast"). |
| reverb_preset | string \| null | null | Reverb preset (e.g. "small_room", "hall"). |
| compression | Compression \| null | null | Dynamic range compression settings. |
audio.effects.compression
| Property | Type | Default | Description |
|---|---|---|---|
| threshold | int | -20 | Compression threshold in dB. |
| ratio | float | 3.0 | Compression ratio (e.g. 3.0 = 3:1). |
| attack | int | 5 | Attack time in milliseconds. |
| release | int | 50 | Release time in milliseconds. |
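The threshold and ratio settings can be read as a static gain curve: levels at or below the threshold pass unchanged, and every dB above it is divided by the ratio. The sketch below shows only that textbook curve; compress_db is a hypothetical name, attack and release smoothing are not modelled, and DemoDSL's actual compressor may differ.

```python
def compress_db(level_db: float, threshold: float = -20.0, ratio: float = 3.0) -> float:
    """Static compressor curve in dB (defaults mirror the table above)."""
    if ratio < 1.0:
        raise ValueError("ratio must be >= 1.0")
    if level_db <= threshold:
        return level_db  # below the threshold: untouched
    # Above the threshold: the overshoot is scaled down by the ratio.
    return threshold + (level_db - threshold) / ratio
```

With the defaults, a -8 dB peak overshoots the threshold by 12 dB and comes out 4 dB above it.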
audio:
  background_music:
    file: "audio/bg.mp3"
    volume: 0.3
    ducking_mode: "moderate"
    loop: true
  voice_processing:
    normalize: true
    target_dbfs: -20
    noise_reduction: true
  effects:
    eq_preset: "podcast"
    reverb_preset: "small_room"
    compression:
      threshold: -20
      ratio: 3.0
      attack: 5
      release: 50
device_rendering (Beta)
Wraps the captured browser video inside a 3D device mockup frame, processed during the render_device_mockup pipeline stage.
| Property | Type | Default | Description |
|---|---|---|---|
| device | string | "iphone_15_pro" | Device model name. |
| orientation | "portrait" \| "landscape" | "portrait" | Screen orientation. |
| quality | "low" \| "medium" \| "high" | "high" | Render quality level. |
| render_engine | "eevee" \| "cycles" | "eevee" | Blender render engine. Eevee is faster, Cycles is more realistic. |
| camera_animation | string | "orbit_smooth" | Camera movement type around the device. |
| lighting | string | "studio" | Lighting preset. |
device_rendering:
  device: "iphone_15_pro"
  orientation: "portrait"
  quality: "high"
  render_engine: "eevee"
  camera_animation: "orbit_smooth"
  lighting: "studio"
The render_device_mockup pipeline stage is optional. If it fails (e.g. Blender not installed), the pipeline continues with the raw video.
video
Controls video editing: intro/outro sequences, transitions between steps, watermark overlay, and output optimization. Processed during the edit_video pipeline stage.
video.intro
| Property | Type | Default | Description |
|---|---|---|---|
| duration | float | 3.0 | Intro duration in seconds. |
| type | string | "fade_in" | Animation type for the intro. |
| text | string \| null | null | Main title text overlay. |
| subtitle | string \| null | null | Subtitle text below the title. |
| font_size | int | 60 | Font size in pixels. |
| font_color | string | "#FFFFFF" | Font color (hex). |
| background_color | string | "#1a1a1a" | Background color (hex). |
video.transitions
| Property | Type | Default | Description |
|---|---|---|---|
| type | "crossfade" \| "slide" \| "zoom" \| "dissolve" | "crossfade" | Transition style between steps. |
| duration | float | 0.5 | Transition duration in seconds. |
video.watermark
| Property | Type | Default | Description |
|---|---|---|---|
| image | string | - | Required. Path to the watermark image (PNG recommended). |
| position | "top_left" \| "top_right" \| "bottom_left" \| "bottom_right" \| "center" | "bottom_right" | Watermark position on the video. |
| opacity | float | 0.7 | Watermark opacity (0.0 to 1.0). |
| size | int | 100 | Watermark size in pixels (longest side). |
video.outro
| Property | Type | Default | Description |
|---|---|---|---|
| duration | float | 4.0 | Outro duration in seconds. |
| type | string | "fade_out" | Animation type for the outro. |
| text | string \| null | null | Main text overlay. |
| subtitle | string \| null | null | Subtitle text. |
| cta | string \| null | null | Call-to-action text (e.g. "Get Started"). |
video.optimization
| Property | Type | Default | Description |
|---|---|---|---|
| target_size_mb | int \| null | null | Target file size in megabytes. Bitrate is auto-calculated. |
| web_optimized | bool | true | Move moov atom for fast web streaming start. |
| compression_level | "low" \| "balanced" \| "high" | "balanced" | Encoding compression preset. |
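The bitrate derived from target_size_mb can be approximated with the usual size-over-duration arithmetic. The sketch below is one plausible version, not DemoDSL's actual formula: it assumes decimal megabytes, a fixed audio bitrate of 128 kbps, and ignores container overhead; video_bitrate_kbps is a hypothetical name.

```python
def video_bitrate_kbps(
    target_size_mb: float, duration_s: float, audio_kbps: float = 128.0
) -> float:
    """Estimate the video bitrate (kbps) that fits a target file size."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    # 1 MB = 8000 kilobits (decimal units assumed).
    total_kbps = target_size_mb * 8000.0 / duration_s
    video_kbps = total_kbps - audio_kbps
    if video_kbps <= 0:
        raise ValueError("target size is too small for this duration")
    return video_kbps
```

For a 100-second video and a 50 MB target, this leaves roughly 3.9 Mbps for the video stream.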
video:
  intro:
    duration: 3.0
    type: "fade_in"
    text: "Product Name"
    subtitle: "v2.0"
    font_size: 60
    font_color: "#FFFFFF"
    background_color: "#1a1a1a"
  transitions:
    type: "crossfade"
    duration: 0.5
  watermark:
    image: "logo.png"
    position: "bottom_right"
    opacity: 0.7
    size: 100
  outro:
    duration: 4.0
    type: "fade_out"
    text: "Try it today!"
    cta: "Get Started"
  optimization:
    target_size_mb: 50
    web_optimized: true
    compression_level: "balanced"
Recording Quality
DemoDSL uses two recording backends depending on the browser. When browser: "chrome" is set, a high-quality CDP screenshot pipeline captures frames via a direct DevTools Protocol connection, completely bypassing Playwright's low-bitrate VP8 screencast. For WebKit and Firefox, an spp + hqdn3d deblocking filter is applied during export to smooth VP8 artefacts.
| | Native (VP8) | CDP (H.264) |
|---|---|---|
| Recording method | VP8 screencast | CDP screenshots |
| VP8 artefacts | Yes (deblocked) | None |
| File size | ~330 KB | ~70 KB |
| Total time | ~13 s | ~13 s |
| Supported browsers | All | Chromium only |
Set browser: "chrome" in your scenario to automatically use CDP recording; no extra config is needed. WebKit and Firefox fall back to native VP8 with post-processing deblocking.
subtitle
Burns styled subtitles into the video, synced word-by-word to narration timing. Subtitles are generated as ASS files and composited via ffmpeg. Can be set at the top level (applies to all scenarios) or per-scenario.
| Property | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable subtitle overlay. |
| style | "classic" \| "tiktok" \| "color" \| "word_by_word" \| "typewriter" \| "karaoke" \| "bounce" \| "cinema" \| "highlight_line" \| "fade_word" \| "emoji_react" | "classic" | Subtitle display style (see table below). |
| speed | "slow" \| "normal" \| "fast" \| "tiktok" | "normal" | Display speed preset; controls words per second. |
| font_size | int | 48 | Font size in pixels. |
| font_family | string | "Arial" | Font family name. |
| font_color | string | "#FFFFFF" | Primary text color (hex). |
| background_color | string | "rgba(0,0,0,0.6)" | Background fill behind text (hex or rgba). |
| position | "bottom" \| "center" \| "top" | "bottom" | Vertical position on screen. |
| highlight_color | string | "#FFD700" | Accent color for highlighted words. |
| max_words_per_line | int | 8 | Maximum words per subtitle line. |
| animation | "none" \| "fade" \| "pop" \| "slide" | "none" | Text entrance animation. |
Subtitle Styles
Each style preset configures defaults for font size, position, colors, and animation. User values always override the preset.
| Style | Preset defaults | Description |
|---|---|---|
| classic | 42px, bottom, white on dark box | Traditional subtitle bar at the bottom. Clean, readable. |
| tiktok | 64px, center, bold word-by-word | Large centered text, one highlighted word at a time. Social media style. |
| color | 48px, bottom, word highlight | Full line visible, current word changes to accent color. |
| word_by_word | 56px, center, single word | One word at a time, centered. Maximum emphasis. |
| typewriter | 44px, bottom, green on black | Characters appear letter by letter. Terminal/hacker aesthetic. |
| karaoke | 52px, bottom, progressive fill | Words fill with color progressively, karaoke-bar style. |
| bounce | 60px, center, scale animation | Words pop in with a bounce scale effect (120% to 100%). |
| cinema | 38px, bottom, italic serif | Elegant italic serif font with shadow. Film subtitle look. |
| highlight_line | 46px, bottom, dim/bright | Current line is bright white, rest stays dimmed gray. |
| fade_word | 50px, center, fade-in | Each word fades in with a smooth alpha transition. |
| emoji_react | 52px, bottom, emoji prefix | Auto-picks a contextual emoji based on narration keywords. |
Style Demos
Each video below shows a subtitle style in action on short sample narration text.
subtitle:
  style: "classic"
  speed: "normal"
  font_size: 42
  position: "bottom"

subtitle:
  style: "tiktok"
  speed: "fast"
  font_size: 64
  position: "center"
  highlight_color: "#FFD700"

subtitle:
  style: "color"
  speed: "normal"
  highlight_color: "#00FF88"

subtitle:
  style: "word_by_word"
  speed: "normal"
  font_size: 56
  position: "center"

subtitle:
  style: "typewriter"
  font_color: "#00FF00"
  background_color: "rgba(0,0,0,0.8)"

subtitle:
  style: "karaoke"
  highlight_color: "#FF4444"
  position: "bottom"

subtitle:
  style: "bounce"
  font_size: 60
  position: "center"

subtitle:
  style: "cinema"
  font_family: "Georgia"
  font_size: 38

subtitle:
  style: "highlight_line"
  highlight_color: "#FFFFFF"
  font_color: "#888888"

subtitle:
  style: "fade_word"
  font_size: 50
  position: "center"

subtitle:
  style: "emoji_react"
  font_size: 52
  highlight_color: "#FFD700"
Speed Presets
| Preset | Words per second | Description |
|---|---|---|
| slow | 1.5 wps | Slow pace, good for technical content or tutorials. |
| normal | 2.5 wps | Standard reading pace. |
| fast | 4.0 wps | Fast pace for experienced viewers. |
| tiktok | 6.0 wps | Very fast; matches TikTok/Reels pacing. |
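As a rough mental model of the presets, the sketch below splits narration into lines of at most max_words_per_line words and gives each line a display window derived from the words-per-second rate. This is a simplification under stated assumptions: the real renderer syncs words to narration audio, and subtitle_lines is a hypothetical helper, not a DemoDSL function.

```python
from typing import List, Tuple

# Words-per-second rates from the table above.
WORDS_PER_SECOND = {"slow": 1.5, "normal": 2.5, "fast": 4.0, "tiktok": 6.0}


def subtitle_lines(
    text: str, speed: str = "normal", max_words_per_line: int = 8
) -> List[Tuple[float, float, str]]:
    """Return (start_s, end_s, line) cues paced purely by the speed preset."""
    wps = WORDS_PER_SECOND[speed]
    words = text.split()
    cues = []
    start = 0.0
    for i in range(0, len(words), max_words_per_line):
        chunk = words[i:i + max_words_per_line]
        end = start + len(chunk) / wps  # each word gets 1 / wps seconds
        cues.append((start, end, " ".join(chunk)))
        start = end
    return cues
```

Ten words at the normal preset would yield two cues: an eight-word line followed by a two-word line.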
subtitle:
  enabled: true
  style: "tiktok"
  speed: "fast"
  font_size: 64
  highlight_color: "#FFD700"
  position: "center"
scenarios:
  - name: "Demo"
    url: "https://myapp.com"
    steps:
      - action: "navigate"
        url: "https://myapp.com"
        narration: "This text becomes a subtitle!"
pipeline:
  - generate_narration: {}
  - burn_subtitles: {}
  - edit_video: {}

scenarios:
  - name: "Intro"
    url: "https://myapp.com"
    subtitle:
      enabled: true
      style: "cinema"
      speed: "slow"
      font_family: "Georgia"
    steps:
      - action: "navigate"
        url: "https://myapp.com"
        narration: "An elegant introduction."
  - name: "Features"
    subtitle:
      style: "bounce"
      speed: "fast"
    steps:
      - action: "scroll"
        direction: "down"
        pixels: 500
        narration: "Fast-paced feature showcase!"
Add burn_subtitles: {} to your pipeline to enable subtitle rendering. Subtitles are generated from the narration field of each step; no separate subtitle file is needed.
The emoji_react style automatically picks emojis based on narration keywords such as "click", "scroll", "fast" (⚡), and "video", and falls back to a default emoji when no keyword matches.
languages
Generate multi-track audio narration and multi-language subtitles in a single render. The same scenario is recorded once, then narration is synthesised in every requested language and either embedded as additional tracks in the final MP4 or written as sidecar files (narration_{lang}.mp3, subtitles_{lang}.ass).
overview
| Property | Type | Default | Description |
|---|---|---|---|
| default | string | "en" | Source language used by step narration: fields. BCP-47 (e.g. en, fr, en-US). |
| targets | list[string] | [] | Additional languages to render. |
| voices | dict[str, VoiceConfig] | {} | Optional per-language voice override (engine, voice_id, etc.). |
| embed | bool | true | When true, mux all audio + subtitle tracks into a single MP4. When false, write sidecar files next to the output. |
| burn_default | bool | false | When true, also burn the default-language subtitles into the picture (useful for social clips). |
| audio_only | bool | false | Only generate per-language audio tracks (no subtitle tracks). |
| subtitle_only | bool | false | Only generate per-language subtitle tracks (no audio tracks). |
metadata:
  title: Multilang demo
voice:
  engine: gtts
  voice_id: fr
languages:
  default: fr
  targets: [en, de]
  embed: true
  voices:
    en: { engine: gtts, voice_id: en }
    de: { engine: gtts, voice_id: de }
scenarios:
  - name: tour
    url: https://example.com
    steps:
      - narration: Bienvenue sur notre site.
        narrations:
          en: Welcome to our website.
          de: Willkommen auf unserer Website.
        action: scroll
        amount: 400
per-step translations
Each step gains an optional narrations mapping. Keys are BCP-47 language codes; values are the translated narration text. When a translation is missing, the engine falls back to the base narration field, so a partial translation never blocks the render.
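The fallback rule can be sketched in a few lines. narration_for is a hypothetical helper operating on a step parsed into a plain dict; it only illustrates the lookup order described above.

```python
from typing import Optional


def narration_for(step: dict, lang: str) -> Optional[str]:
    """Pick the translated narration for lang, else the base narration field."""
    translations = step.get("narrations") or {}
    return translations.get(lang, step.get("narration"))
```

A step with only an English translation still renders for Japanese, using the base text.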
steps:
  - narration: Cliquez ici pour commencer.
    narrations:
      en: Click here to get started.
      de: Klicken Sie hier, um zu beginnen.
    action: click
    target: "#start"
per-language voices
The languages.voices map lets each language use its own TTS engine, voice id, speed, etc. Any field omitted in the override inherits from the top-level voice block.
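Field-level inheritance amounts to overlaying the per-language override on the top-level voice block. The sketch below assumes both are plain dicts after parsing; resolve_voice is a hypothetical name, not a DemoDSL function.

```python
def resolve_voice(base_voice: dict, overrides: dict, lang: str) -> dict:
    """Merge a per-language override over the top-level voice settings."""
    # Keys present in the override win; everything else is inherited.
    return {**base_voice, **overrides.get(lang, {})}
```

Languages without an entry in languages.voices simply get a copy of the top-level settings.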
voice:
  engine: elevenlabs
  voice_id: french_voice_id
languages:
  default: fr
  targets: [en, ja]
  voices:
    en:
      engine: elevenlabs
      voice_id: english_voice_id
    ja:
      engine: openai
      voice_id: nova
embedded vs sidecar
With embed: true (default), the final MP4 contains one AAC audio track per language (with proper language= metadata, default-disposition on the source language) and one mov_text subtitle track per language. Players such as VLC, QuickTime, YouTube and Vimeo expose them as selectable tracks.
With embed: false, the engine still produces a single MP4 (default-language audio burnt in) plus sidecar files: narration_en.mp3, subtitles_en.ass, and so on for each target language. Useful when uploading to platforms that require external caption files.
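The sidecar naming pattern can be illustrated by listing the files expected for a set of target languages. This is a sketch under assumptions: sidecar_files is a hypothetical helper, and the way audio_only and subtitle_only interact with sidecar output is inferred from their descriptions rather than confirmed.

```python
from pathlib import Path
from typing import List


def sidecar_files(
    output_dir: str,
    targets: List[str],
    audio_only: bool = False,
    subtitle_only: bool = False,
) -> List[Path]:
    """List narration_{lang}.mp3 and subtitles_{lang}.ass for each target language."""
    out = Path(output_dir)
    files: List[Path] = []
    for lang in targets:
        if not subtitle_only:
            files.append(out / f"narration_{lang}.mp3")
        if not audio_only:
            files.append(out / f"subtitles_{lang}.ass")
    return files
```

A listing like this is handy for checking an upload script against the files a render is expected to leave next to the MP4.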
When languages is active, the regular subtitle burn-in is skipped to keep the picture clean; set languages.burn_default: true to re-enable burning of the default-language subtitles.
CLI usage
# Standard render: picks up languages: from the YAML
demodsl run demo_multilang.yaml
# Inspect the planned tracks without rendering
demodsl run demo_multilang.yaml --dry-run
scenarios
A list of browser automation scenarios. Each scenario captures a recording from a web application. Multiple scenarios are concatenated in the final video.
scenarios:
  - name: "Landing Page Overview"
    url: "https://fran-cois.github.io/demodsl/"
    browser: "webkit"
    steps:
      - action: "navigate"
        url: "https://fran-cois.github.io/demodsl/"
        narration: "Scenario one: the landing page."
        wait: 2.0
      - action: "scroll"
        direction: "down"
        pixels: 800
        narration: "Scroll through features."
        wait: 2.0
  - name: "Docs Deep Dive"
    url: "https://fran-cois.github.io/demodsl/docs"
    browser: "webkit"
    steps:
      - action: "navigate"
        url: "https://fran-cois.github.io/demodsl/docs"
        narration: "Scenario two: the docs page."
        wait: 2.0
| Property | Type | Default | Description |
|---|---|---|---|
| name | string | - | Required. Human-readable scenario name. |
| url | string | - | Required. Base URL for the scenario. |
| browser | "chrome" \| "firefox" \| "webkit" | "chrome" | Browser engine (Playwright). |
| viewport | Viewport | 1920×1080 | Browser viewport dimensions. |
| cursor | CursorConfig | null | Visible cursor overlay mode. Shows mouse movement and click effects. |
| glow_select | GlowSelectConfig | null | Apple Intelligence-style animated glow highlight around clicked elements. |
| popup_card | PopupCardConfig | null | Popup card overlay synced with narration. Shows text and progressive item reveals. |
| avatar | AvatarConfig | null | Animated avatar overlay synced with narration audio. Free (animated) or paid (D-ID, HeyGen) providers. |
| subtitle | SubtitleConfig | null | Subtitle overlay config (per-scenario override). Overrides top-level subtitle settings. |
| steps | Step[] | [] | List of automation steps. |
scenarios[].viewport
| Property | Type | Default | Description |
|---|---|---|---|
| width | int | 1920 | Viewport width in pixels. |
| height | int | 1080 | Viewport height in pixels. |
Common viewports: 1920×1080 (Full HD), 1280×720 (HD), 390×844 (iPhone 14), 1024×768 (tablet).
scenarios:
  - name: "Main Demo"
    url: "https://myapp.com"
    browser: "chrome"
    viewport:
      width: 1920
      height: 1080
    steps:
      - action: "navigate"
        url: "https://myapp.com"
scenarios[].cursor
Injects a visible fake cursor overlay captured in the recorded video. The cursor animates towards each target element before click/type actions and plays a visual effect on click.
| Property | Type | Default | Description |
|---|---|---|---|
| visible | bool | true | Whether the cursor is shown. |
| style | "dot" \| "pointer" | "dot" | Cursor shape. Dot = circle, pointer = arrow SVG. |
| color | string | "#ef4444" | Cursor color (hex). |
| size | int | 20 | Cursor size in pixels. |
| click_effect | "ripple" \| "pulse" \| "none" | "ripple" | Visual effect on click. |
| smooth | float | 0.4 | Animation duration in seconds (ease-out). |
scenarios:
  - name: "Cursor Showcase"
    url: "https://fran-cois.github.io/demodsl/"
    browser: "webkit"
    cursor:
      visible: true
      style: "dot"
      color: "#ef4444"
      size: 20
      click_effect: "ripple"
      smooth: 0.4
    steps:
      - action: "click"
        locator:
          type: "text"
          value: "Get Started"
        narration: "Cursor moves to the button and clicks."
        wait: 2.0
      - action: "click"
        locator:
          type: "text"
          value: "Documentation"
        narration: "Smooth animation to each target."
        wait: 2.0
scenarios[].glow_select
Apple Intelligence-style animated gradient glow that highlights elements before click and type actions. The glow pulses with a rotating hue and fades out after the action.
| Property | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Whether glow-select is active. |
| colors | string[] | ["#a855f7","#6366f1","#ec4899","#a855f7"] | Gradient color stops for the glow border. |
| duration | float | 0.8 | Hue rotation cycle duration in seconds. |
| padding | int | 8 | Extra padding around the element bounding box. |
| border_radius | int | 12 | Border radius of the glow overlay. |
| intensity | float | 0.9 | Glow opacity (0 to 1). |
Combine cursor and glow_select for a polished demo experience. The cursor animates into the glowing element, then clicks.
scenarios:
  - name: "Glow Select Showcase"
    url: "https://fran-cois.github.io/demodsl/"
    browser: "webkit"
    cursor:
      style: "dot"
      color: "#a855f7"
    glow_select:
      enabled: true
      colors: ["#a855f7","#6366f1","#ec4899","#a855f7"]
      duration: 0.8
      padding: 8
      border_radius: 12
    steps:
      - action: "click"
        locator:
          type: "text"
          value: "Get Started"
        narration: "Glow appears around the button."
        wait: 2.0
      - action: "click"
        locator:
          type: "text"
          value: "Documentation"
        narration: "Each element gets the glow treatment."
        wait: 2.0
scenarios[].popup_card
The popup_card mode injects styled overlay cards that appear synced with narration. When a step has a card field with a list of items, they are revealed progressively: each bullet appears one by one, timed to match the narrator.
| Property | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | true | Enable the popup card overlay. |
| position | "bottom-right" \| "bottom-left" \| "top-right" \| "top-left" \| "bottom-center" \| "top-center" | "bottom-right" | Card position on screen. |
| theme | "glass" \| "dark" \| "light" \| "gradient" | "glass" | Visual theme for the card. |
| max_width | number | 420 | Maximum card width in pixels. |
| animation | "slide" \| "fade" \| "scale" | "slide" | Entrance/exit animation style. |
| accent_color | string | "#818cf8" | Accent color for bullets and progress bar. |
| show_icon | boolean | true | Show emoji icon in the card header. |
| show_progress | boolean | true | Show a progress bar synced with narration duration. |
Each step can include a card object with:
| Property | Type | Default | Description |
|---|---|---|---|
| card.title | string | null | Card title text. |
| card.body | string | null | Card body/description text. |
| card.items | string[] | null | Bullet-point list. Revealed progressively when narration is present. |
| card.icon | string | null | Emoji or short text shown in the header. |
scenarios:
  - name: "Card Overlay Tour"
    url: "https://fran-cois.github.io/demodsl/"
    browser: "webkit"
    popup_card:
      enabled: true
      position: "bottom-right"
      theme: "glass"
      animation: "slide"
    steps:
      - action: "navigate"
        url: "https://fran-cois.github.io/demodsl/"
        narration: "Welcome to DemoDSL."
        card:
          title: "DemoDSL"
          body: "A DSL-driven automated demo generator."
          icon: "🎬"
      - action: "scroll"
        direction: "down"
        pixels: 600
        narration: "Six integrated phases."
        card:
          title: "Six Phases"
          icon: "⚡"
          items:
            - "Browser Automation"
            - "Voice Narration"
            - "Visual Effects"
            - "Video Composition"
            - "Audio Mixing"
            - "Multi-format Export"
scenarios[].avatar
An animated avatar overlay that reacts to narration audio in real time. The avatar lip-syncs to TTS amplitude and is composited on top of the video at the chosen corner. Two provider types are available: animated (free, Pillow-generated) and API-based (D-ID, HeyGen, SadTalker; paid or self-hosted).
| Property | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Whether the avatar overlay is active. |
| provider | "animated" \| "d-id" \| "heygen" \| "sadtalker" | "animated" | Avatar generation engine. Animated is free, others require an API key. |
| image | string \| null | null | Path, URL (http/https), or preset name ("default", "robot", "circle"). URLs are downloaded and cached locally. |
| position | "bottom-right" \| "bottom-left" \| "top-right" \| "top-left" | "bottom-right" | Corner position of the avatar on the video. |
| size | int | 120 | Avatar diameter in pixels. |
| style | "bounce" \| "waveform" \| "pulse" \| "equalizer" \| "xp_bliss" \| "clippy" \| "visualizer" | "bounce" | Animation style (animated provider only). See table below. |
| shape | "circle" \| "rounded" \| "square" | "circle" | Avatar outline shape. |
| background | string | "rgba(0,0,0,0.5)" | Background fill behind the avatar (CSS color or rgba). |
| background_shape | "square" \| "circle" \| "rounded" | "square" | Shape of the avatar background. Use circle for a fully round overlay. |
| api_key | string \| null | null | API key for paid providers. Supports env-var syntax: "${D_ID_API_KEY}". |
| show_subtitle | bool | false | Display narration text below the avatar box during playback. |
| subtitle_font_size | int | 18 | Font size for the avatar subtitle text. |
| subtitle_font_color | string | "#FFFFFF" | Font color for the avatar subtitle. |
| subtitle_bg_color | string | "rgba(0,0,0,0.7)" | Background color for the avatar subtitle box. |
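The env-var syntax accepted by api_key can be pictured as a simple substitution of ${NAME} references. The sketch below is one way to implement that idea; expand_env is a hypothetical helper, and raising on an unset variable is a design choice made here, not documented DemoDSL behaviour.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} where NAME is a conventional environment variable name.
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace every ${NAME} in value with the matching environment variable."""
    source = os.environ if env is None else env

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in source:
            raise KeyError(f"environment variable {name} is not set")
        return source[name]

    return _ENV_REF.sub(replace, value)
```

Failing loudly on a missing variable avoids sending a literal "${D_ID_API_KEY}" string to a paid provider.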
Animation Styles (free)
These styles are available with the animated provider. Each generates a different visual animation from the narration audio waveform.
| Style | Description |
|---|---|
| bounce | A circle that scales up and down with audio amplitude. Simple and clean. |
| waveform | Radial wave ring that expands from the center with audio pulses. |
| pulse | Glowing disc with a pulsing aura effect. Subtle and professional. |
| equalizer | Neon equalizer bars (Windows XP era). Retro audio visualizer look. |
| xp_bliss | Windows XP Bliss-inspired hills, sun and floating music notes. |
| clippy | Animated paperclip with googly eyes. A nostalgic Microsoft Office mascot. |
| visualizer | Circular spectrum analyzer with rainbow gradient bars. |
| pacman | Pac-Man chomping dots with a colorful ghost. Arcade nostalgia. |
| space_invader | Pixel-art Space Invaders alien with shields and cannon. Retro arcade. |
| mario_block | Bouncing Mario "?" block that pops coins on loud audio. Iconic gaming. |
| nyan_cat | Pixel-art cat on a rainbow trail with scrolling stars. Internet classic. |
| matrix | Cascading green Matrix code rain with avatar in the center. |
| pickle_rick | Pickle Rick with rat limbs, expressive eyes, and yelling mouth. Wubba lubba dub dub! |
| chrome_dino | Chrome's offline T-Rex dinosaur with desert, cacti, and 'No internet' message. |
| marvin | Marvin the Paranoid Android with sad eyes and depressive quotes. H2G2 classic. |
| mac128k | Macintosh 128K with expressive face on green screen. Retro computing icon. |
| floppy_disk | 3.5" floppy disk with face, label, and '1.44 MB' nostalgia. |
| bsod | Blue Screen of Death with progressive error text and sad :( emoticon. |
| bugdroid | Android's green Bugdroid robot with waving arms and antennae. |
| qr_code | QR code pattern with expressive eyes in the center. 'SCAN ME!' |
| gpu_sweat | Sweating GPU with spinning fan, temperature display, and sweat drops. |
| rubber_duck | Yellow rubber duck debugging companion with judgmental speech bubbles. |
| fail_whale | Twitter's Fail Whale carried by birds. 'Twitter is over capacity.' |
| server_rack | Overheating server rack with red eyes, smoke, blinking LEDs, and temp bar. |
| cursor_hand | Windows pointing hand cursor that bosses you around. 'Click here!' |
| vhs_tape | VHS cassette with spinning reels, label, and scanlines. 'Be kind, rewind!' |
| cloud | Cute but capricious cloud with rain, lightning, and data ownership jokes. |
| wifi_low | Wi-Fi icon with one bar that stutters and cuts off mid-sen- |
| nokia3310 | The indestructible Nokia 3310 with Snake and warrior quotes. |
| cookie | Browser cookie with creepy eyes that knows your browsing habits. |
| modem56k | 56k modem with blinking LEDs, dial-up sounds, and green waveform. |
| esc_key | Panicked Escape key trying to break free, with sweat drops & frantic quotes. |
| sad_mac | Classic dead Macintosh with X-eyed icon, error codes & hardware trauma. |
| usb_cable | Tangled USB-A cable frustrated by 3-try insertion. Always wrong side. |
| hourglass | Windows hourglass that speaks very slowly while sand trickles down. |
| firewire | Forgotten FireWire 400 cable living in a drawer, reminiscing glory days. |
| ai_hallucinated | Glitching robot mixing facts with recipes, with spiral eye & glitch lines. |
| tamagotchi | Abandoned pixel egg pet asking why you haven't fed it since 1998. |
| lasso_tool | Obsessive Photoshop selection tool with marching ants on checkerboard. |
| battery_low | Battery at 1%: red, blinking, talks fast then cuts off abruptly. |
| incognito | Chrome Incognito detective with fedora & glasses. Sees nothing. |
| rainbow_wheel | Mac spinning rainbow wheel: hypnotic, unstoppable, rage-inducing. |
| error_404 | Lost 404 page wandering around with question marks, literally unfindable. |
| google_blob | Google's old melted blob emoji, nostalgic for its expressive past. |
| bit | Binary bit (0/1) with matrix rain; answers only Yes or No. |
| pc_fan | Spinning PC fan screaming at full RPM when you open 3 Chrome tabs. |
| captcha | Twisted, illegible CAPTCHA yelling PROVE YOU'RE HUMAN! |
| bluetooth | Bluetooth logo desperately searching, always failing to pair. |
| registry_key | Windows Registry key: a bureaucratic folder controlling everything. |
| high_ping | 999ms ping avatar with buffering spinner, responds 10 sec late. |
| scratched_cd | Scratched CD-ROM with rainbow reflections, st-st-stuttering speech. |
| kermit | Kermit sipping tea (but that's none of my business). |
| this_is_fine | Dog sitting in flames saying 'This is fine.' |
| trollface | Classic Trollface with a mocking grin. Problem? |
| no_idea_dog | Golden retriever at a computer: I have no idea what I'm doing. |
| surprised_pikachu | Pikachu with open mouth, feigning surprise at the obvious. |
| distracted_bf | Distracted boyfriend looking at the shiny new framework. |
| success_kid | Kid with clenched fist celebrating small victories. |
| expanding_brain | Luminous expanding brain, transcended enlightenment. |
| doge | Shiba Inu with floating 'such wow' 'much code' words. |
| wiki_globe | Wikipedia puzzle globe with glasses. [citation needed]. |
scenarios:
  - name: "Demo with Avatar"
    url: "https://myapp.com"
    avatar:
      enabled: true
      provider: "animated"
      style: "equalizer"
      position: "bottom-right"
      size: 100
      shape: "circle"
      background: "rgba(0,0,0,0.6)"
    steps:
      - action: "navigate"
        url: "https://myapp.com"
        narration: "The avatar reacts to this narration."
        wait: 2.0

scenarios:
  - name: "Demo with Custom Avatar"
    url: "https://myapp.com"
    avatar:
      enabled: true
      provider: "animated"
      image: "https://avatars.githubusercontent.com/u/22380190?v=4"
      style: "bounce"
      position: "bottom-right"
      size: 120
      shape: "circle"
    steps:
      - action: "navigate"
        url: "https://myapp.com"
        narration: "My avatar uses an image loaded from a URL."
        wait: 2.0

scenarios:
  - name: "Demo with Talking Head"
    url: "https://myapp.com"
    avatar:
      enabled: true
      provider: "d-id"
      image: "presenter.jpg"
      position: "bottom-left"
      size: 200
      api_key: "${D_ID_API_KEY}"
    steps:
      - action: "navigate"
        url: "https://myapp.com"
        narration: "A real talking-head avatar powered by D-ID."
        wait: 3.0

Combine avatar with cursor and glow_select for a fully polished demo experience. Add composite_avatar to your pipeline to enable the overlay.

scenarios:
  - name: "Demo with Avatar Subtitles"
    url: "https://myapp.com"
    avatar:
      enabled: true
      provider: "animated"
      style: "clippy"
      position: "bottom-right"
      size: 100
      show_subtitle: true
      subtitle_font_size: 16
    steps:
      - action: "navigate"
        url: "https://myapp.com"
        narration: "Narration text appears right below the avatar."
        wait: 2.0

subtitle:
  enabled: true
  style: "cinema"
  speed: "normal"
scenarios:
  - name: "Site Tour"
    url: "https://fran-cois.github.io/demodsl/"
    avatar:
      enabled: true
      provider: "animated"
      style: "clippy"
      position: "bottom-right"
      size: 100
      shape: "circle"
    steps:
      - action: "navigate"
        url: "https://fran-cois.github.io/demodsl/"
        narration: "The avatar pulses to each narration."
        wait: 2.0
      - action: "scroll"
        direction: "down"
        pixels: 600
        narration: "Subtitles appear in cinema style."
        wait: 2.0
pipeline:
  - composite_avatar: {}
  - burn_subtitles: {}
  - edit_video: {}
  - mix_audio: {}
  - optimize: {}

steps
Steps define individual browser actions within a scenario. Each step has an action type and action-specific fields. All steps also support optional narration, wait, and effects.
Common Fields (all actions)
| Property | Type | Default | Description |
|---|---|---|---|
| action | "navigate" \| "click" \| "type" \| "scroll" \| "wait_for" \| "screenshot" | - | Required. The action type. |
| narration | string \| null | null | Text-to-speech narration played during this step. |
| wait | float \| null | null | Seconds to wait after the action completes. |
| effects | Effect[] | null | Visual effects to apply during this step. |
| card | CardContent \| null | null | Popup card content (title, body, items, icon). Shown synced with narration when popup_card mode is enabled. |
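The card field has no example of its own, so here is a minimal sketch of a step that carries one. The sub-field names (title, body, items, icon) come from the table above; their value types, the icon name, and the URL are assumptions made for illustration, and popup_card mode still has to be enabled separately.

```yaml
steps:
  - action: "navigate"
    url: "https://myapp.com/pricing"    # placeholder URL
    narration: "Here is a quick summary of the plans."
    wait: 2.0
    card:
      title: "Pricing at a glance"      # assumed to be a plain string
      body: "Three plans, billed monthly."
      items:                            # assumed to be a list of strings
        - "Free"
        - "Team"
        - "Enterprise"
      icon: "info"                      # hypothetical icon name
```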
action: "navigate"
Navigate the browser to a URL.
| Property | Type | Default | Description |
|---|---|---|---|
| url | string | - | Required. The URL to navigate to. |

- action: "navigate"
  url: "https://myapp.com/dashboard"
  narration: "Let's visit the dashboard."
  wait: 2.0

action: "click"
Click on an element identified by a locator.
| Property | Type | Default | Description |
|---|---|---|---|
| locator.type | "css" \| "id" \| "xpath" \| "text" | "css" | Locator strategy. |
| locator.value | string | - | Required. The selector/identifier. |

- action: "click"
  locator:
    type: "css"
    value: "#submit-btn"
  narration: "Click submit."
  effects:
    - type: "highlight"
      color: "#FFD700"

scenarios:
  - name: "Click Interactions"
    url: "https://fran-cois.github.io/demodsl/"
    browser: "webkit"
    viewport: { width: 1280, height: 720 }
    steps:
      - action: "navigate"
        url: "https://fran-cois.github.io/demodsl/"
        narration: "Open the documentation site."
        wait: 2.0
      - action: "click"
        locator:
          type: "text"
          value: "Get Started"
        narration: "Click Get Started via text locator."
        wait: 1.5
      - action: "click"
        locator:
          type: "text"
          value: "GitHub →"
        narration: "Click the GitHub link."
        wait: 2.0

action: "type"
Type text into an input field.
| Property | Type | Default | Description |
|---|---|---|---|
| locator.type | "css" \| "id" \| "xpath" \| "text" | "css" | Locator strategy. |
| locator.value | string | - | Required. The selector. |
| value | string | - | Required. The text to type. |

- action: "type"
  locator:
    type: "id"
    value: "email"
  value: "user@example.com"
  effects:
    - type: "typewriter"
      speed: 0.1

action: "scroll"
Scroll the page in a direction.
| Property | Type | Default | Description |
|---|---|---|---|
| direction | "up" \| "down" \| "left" \| "right" | "down" | Scroll direction. |
| pixels | int | 300 | Number of pixels to scroll. |

- action: "scroll"
  direction: "down"
  pixels: 500
  narration: "Scrolling to see more features."

scenarios:
  - name: "Navigate and Scroll"
    url: "https://fran-cois.github.io/demodsl/"
    browser: "webkit"
    viewport: { width: 1280, height: 720 }
    steps:
      - action: "navigate"
        url: "https://fran-cois.github.io/demodsl/"
        narration: "Navigate to the target URL."
        wait: 2.0
      - action: "scroll"
        direction: "down"
        pixels: 400
        narration: "Scroll down 400 pixels."
        wait: 1.5
      - action: "scroll"
        direction: "down"
        pixels: 600
        narration: "Continue scrolling."
        wait: 1.5
      - action: "scroll"
        direction: "up"
        pixels: 300
        narration: "Scroll back up."
        wait: 1.5

action: "wait_for"
Wait for an element to appear in the DOM.
| Property | Type | Default | Description |
|---|---|---|---|
| locator.type | "css" \| "id" \| "xpath" \| "text" | "css" | Locator strategy. |
| locator.value | string | - | Required. The selector. |
| timeout | float | 5.0 | Maximum wait time in seconds. |

- action: "wait_for"
  locator:
    type: "css"
    value: ".dashboard-loaded"
  timeout: 10.0
  narration: "Waiting for the dashboard to load."

If the element does not appear within timeout seconds, the step throws an error and the scenario stops.

steps:
  - action: "navigate"
    url: "https://fran-cois.github.io/demodsl/docs"
    narration: "Navigate to the docs page."
    wait: 2.0
  - action: "wait_for"
    locator:
      type: "css"
      value: "nav a"
    timeout: 5.0
    narration: "Wait for the sidebar nav to load."
    wait: 1.5
  - action: "click"
    locator:
      type: "css"
      value: "a[href='#effects']"
    narration: "Click the effects link."
    wait: 2.0
  - action: "wait_for"
    locator:
      type: "css"
      value: "#effects"
    timeout: 5.0
    narration: "Wait for effects heading to appear."
    wait: 1.5

action: "screenshot"
Capture a screenshot of the current page.
| Property | Type | Default | Description |
|---|---|---|---|
| filename | string | "screenshot.png" | Output filename. Saved to the workspace frames directory. |

- action: "screenshot"
  filename: "final_state.png"
  narration: "Here's the final result."

scenarios:
  - name: "Mobile Capture"
    url: "https://fran-cois.github.io/demodsl/"
    browser: "webkit"
    viewport:
      width: 390
      height: 844
    steps:
      - action: "navigate"
        url: "https://fran-cois.github.io/demodsl/"
        narration: "Load in mobile viewport, 390x844."
        wait: 2.0
      - action: "scroll"
        direction: "down"
        pixels: 400
        narration: "See the responsive layout."
        wait: 1.5
      - action: "screenshot"
        filename: "mobile_capture.png"
        narration: "Take a screenshot."
        wait: 1.0

Locator Types
Four locator strategies are available for identifying elements:
| Type | Description |
|---|---|
| css | CSS selector. Examples: "#id", ".class", "button[type=submit]" |
| id | Element ID (shorthand for #id). Example: "email-input" |
| xpath | XPath expression. Example: "//div[@class='card']" |
| text | Visible text content. Example: "Sign Up" |
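The id and xpath strategies can be sketched as follows, reusing the example values from the table above. The narration text and the wait value are placeholders, and the page these selectors would match is hypothetical.

```yaml
steps:
  - action: "type"
    locator:
      type: "id"
      value: "email-input"          # per the table, shorthand for the css selector "#email-input"
    value: "user@example.com"
  - action: "click"
    locator:
      type: "xpath"
      value: "//div[@class='card']"
    narration: "XPath locator: click the card."
    wait: 1.5
```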
Prefer css selectors for stability. Use text locators for buttons or links where the visible text is more stable than the CSS structure.

steps:
  - action: "click"
    locator:
      type: "text"
      value: "Get Started"
    narration: "Text locator: click by visible text."
    wait: 2.0
  - action: "click"
    locator:
      type: "text"
      value: "Documentation"
    narration: "Text locator: click Documentation."
    wait: 2.0
  - action: "click"
    locator:
      type: "css"
      value: "a[href='#pipeline']"
    narration: "CSS locator: jump to pipeline."
    wait: 2.0
  - action: "scroll"
    direction: "down"
    pixels: 400
    narration: "Supports css, id, xpath, and text."
    wait: 1.5

effects
53 visual effects are available, split into five categories: browser effects (21, injected as CSS/JS during capture), cursor trail variants (6, animated trails following the cursor), fun / celebration effects (6, confetti-style canvas overlays), post-processing effects (7, applied to the rendered video via MoviePy), and camera & cinematic effects (13, advanced camera movements and cinematic post-processing). Effects are attached to individual steps.
| Property | Type | Default | Description |
|---|---|---|---|
| type | EffectType | - | Required. Effect name (see tables below). |
| duration | float \| null | null | Effect duration in seconds. |
| intensity | float \| null | null | Effect intensity (0.0 to 1.0). |
| color | string \| null | null | Effect color (hex). Used by highlight, glow, neon_glow. |
| speed | float \| null | null | Animation speed. Used by typewriter, camera_shake, rotate. |
| scale | float \| null | null | Zoom scale factor. Used by zoom_pulse, drone_zoom, ken_burns, zoom_to, elastic_zoom. |
| depth | int \| null | null | Parallax depth. Used by parallax. |
| direction | string \| null | null | Direction ("left", "right", "up", "down"). Used by slide_in, ken_burns, whip_pan, focus_pull. |
| target_x | float \| null | null | Normalized X position (0.0 to 1.0). Used by drone_zoom, zoom_to. |
| target_y | float \| null | null | Normalized Y position (0.0 to 1.0). Used by drone_zoom, zoom_to. |
| angle | float \| null | null | Rotation angle in degrees. Used by rotate. |
| ratio | float \| null | null | Aspect ratio (e.g. 2.35 for cinemascope). Used by letterbox. |
| preset | string \| null | null | Color grade preset ("warm", "cool", "desaturate", "vintage", "cinematic"). Used by color_grade. |
| focus_position | float \| null | null | Focus band position (0.0 to 1.0). Used by tilt_shift. |
Browser Effects (real-time JS injection)
These effects inject CSS/JavaScript into the browser during capture, creating real-time visual overlays.
| Effect | Parameters (defaults) | Description |
|---|---|---|
| spotlight | duration(2), intensity(0.7) | Radial gradient spotlight overlay, darkens edges. |
| highlight | duration(2), color(#FFD700), intensity(0.8) | Glowing box-shadow on hovered elements. |
| confetti | duration(3), count(150), colors([list]), speed_min(1.5), speed_range(3.0) | Animated falling confetti particles (canvas). |
| typewriter | duration(2), caret_color(#333), blink_speed(0.7), bg_color, text_color, font_size(18), label | Blinking caret animation on input fields. |
| glow | duration(2), color(#6366f1) | Inner box-shadow glow around the viewport. |
| shockwave | duration(0.8), color(#FF5722), glow_color, border_width(4), max_size(600), glow(15) | Expanding ring animation from center. |
| sparkle | duration(3), count(80), color(#FFD700), min_size(2), max_size(8) | Random sparkling golden dots (canvas). |
| cursor_trail | duration(3), color(#a855f7), size(22), glow(14), fade_duration(1.2), max_dots(80) | Trailing particles following the cursor. |
| ripple | duration(0.6), color(#4FC3F7), glow_color, border_width(3), max_size(200), glow(12) | Click ripple effect on interactions. |
| neon_glow | duration(2), color(#FF00FF) | Neon-colored glow border around the viewport. |
| success_checkmark | duration(1.2), color(#4CAF50), size(140), glow(20), symbol(checkmark character) | Animated green checkmark overlay. |
| frosted_glass | duration(3), intensity(0.5) | Frosted glass blur overlay. |
| morphing_background | duration(5), colors([list]) | Animated gradient background morphing. |
| matrix_rain | duration(5), color(#00FF41), density(0.05), speed(1.0) | Matrix-style falling green characters. |
| text_highlight | duration(2), color(#FFD700) | Highlighted text background animation. |
| text_scramble | duration(2), speed(50) | Text scramble/decode animation. |
| magnetic_hover | duration(3), intensity(0.5) | Magnetic attraction effect on hover. |
| tooltip_annotation | duration(3), text, color(#333) | Tooltip annotation popup. |
| progress_bar | duration(3), color(#4CAF50), position(top), intensity(4) | Animated progress bar filling horizontally. |
| countdown_timer | duration(5), color(#333), position(center) | Countdown circle timer overlay. |
| callout_arrow | duration(3), text, color(#FF6B6B), target_x(0.5), target_y(0.5) | Arrow callout pointing to coordinates. |
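The annotation-style effects in the table can be sketched like this. The parameter names and numeric values are the defaults listed above; the text strings are placeholders. This assumes that effect-specific parameters are written as plain keys next to type, as the other effect examples in this section do.

```yaml
effects:
  - type: "progress_bar"
    duration: 3.0
    color: "#4CAF50"
    position: "top"
  - type: "tooltip_annotation"
    duration: 3.0
    text: "New in this release"   # placeholder text
    color: "#333"
  - type: "callout_arrow"
    duration: 3.0
    text: "Start here"            # placeholder text
    color: "#FF6B6B"
    target_x: 0.5                 # 0.5 / 0.5 are the listed defaults
    target_y: 0.5
```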
effects:
  - type: "spotlight"
    intensity: 0.8
    duration: 2.0

effects:
  - type: "highlight"
    color: "#FFD700"
    intensity: 0.9
    duration: 2.0

effects:
  - type: "confetti"
    duration: 2.0
    count: 200
    colors: ["#FF6B6B", "#4ECDC4", "#45B7D1", "#FFA07A"]
    speed_min: 2.0

effects:
  - type: "typewriter"
    duration: 2.0

effects:
  - type: "glow"
    color: "#6366f1"
    duration: 2.0

effects:
  - type: "shockwave"
    duration: 1.0
    color: "#FF5722"
    max_size: 800
    glow: 20

effects:
  - type: "sparkle"
    duration: 2.0
    count: 100
    color: "#FFD700"
    max_size: 10

effects:
  - type: "cursor_trail"
    duration: 2.0
    color: "#a855f7"
    size: 22
    glow: 14
    max_dots: 80

effects:
  - type: "ripple"
    duration: 2.0

effects:
  - type: "neon_glow"
    color: "#FF00FF"
    duration: 2.0

effects:
  - type: "success_checkmark"
    duration: 2.0

Cursor Trail Variants
Six animated cursor trail styles, each following mouse movement with a unique visual style. All are browser-injected effects.
| Effect | Parameters (defaults) | Description |
|---|---|---|
| cursor_trail_rainbow | duration(3), size(18), hue_step(12), glow(12), fade_duration(1.4), lifetime(2200) | Rainbow-colored dots cycling through hues. |
| cursor_trail_comet | duration(3), color(rgba(168,85,247,1)), glow_color, layers(4), size(22), size_step(3), fade_duration(0.8) | Comet tail with size gradient (3 particles per move). |
| cursor_trail_glow | duration(3), color(#00BFFF), size(36), glow_inner(24), glow_outer(48), fade_duration(1.5), lifetime(2000), scale_end(2.5) | Soft glowing trail with radial gradient and box-shadow. |
| cursor_trail_line | duration(3), color(rgba(168,85,247,1)), max_points(60), min_width(2), max_width(7) | Connected SVG line segments following the cursor. |
| cursor_trail_particles | duration(3), count(6), min_size(8), size_range(6), spread(35), hue_base(180), hue_range(60), glow(8), fade_delay(200), lifetime(1400) | Particle burst on each mouse move (5 per event). |
| cursor_trail_fire | duration(3), sparks(5), min_size(10), size_range(12), glow(10), hue_base(10), hue_range(40), fade_delay(300), lifetime(1500) | Warm orange/red fire sparks rising and fading. |
effects:
  - type: "cursor_trail_rainbow"
    duration: 3.0
    size: 22
    hue_step: 15
    glow: 16

effects:
  - type: "cursor_trail_comet"
    duration: 3.0
    color: "rgba(168,85,247,1)"
    layers: 5
    size: 26

effects:
  - type: "cursor_trail_glow"
    color: "#00BFFF"
    duration: 3.0

effects:
  - type: "cursor_trail_line"
    duration: 3.0
    color: "rgba(168,85,247,1)"
    max_points: 80
    max_width: 10

effects:
  - type: "cursor_trail_particles"
    duration: 3.0
    count: 8
    spread: 45
    hue_base: 200
    hue_range: 80

effects:
  - type: "cursor_trail_fire"
    duration: 3.0
    sparks: 8
    hue_base: 0
    hue_range: 50

Fun / Celebration Effects
Six celebration-style canvas overlays for joyful moments. All auto-cleanup after their animation completes.
| Effect | Parameters (defaults) | Description |
|---|---|---|
| emoji_rain | duration(4), count(60), min_size(22), size_range(20), speed_min(1.5), speed_range(2.5), emojis([list]) | Rain of emojis (for example 🔥, ❤️, 💯) falling from the top. |
| fireworks | duration(3), initial_rockets(8), launch_interval(1200), particles_per_rocket(50), particle_speed_min(1.5), particle_speed_range(4), gravity(0.05), fade_rate(0.012) | Rockets launching and exploding into colorful particles. |
| bubbles | duration(4), count(45), min_radius(10), max_radius(35), speed_min(0.5), speed_range(1.5), hue_base(180), hue_range(60) | Translucent bubbles rising with sinusoidal wobble. |
| snow | duration(5), count(120), min_radius(3), max_radius(8), color(rgba(200,230,255,0.85)), glow_color, glow(4), speed_min(0.8), speed_max(2.8) | Snowflakes drifting down with gentle wind drift. |
| star_burst | duration(3), count(80), speed_min(2), speed_range(5), hue_base(40), hue_range(60), decay(0.006) | 5-pointed stars exploding from the center. |
| party_popper | duration(3), count(55), colors([list]), min_size(8), size_range(10), speed_min(4), speed_range(7), gravity(0.12), fade_rate(0.003) | Confetti shapes (rect/circle/triangle) from both bottom corners. |
effects:
  - type: "emoji_rain"
    duration: 4.0
    count: 80
    emojis: ["🔥", "❤️", "💯"]
    speed_min: 2.0

effects:
  - type: "fireworks"
    duration: 5.0
    initial_rockets: 12
    particles_per_rocket: 80
    launch_interval: 800

effects:
  - type: "bubbles"
    duration: 4.0

effects:
  - type: "snow"
    duration: 6.0
    count: 150
    min_radius: 2
    max_radius: 10
    speed_min: 0.5
    speed_max: 3.0

effects:
  - type: "star_burst"
    duration: 3.0
    count: 100
    hue_base: 0
    hue_range: 360

effects:
  - type: "party_popper"
    duration: 4.0
    count: 80
    gravity: 0.15
    colors: ["#FF6B6B", "#4ECDC4", "#45B7D1", "#FFA07A"]

Post-Processing Effects (MoviePy)
These effects are applied to the video during the apply_effects pipeline stage.
| Effect | Parameters | Description |
|---|---|---|
| parallax | duration, depth | Subtle zoom for a depth illusion. |
| zoom_pulse | duration, scale | Pulsing zoom in/out following a sine wave. |
| fade_in | duration | Clip fades in from black. |
| fade_out | duration | Clip fades out to black. |
| vignette | duration, intensity | Dark vignette border around the frame. |
| glitch | duration, intensity | Random horizontal slice displacement. |
| slide_in | duration, direction | Slide-in entrance animation (implemented as crossfade). |
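A minimal sketch of several post-processing effects on one step, together with the apply_effects stage that renders them. The parameter names come from the table above; every numeric value, the scenario name, and the URL are illustrative assumptions, and the valid range for depth is not documented here.

```yaml
scenarios:
  - name: "Post-processing sample"    # hypothetical scenario
    url: "https://myapp.com"
    steps:
      - action: "navigate"
        url: "https://myapp.com"
        narration: "Fade in, then add a vignette."
        effects:
          - type: "fade_in"
            duration: 1.0
          - type: "vignette"
            duration: 3.0
            intensity: 0.4
          - type: "parallax"
            duration: 2.0
            depth: 3                  # illustrative value
          - type: "slide_in"
            duration: 1.0
            direction: "left"
pipeline:
  - apply_effects: {}                 # the stage that applies these effects
```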
steps:
  - action: "click"
    locator: { type: "css", value: "#cta" }
    narration: "Click the call to action!"
    effects:
      - type: "highlight"
        color: "#FFD700"
        duration: 1.5
      - type: "confetti"
        duration: 2.0
      - type: "zoom_pulse"
        scale: 1.2
        duration: 1.0

Effects listed on a step are applied in order, each running for its duration before the next.

steps:
  - action: "navigate"
    url: "https://fran-cois.github.io/demodsl/"
    narration: "Effects are injected via JS during capture."
    wait: 2.0
    effects:
      - type: "spotlight"
        duration: 2.0
        intensity: 0.8
  - action: "scroll"
    direction: "down"
    pixels: 500
    narration: "Highlight adds a glowing box-shadow."
    effects:
      - type: "highlight"
        duration: 2.0
        color: "#FFD700"
  - action: "scroll"
    direction: "down"
    pixels: 500
    narration: "Glow creates an inner glow."
    effects:
      - type: "glow"
        duration: 2.0
        color: "#6366f1"
  - action: "scroll"
    direction: "down"
    pixels: 500
    narration: "Neon glow adds a vivid border."
    effects:
      - type: "neon_glow"
        duration: 2.0
        color: "#FF00FF"
  - action: "screenshot"
    narration: "Success checkmark overlay."
    effects:
      - type: "success_checkmark"
        duration: 2.0

Camera & Cinematic Effects
13 advanced camera and cinematic effects for professional-looking demos. These are all post-processing effects applied via MoviePy; they simulate real camera movements and cinematic grading on the rendered video.
Camera Movement Effects
| Effect | Parameters | Description |
|---|---|---|
| drone_zoom | scale, target_x, target_y | Smooth progressive zoom towards a target point, simulating a drone descent. |
| ken_burns | scale, direction | Classic documentary pan + zoom (slow push with lateral drift). |
| zoom_to | scale, target_x, target_y | Zoom to a specific point and hold; great for highlighting UI elements. |
| dolly_zoom | intensity | Vertigo / dolly-zoom: zoom in while widening the crop. |
| elastic_zoom | scale | Zoom with elastic overshoot bounce (ease-out-back). |
| camera_shake | intensity, speed | Subtle camera shake / handheld feel. |
| whip_pan | direction | Fast horizontal/vertical pan with motion blur; great for transitions. |
| rotate | angle, speed | Gentle animated rotation; a subtle tilt for a dynamic feel. |
effects:
  - type: "drone_zoom"
    scale: 1.4
    target_x: 0.5   # center horizontally
    target_y: 0.3   # focus on upper third

effects:
  - type: "ken_burns"
    scale: 1.15
    direction: "right"   # left, right, up, down

effects:
  - type: "zoom_to"
    scale: 1.8
    target_x: 0.5
    target_y: 0.4

effects:
  - type: "dolly_zoom"
    intensity: 0.3

effects:
  - type: "elastic_zoom"
    scale: 1.3

effects:
  - type: "camera_shake"
    intensity: 0.3
    speed: 8.0

effects:
  - type: "whip_pan"
    direction: "right"   # left, right, up, down

effects:
  - type: "rotate"
    angle: 3.0   # degrees
    speed: 1.0   # oscillations per clip

Cinematic Effects
| Effect | Parameters | Description |
|---|---|---|
| letterbox | ratio | Cinematic black bars (e.g. 2.35:1 cinemascope). |
| film_grain | intensity | Analog film grain overlay. |
| color_grade | preset | Color grading presets: warm, cool, desaturate, vintage, cinematic. |
| focus_pull | direction, intensity | Rack focus: transition from sharp to blurry (or reverse). |
| tilt_shift | intensity, focus_position | Miniature / tilt-shift: sharp band in center, blurred edges. |
effects:
  - type: "letterbox"
    ratio: 2.35   # cinemascope

effects:
  - type: "film_grain"
    intensity: 0.3

effects:
  - type: "color_grade"
    preset: "cinematic"   # warm, cool, desaturate, vintage, cinematic

effects:
  - type: "focus_pull"
    direction: "out"   # in = blur -> sharp, out = sharp -> blur
    intensity: 0.5

effects:
  - type: "tilt_shift"
    intensity: 0.6
    focus_position: 0.5   # 0.0=top, 0.5=center, 1.0=bottom

Combine letterbox + color_grade + film_grain for a cinematic look, or drone_zoom + vignette for a dramatic reveal.

steps:
  - action: "navigate"
    url: "https://example.com"
    narration: "A cinematic reveal of our product."
    effects:
      - type: "drone_zoom"
        scale: 1.4
        target_x: 0.5
        target_y: 0.3
      - type: "letterbox"
        ratio: 2.35
      - type: "color_grade"
        preset: "cinematic"
      - type: "film_grain"
        intensity: 0.2
      - type: "vignette"
        intensity: 0.4

pipeline
The pipeline defines the post-processing chain using a Chain of Responsibility pattern. Each stage is a single-key dictionary. Stages execute in order, passing context to the next.
Each stage is either critical (failure stops the pipeline) or optional (failure is logged and skipped).
| Stage | Criticality | Parameters | Description |
|---|---|---|---|
| restore_audio | optional | { denoise, normalize } | Audio restoration: noise removal, loudness normalization. |
| restore_video | optional | { stabilize, sharpen } | Video restoration: stabilization, sharpening. |
| apply_effects | optional | {} | Apply post-processing visual effects from step definitions. |
| generate_narration | critical | {} | Generate TTS audio clips and sync to video timeline. |
| composite_avatar | optional | {} | Overlay avatar clips on the video. Requires avatar config in the scenario. |
| burn_subtitles | optional | {} | Burn ASS subtitles into the video. Requires subtitle config (top-level or per-scenario). |
| render_device_mockup | optional | {} | Overlay video into a 3D device frame. |
| edit_video | critical | {} | Apply intro, outro, transitions, and watermark. |
| mix_audio | critical | {} | Mix voice narration with background music (ducking). |
| optimize | critical | { format, codec, quality, target_size_mb } | Final encoding, compression, and format export. |
| fit_duration | optional | { target_duration, strategy, min_speed, max_speed } | Adjust video speed so that the final video matches a target duration. |
pipeline:
  # Each stage is a single-key dict
  - restore_audio:
      denoise: true
      normalize: true
  - restore_video:
      stabilize: true
      sharpen: true
  - apply_effects: {}
  - generate_narration: {}
  - composite_avatar: {}
  - burn_subtitles: {}
  - render_device_mockup: {}
  - edit_video: {}
  - mix_audio: {}
  - fit_duration:
      target_duration: 60
      strategy: "any"
  - optimize:
      format: "mp4"
      codec: "h264"
      quality: "high"
      target_size_mb: 50

optimize stage parameters
| Property | Type | Default | Description |
|---|---|---|---|
| format | string | "mp4" | Output format: "mp4", "webm", "gif". |
| codec | string | "h264" | Video codec: "h264", "h265", "vp9", etc. |
| quality | string | "high" | Encoding quality: "low", "medium", "high". |
| target_size_mb | int \| null | null | Target file size in MB. Overrides quality if set. |
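As a sketch of a non-default optimize stage, the fragment below exports WebM with a size cap. Pairing "webm" with "vp9" is a common choice rather than something this reference prescribes, and the 25 MB figure is illustrative.

```yaml
pipeline:
  - optimize:
      format: "webm"
      codec: "vp9"
      target_size_mb: 25   # quality is omitted because target_size_mb overrides it
```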
Stages with {} (empty dict) use all defaults. Each stage dict must have exactly one key; multiple keys in a single dict will raise a validation error.

The critical stages are generate_narration, edit_video, mix_audio, and optimize.

fit_duration stage parameters
Automatically adjusts video playback speed so that the final video matches a given target_duration in seconds. Useful when you need to produce a demo that fits a specific time slot (e.g. a 60-second social clip or a 3-minute explainer).
| Property | Type | Default | Description |
|---|---|---|---|
| target_duration | float | - | Required. Target duration in seconds. |
| strategy | "any" \| "speed_up" \| "slow_down" | "any" | Direction constraint. "any" allows both speeding up and slowing down. |
| min_speed | float | 0.25 | Minimum speed factor (prevents extreme slow-motion). |
| max_speed | float | 4.0 | Maximum speed factor (prevents unwatchable fast-forward). |
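How these four parameters interact can be sketched in a few lines. This is an illustration only, not DemoDSL's implementation: the function name compute_speed_factor is hypothetical, and the sketch assumes that the speed factor is the current duration divided by the target, that a direction forbidden by strategy leaves the speed at 1.0, and that the result is clamped to the range from min_speed to max_speed.

```python
def compute_speed_factor(
    current_duration: float,
    target_duration: float,
    strategy: str = "any",
    min_speed: float = 0.25,
    max_speed: float = 4.0,
) -> float:
    """Return a playback speed factor (hypothetical helper, not the DemoDSL API)."""
    if current_duration <= 0 or target_duration <= 0:
        raise ValueError("durations must be positive")
    if strategy not in ("any", "speed_up", "slow_down"):
        raise ValueError(f"unknown strategy: {strategy!r}")

    # A factor above 1.0 shortens the video; a factor below 1.0 lengthens it.
    factor = current_duration / target_duration

    # Assumption: a direction the strategy forbids leaves the video untouched.
    if strategy == "speed_up" and factor < 1.0:
        factor = 1.0
    elif strategy == "slow_down" and factor > 1.0:
        factor = 1.0

    # Clamp to the configured bounds.
    return max(min_speed, min(max_speed, factor))
```

Under these assumptions, a 90-second recording with target_duration: 60 gives a factor of 1.5, while a 10-minute recording is clamped to 4.0 and therefore still overshoots the target.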
pipeline:
  - generate_narration: {}
  - edit_video: {}
  - mix_audio: {}
  - fit_duration:
      target_duration: 60   # fit the video to 60 seconds
      strategy: "any"       # speed up or slow down as needed
      min_speed: 0.5        # never slower than 0.5x
      max_speed: 3.0        # never faster than 3x
Place fit_duration after edit_video and speed but before optimize so that intro/outro and manual speed changes are applied first, then the whole video is time-fitted.
output
Defines output filenames, formats, thumbnail generation, and social media export presets.
| Property | Type | Default | Description |
|---|---|---|---|
| filename | string | "output.mp4" | Main output filename. |
| directory | string | "output/" | Output directory path. |
| formats | string[] | ["mp4"] | Export formats: "mp4", "webm", "gif". |
| thumbnails | Thumbnail[] | null | Auto-generated thumbnail frames. |
| social | SocialExport[] | null | Platform-specific export presets. |
output.thumbnails
| Property | Type | Default | Description |
|---|---|---|---|
| timestamp | float | - | Required. Time in seconds to capture the thumbnail. |
output.social
Generate platform-optimized versions automatically. Each preset re-encodes the video with platform-specific constraints.
| Property | Type | Default | Description |
|---|---|---|---|
| platform | string | - | Required. Platform name (for labeling). |
| resolution | string \| null | null | Output resolution (e.g. "1920x1080"). |
| bitrate | string \| null | null | Target bitrate (e.g. "8000k"). |
| aspect_ratio | string \| null | null | Crop to aspect ratio (e.g. "1:1", "9:16"). |
| max_duration | int \| null | null | Maximum duration in seconds (trims end). |
| max_size_mb | int \| null | null | Maximum file size in MB. |
output:
  filename: "demo.mp4"
  directory: "output/"
  formats:
    - "mp4"
    - "webm"
    - "gif"
  thumbnails:
    - timestamp: 0.0
    - timestamp: 5.0
    - timestamp: 10.0
  social:
    - platform: "youtube"
      resolution: "1920x1080"
      bitrate: "8000k"
    - platform: "instagram"
      resolution: "1080x1080"
      aspect_ratio: "1:1"
      max_duration: 60
    - platform: "twitter"
      resolution: "1280x720"
      max_duration: 140
      max_size_mb: 15
analytics (Beta)
Optional engagement tracking metadata embedded in the output.
| Property | Type | Default | Description |
|---|---|---|---|
| track_engagement | bool | false | Track viewer engagement metrics. |
| heatmap | bool | false | Generate click/attention heatmap data. |
| click_tracking | bool | false | Track interactive click positions. |
analytics:
  track_engagement: true
  heatmap: true
  click_tracking: true
Pre-flight Checks
Before launching the recording browser, DemoDSL probes the URLs your scenario depends on so you find out why a recording shows a challenge or empty page before waiting for the full render. All pre-flight checks are advisory: they never abort the demo. Network failures (DNS, timeout, offline) are treated as "passing" so demos still run on locked-down machines.
Page accessibility (anti-bot / WAF detection)
Many production sites are fronted by anti-bot or WAF services that serve a JavaScript challenge or a 403/429/503 to non-browser clients. Recording such a page captures the challenge instead of your real UI. DemoDSL fetches each scenario.url and every navigate step URL with a short GET probe (only the first 16 KB of the body) and inspects the response for known fingerprints. When a block is detected, a WARNING is logged with the protection name, the HTTP status, and the matching signal. The demo still runs, so the recording acts as documentation of the issue.
Detected protections include:
| Protection | Kind | Signals |
|---|---|---|
| cloudflare | WAF | cf-ray, cf-mitigated, __cf_bm, “Just a moment…”, Turnstile, Attention Required. |
| datadome | Anti-bot | x-dd-b / x-datadome headers, datadome cookie, geo.captcha-delivery.com markers. |
| akamai | WAF / Bot Manager | AkamaiGHost server, ak_bmsc / _abck cookies, “Access Denied” reference page. |
| imperva | WAF (Incapsula) | X-Iinfo header, visid_incap_ / incap_ses_ cookies, “Incapsula incident id”. |
| aws-waf | WAF | x-amzn-waf-action header, aws-waf-token cookie, AWSWAFCaptcha page. |
| f5-shape | WAF / bot defense | BIG-IP cookie + block, “The requested URL was rejected”, Shape JS. |
| sucuri | WAF | Server: Sucuri/Cloudproxy, Sucuri firewall block page. |
| kasada | Anti-bot | x-kpsdk-ct header, /kpsdk/ markers. |
| perimeterx | Anti-bot (HUMAN) | _px3 / _pxhd cookies, px-captcha / _pxCaptcha markers. |
| captcha | Generic gate | hCaptcha, reCAPTCHA, Cloudflare Turnstile, Arkose/FunCaptcha interstitials. |
| - | HTTP status | 401, 403, 404, 429, 451, 5xx with friendlier reason text. |
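Fingerprint matching of this kind can be sketched as a lookup over response headers and the start of the body. The code below is a simplified illustration, not the logic in demodsl.page_precheck: the function name detect_protection and both lookup tables are hypothetical, and only a handful of the signals listed above are included.

```python
from typing import Mapping, Optional

# A small subset of the signals from the table above (illustrative only).
HEADER_SIGNALS = {
    "cf-mitigated": "cloudflare",
    "x-datadome": "datadome",
    "x-amzn-waf-action": "aws-waf",
    "x-kpsdk-ct": "kasada",
}
BODY_SIGNALS = {
    "just a moment": "cloudflare",
    "incapsula incident id": "imperva",
    "the requested url was rejected": "f5-shape",
}


def detect_protection(headers: Mapping[str, str], body: str) -> Optional[str]:
    """Return the name of the first matching protection, or None."""
    # Header names are case-insensitive, so normalise before matching.
    lowered = {name.lower() for name in headers}
    for header, protection in HEADER_SIGNALS.items():
        if header in lowered:
            return protection

    # Inspect only the start of the body, loosely mirroring the 16 KB probe
    # limit (this slices characters rather than bytes).
    snippet = body[: 16 * 1024].lower()
    for marker, protection in BODY_SIGNALS.items():
        if marker in snippet:
            return protection
    return None
```

Headers are checked first because they are cheaper to inspect and less ambiguous than body text; a marker that appears beyond the inspected prefix is deliberately ignored.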
Sample log output for a Cloudflare-protected URL:
WARNING [page-precheck] https://example.com - not accessible · protected by cloudflare · HTTP 403 · cf-mitigated: challenge - the recorded demo is likely to show a challenge or error page. Consider using a different URL, a fixture/screenshot, or running the demo on an allow-listed network.
Recommended workarounds when a URL is blocked:
- Run the demo from an IP that is on the site's allow-list (office network, VPN).
- Replace the live URL with a static fixture, a local mirror, or a pre-recorded screenshot.
- For partner sites, ask for a User-Agent / IP exemption for the demo runner.
- If the page is only blocked for the precheck (HEAD/GET) but loads in a real browser, the warning is harmless: the recording will still succeed.
You can call the probe programmatically:
from demodsl.page_precheck import probe_page_accessible, precheck_urls
result = probe_page_accessible("https://example.com")
if not result.accessible:
    print(result.format_warning())
    # → "https://example.com - not accessible · protected by cloudflare · HTTP 403 · cf-mitigated: challenge"
# Batch probe with built-in WARNING logging
precheck_urls(["https://a.example", "https://b.example"])
Iframe embeddability (secondary windows)
When a scenario uses background.secondary_windows[].url to embed a live site behind the main browser, DemoDSL probes each URL for headers that block iframing:
- X-Frame-Options: DENY or SAMEORIGIN
- Content-Security-Policy: frame-ancestors with a restrictive value (other than *, http:, https:)
When a window's URL is blocked, DemoDSL automatically records a short headless clip of the page in an isolated Playwright instance and substitutes a muted, looping <video> overlay for the iframe. If recording fails (or the helper is disabled), the window falls back to its background_color / screenshot. This step happens before the main browser is launched to avoid running two Chromium processes concurrently on memory-constrained hosts.
Recordings are cached on disk (keyed by URL + dimensions), so repeat runs are fast.
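The header test described in this section can be sketched as a small predicate. This is an assumption-laden illustration rather than DemoDSL's code: the function name blocks_iframe is hypothetical, and the sketch treats a frame-ancestors directive as permissive only when one of its sources is *, http:, or https:.

```python
from typing import Mapping


def blocks_iframe(headers: Mapping[str, str]) -> bool:
    """Return True if the response headers forbid embedding in an iframe."""
    lowered = {name.lower(): value for name, value in headers.items()}

    # X-Frame-Options: DENY or SAMEORIGIN blocks third-party framing.
    xfo = lowered.get("x-frame-options", "").strip().upper()
    if xfo in ("DENY", "SAMEORIGIN"):
        return True

    # Content-Security-Policy: look for a frame-ancestors directive.
    csp = lowered.get("content-security-policy", "")
    for directive in csp.split(";"):
        parts = directive.strip().split()
        if parts and parts[0].lower() == "frame-ancestors":
            sources = [source.lower() for source in parts[1:]]
            # Permissive only if some source allows any origin.
            permissive = {"*", "http:", "https:"}
            return not any(source in permissive for source in sources)
    return False
```

A response with neither header is treated as embeddable, so a plain iframe would be used for that window.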
CLI Reference
demodsl run
Parse and execute a DemoDSL config file.
demodsl run <config> [OPTIONS]
Arguments:
  config                  Path to the YAML or JSON config file.
Options:
  -o, --output-dir PATH   Output directory (default: output/)
  --dry-run               Validate and log all steps without executing
  --skip-voice            Skip TTS generation (development mode)
  --turbo                 Fast preview: minimal waits, skip heavy post-processing
  -v, --verbose           Enable debug logging
Use --dry-run to validate your config and preview all steps without launching a browser or calling TTS APIs.
Turbo mode
Add --turbo to generate a fast preview. All browser waits are clamped to 50 ms, and heavy post-processing passes are skipped: avatar compositing, 3D device rendering, subtitle burning, post-effects (freeze frames, speed ramps), global speed re-encode, and watermark overlay.
demodsl run demo.yaml --turbo
The YAML config itself does not change: turbo is purely a runtime flag. Define avatars, subtitles, and effects as usual, then iterate quickly with --turbo and remove it for the final high-quality render.
| Behavior | Affected passes |
|---|---|
| Skipped | Avatars, 3D rendering, subtitles, post-effects, speed re-encode, watermark |
| Kept | Browser recording, narration (TTS), browser effects, basic video editing |
| Waits | All time.sleep() pauses clamped to 50 ms |
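The wait clamping can be pictured as a one-line rule. The helper below is a hypothetical sketch (effective_wait and TURBO_MAX_WAIT_S are illustrative names, not part of DemoDSL); it assumes that waits already shorter than 50 ms are left unchanged.

```python
TURBO_MAX_WAIT_S = 0.05  # 50 ms, the turbo-mode ceiling described above


def effective_wait(wait_s: float, turbo: bool) -> float:
    """Return the pause to apply for a step (hypothetical helper)."""
    if wait_s < 0:
        raise ValueError("wait must be non-negative")
    # In turbo mode the pause is capped; otherwise it is used as configured.
    return min(wait_s, TURBO_MAX_WAIT_S) if turbo else wait_s
```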
demodsl validate
Validate a config file without executing any actions.
demodsl validate <config> [OPTIONS]
Arguments:
  config          Path to the YAML or JSON config file.
Options:
  -v, --verbose   Enable debug logging
Outputs a summary: title, version, number of scenarios, total steps, and pipeline stage count. Exits with code 1 on validation failure.
demodsl init
Generate a minimal config template.
demodsl init [OPTIONS]
Options:
  -o, --output PATH   Output file (default: demo.yaml)
                      Use .json extension for JSON output.
Examples:
  demodsl init                   # Creates demo.yaml
  demodsl init -o my-demo.yaml   # Custom filename
  demodsl init -o demo.json      # JSON format
Edge Cases & Gotchas
Minimal Config
The smallest valid config requires only metadata.title. Everything else has defaults or is optional:
metadata:
  title: "Empty Demo"
This will validate successfully but produce no output (no scenarios, no pipeline).
YAML vs JSON Detection
File format is detected by file extension only: .json → JSON parser, anything else → YAML parser. If you name a JSON file config.yaml, it is handed to the YAML parser and may fail to parse.
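The extension rule amounts to a single comparison. The sketch below uses a hypothetical helper name, detect_config_format, and assumes the comparison ignores case, which the documentation does not state.

```python
from pathlib import Path


def detect_config_format(path: str) -> str:
    """Return "json" for a .json extension and "yaml" for anything else."""
    # Assumption: the extension check is case-insensitive.
    return "json" if Path(path).suffix.lower() == ".json" else "yaml"
```

Note that a file with no extension at all falls into the YAML branch, since only .json selects the JSON parser.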
Voice Provider Fallback
If the configured TTS engine's API key is missing, DemoDSL falls back to DummyVoiceProvider which generates silent audio clips. The dummy calculates duration from word count at ~150 words per minute. This is intentional behavior for local development.
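The duration estimate follows directly from the quoted speaking rate. The function below is a hypothetical sketch, not DummyVoiceProvider itself, and it assumes words are counted by splitting on whitespace.

```python
WORDS_PER_MINUTE = 150  # approximate rate quoted above


def estimate_duration_seconds(text: str, wpm: float = WORDS_PER_MINUTE) -> float:
    """Estimate narration length from the word count (hypothetical helper)."""
    # Assumption: a "word" is any whitespace-separated token.
    words = len(text.split())
    return words / wpm * 60.0
```

Under these assumptions, a seven-word narration such as "Click JSON tab to see JSON format." would be given a silent clip of roughly 2.8 seconds.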
Pipeline Stage Format
Each pipeline entry must be a single-key dictionary. Multiple keys in one entry will raise a validation error:
pipeline:
  - restore_audio: { denoise: true }
    restore_video: { sharpen: true }   # Error!
Write each stage as its own list entry instead:
pipeline:
  - restore_audio: { denoise: true }
  - restore_video: { sharpen: true }
Critical vs Optional Stages
If a critical stage fails (generate_narration, edit_video, mix_audio, optimize), the entire pipeline stops and raises an error. If an optional stage fails (restore_audio, restore_video, apply_effects, composite_avatar, burn_subtitles, render_device_mockup, fit_duration), it logs a warning and the pipeline continues.
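The single-key rule and the critical/optional distinction can be combined in one small runner. This is a sketch under stated assumptions, not DemoDSL's pipeline engine: run_pipeline, the handlers mapping, and the exception types are all hypothetical, and any stage not in the critical set is treated as optional.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger("pipeline-sketch")

# Critical stage names as listed above; every other stage is treated as optional.
CRITICAL_STAGES = {"generate_narration", "edit_video", "mix_audio", "optimize"}

StageFn = Callable[[Dict[str, Any]], None]


def run_pipeline(pipeline: List[Dict[str, Any]], handlers: Dict[str, StageFn]) -> List[str]:
    """Run stages in order and return the names of the stages that succeeded."""
    # Validate the whole list first: every entry must be a single-key dict
    # naming a known stage.
    for entry in pipeline:
        if not isinstance(entry, dict) or len(entry) != 1:
            raise ValueError(f"each pipeline entry must have exactly one key: {entry!r}")
        name = next(iter(entry))
        if name not in handlers:
            raise ValueError(f"unknown stage: {name!r}")

    completed: List[str] = []
    for entry in pipeline:
        name, params = next(iter(entry.items()))
        try:
            handlers[name](params or {})
        except Exception as exc:
            if name in CRITICAL_STAGES:
                # A critical failure aborts the whole run.
                raise RuntimeError(f"critical stage {name!r} failed") from exc
            # An optional failure is logged and the run continues.
            logger.warning("optional stage %r failed: %s", name, exc)
        else:
            completed.append(name)
    return completed
```

Validating every entry before running anything means a malformed config is rejected before any stage has side effects.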
Effect Execution Order
Browser effects are injected before the step action and execute sequentially. If an effect has a duration, the engine sleeps for that duration before the next effect. All effects complete before the browser action fires.
Required Fields by Action
| Action | Required fields | Notes |
|---|---|---|
| navigate | url | Raises ValueError if url is missing. |
| click | locator | Raises ValueError if locator is missing. |
| type | locator + value | Raises ValueError if either is missing. |
| scroll | (none) | Defaults: direction="down", pixels=300. |
| wait_for | locator | Raises ValueError if locator is missing. Default timeout: 5s. |
| screenshot | (none) | Default filename: "screenshot.png". |
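These rules can be expressed as a lookup table plus a check. The sketch below is illustrative and covers only the six actions listed above; normalize_step is a hypothetical helper, and the key names timeout and filename are assumptions inferred from the notes rather than confirmed field names.

```python
from typing import Any, Dict

# Required keys per action, taken from the table above.
REQUIRED_FIELDS = {
    "navigate": ("url",),
    "click": ("locator",),
    "type": ("locator", "value"),
    "scroll": (),
    "wait_for": ("locator",),
    "screenshot": (),
}

# Documented defaults; "timeout" and "filename" are assumed key names.
DEFAULTS = {
    "scroll": {"direction": "down", "pixels": 300},
    "wait_for": {"timeout": 5},
    "screenshot": {"filename": "screenshot.png"},
}


def normalize_step(step: Dict[str, Any]) -> Dict[str, Any]:
    """Check required keys and fill in defaults (hypothetical helper)."""
    action = step.get("action")
    if action not in REQUIRED_FIELDS:
        raise ValueError(f"action not covered by this sketch: {action!r}")
    missing = [key for key in REQUIRED_FIELDS[action] if step.get(key) is None]
    if missing:
        raise ValueError(f"{action} step is missing: {', '.join(missing)}")
    # Explicit values in the step override the defaults.
    return {**DEFAULTS.get(action, {}), **step}
```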
Viewport and Recording
The browser records video at the viewport resolution. For high-quality social media exports, set the scenario viewport to the maximum resolution you need, because downscaling for social presets gives better results than upscaling.
Ducking Without Music
If audio.background_music is not set or the file doesn't exist, the mix_audio stage skips music mixing entirely. Narration-only audio still works.
Empty Pipeline
If pipeline is an empty list or omitted, no post-processing runs. Raw browser recordings are copied directly to the output directory.
Multiple Scenarios
Multiple scenarios are executed sequentially. Each gets its own browser instance. Currently, the pipeline processes only the first scenario's video. Multi-scenario concatenation is handled by the edit_video stage.
Dry Run Behavior
With --dry-run, the engine validates the config and logs every step and effect with a [DRY-RUN] prefix, but does not:
- Launch a browser
- Call any TTS API
- Execute pipeline stages
- Produce any output files
Environment Variables
| Variable | Type | Default | Description |
|---|---|---|---|
| ELEVENLABS_API_KEY | string | - | ElevenLabs TTS API key. |
| OPENAI_API_KEY | string | - | OpenAI API key. Required for openai engine. |
| GOOGLE_APPLICATION_CREDENTIALS | string | - | Path to Google Cloud service account JSON file. |
| AZURE_SPEECH_KEY | string | - | Azure Cognitive Services Speech subscription key. |
| AZURE_SPEECH_REGION | string | "eastus" | Azure region (e.g. eastus, westeurope). |
| AWS_ACCESS_KEY_ID | string | - | AWS access key for Polly. |
| AWS_SECRET_ACCESS_KEY | string | - | AWS secret key for Polly. |
| AWS_DEFAULT_REGION | string | "us-east-1" | AWS region for Polly. |
| COSYVOICE_API_URL | string | "http://localhost:50000" | CosyVoice API server URL. |
| COQUI_MODEL | string | "xtts_v2" | Coqui TTS model name. |
| COQUI_LANGUAGE | string | "en" | Language code for Coqui TTS. |
| PIPER_BIN | string | "piper" | Path to piper binary. |
| PIPER_MODEL | string | - | Required. Path to Piper .onnx voice model. |
| LOCAL_TTS_URL | string | "http://localhost:8000" | Base URL for OpenAI-compatible local TTS server. |
| LOCAL_TTS_API_KEY | string | "not-needed" | API key for local server (if required). |
| LOCAL_TTS_MODEL | string | "tts-1" | Model name to pass to local server. |
| ESPEAK_BIN | string | "espeak-ng" | Path to eSpeak-NG binary. |
| CUSTOM_TTS_URL | string | - | Required. Full URL of your custom TTS HTTP endpoint. |
| CUSTOM_TTS_API_KEY | string | - | Bearer token for custom TTS (optional). |
| CUSTOM_TTS_RESPONSE_FORMAT | string | "mp3" | Audio format returned by the endpoint: "mp3" or "wav". |
| D_ID_API_KEY | string | - | D-ID API key for talking-head avatar generation. |
| HEYGEN_API_KEY | string | - | HeyGen API key for avatar video generation. |
If the required environment variable for the selected engine is not set, DemoDSL automatically falls back to DummyVoiceProvider which generates silent audio clips. This allows development without API credentials.