Skillquality 0.46

video-generation

End-to-end AI video production through the Hyper MCP — text-to-video and image-to-video generation (Sora, Veo, Seedance), scene chaining, video analysis, transcription, subtitles, TikTok / karaoke captions, voiceover (TTS), audio mixing, clipping, stitching, and text overlays. Us

Price

free

Protocol

skill

Verified

Endpoint

https://skills.sh/hyperfx-ai/marketing-skills/video-generation

What it does

Video Generation & Editing

Guide for generating, editing, analyzing, and post-processing videos using AI models and FFmpeg-backed tools exposed through the Hyper MCP.

Requirements

This skill assumes the Hyper MCP is connected to your agent so the tools below are available. The underlying providers (OpenAI Sora, Google Veo, ByteDance Seedance, OpenAI TTS, transcription, etc.) are configured under your Hyper integrations.

Tool surface

Group	Tools
Generation	`generate_video`, `sora_remix_video`, `sora_delete_video`
Analysis	`analyze_video`, `capture_video_frame`, `transcribe_video`
Subtitles & captions	`generate_subtitles`, `burn_subtitles`, `burn_highlighted_captions`
Audio	`text_to_speech`, `add_audio_to_video`
Editing	`clip_video`, `stitch_videos`, `overlay_text`

Out of scope

Image generation, ad creative composition, brand extraction — use image-generation or ad-creative-generation.
Posting finished videos to social platforms — use tiktok, instagram, or linkedin.
Running paid video campaigns — use google-ads, meta-ads, tiktok-ads.

Available Tools

Tool	Purpose	Runs in Background
`generate_video`	Generate video from text / image prompt	Yes
`sora_remix_video`	Modify existing Sora video	Yes
`sora_delete_video`	Delete a Sora video	No
`capture_video_frame`	Extract frame as image	No
`analyze_video`	Watch and understand video content	No
`transcribe_video`	Extract audio transcript	No
`generate_subtitles`	Create SRT / VTT subtitle file	No
`burn_subtitles`	Burn subtitles onto video	Yes
`burn_highlighted_captions`	TikTok / karaoke-style word-by-word captions	Yes
`text_to_speech`	Generate voiceover audio from text	No
`add_audio_to_video`	Add / replace audio track on video	Yes
`clip_video`	Extract a time segment from video	Yes
`stitch_videos`	Concatenate multiple clips	Yes
`overlay_text`	Add text / titles to video	Yes

Video Understanding

You can watch and analyze any video using analyze_video. This sends the video to a multimodal AI that sees both visual and audio content.

When to use `analyze_video`

After generating a video: check if it matches your intent
Before stitching: verify scene consistency across clips
Quality review: check for glitches, character drift, lighting issues
Content understanding: "what happens in this video?"

Analysis Types

analyze_video(file_id="...", analysis_type="general")
analyze_video(file_id="...", analysis_type="quality_review")
analyze_video(file_id="...", analysis_type="scene_breakdown")
analyze_video(file_id="...", question="Does this match: [original prompt]?")

Self-Review Workflow

Always review generated videos before delivering to the user:

result = generate_video(prompt="...", model="veo-3.1-generate-preview")
review = analyze_video(file_id="video_file_id", analysis_type="quality_review")
# If issues found, regenerate with adjustments. If quality is good, proceed to editing.

Script Planning

For longer, cohesive videos, plan the FULL SCRIPT before generating:

1. Scene Breakdown

Scenes: break story into segments
- Sora: 4 / 8 / 12 seconds per scene
- Veo: 4-8 seconds per scene
- Seedance: 4-15 seconds per scene (native audio with lip-sync)
Camera: shot type (wide, close-up, tracking), angles, movement
Transitions: how each scene connects to the next
Consistency: character descriptions, color palette, visual style

Scene Chaining Technique

To create seamless multi-scene videos:

Scene 1 (text-to-video)

generate_video(prompt="...", model="veo-3.1-generate-preview")

Scene 2+ (image-to-video)

capture_video_frame(video_file_id="scene1_file_id", frame_position="last")
generate_video(prompt="continuation: ...", image_file_id="captured_frame_id")

Repeat: extract last frame → generate next scene.

Stitching Scenes Together

After generating all scenes, combine them:

stitch_videos(video_file_ids=["scene1_id", "scene2_id", "scene3_id"])

stitch_videos(
    video_file_ids=["scene1_id", "scene2_id", "scene3_id"],
    transition="crossfade",
    crossfade_duration=0.5,
)

Subtitle / Caption Workflow

Full pipeline: Video → Transcript → Subtitles → Burned Video

transcript = transcribe_video(file_id="video_file_id")

subs = generate_subtitles(file_id="video_file_id", transcript=transcript, format="srt")

burn_subtitles(
    video_file_id="video_file_id",
    subtitle_file_id=subs.file_id,
    style="bold_outline",
    position="bottom",
)

Subtitle Styles

Style	Effect
`default`	Plain white text
`bold_outline`	Bold white with black outline (recommended)
`shadow`	White text with drop shadow
`box`	White text on semi-transparent black box

Text Overlays

Add titles, lower-thirds, CTAs, and other graphics:

overlay_text(
    video_file_id="video_file_id",
    overlays=[
        {
            "text": "Episode 1: The Beginning",
            "start_time": 0.0,
            "end_time": 3.0,
            "position": "center",
            "font_size": 48,
            "color": "white",
            "background": "black@0.5",
        },
        {
            "text": "Subscribe for more!",
            "start_time": 10.0,
            "end_time": 14.0,
            "position": "bottom-right",
            "font_size": 28,
        },
    ],
)

Overlay Positions

top, bottom, center, top-left, top-right, bottom-left, bottom-right

Voiceover / Narration

Generate natural-sounding voiceover with TTS and add it to any video:

audio = text_to_speech(
    text="Welcome to our product. Here's how it works...",
    voice="nova",
    model="tts-1",
)

add_audio_to_video(
    video_file_id="video_id",
    audio_file_id=audio.file_id,
    mode="replace",
)

add_audio_to_video(
    video_file_id="video_id",
    audio_file_id=audio.file_id,
    mode="mix",
    audio_volume=0.8,
)

Available Voices

alloy, ash, coral, echo, fable, nova (recommended), onyx, sage, shimmer

Highlighted Captions (TikTok / Reels Style)

Add word-by-word highlighted captions that light up as spoken:

burn_highlighted_captions(
    video_file_id="video_id",
    style="tiktok",
    highlight_color="#3B82F6",
    base_color="white",
    words_per_group=3,
    position="center",
)

burn_highlighted_captions(
    video_file_id="video_id",
    style="karaoke",
    highlight_color="yellow",
    base_color="white",
    background="black@0.6",
    words_per_group=4,
    position="bottom",
)

Video Clipping

Extract segments from longer videos:

clip_video(
    video_file_id="long_video_id",
    start_time=45.0,
    end_time=60.0,
)

UGC / TikTok Production Workflow

Complete workflow for producing UGC-style content:

Script: plan scenes, dialogue, and visual style
Generate: create each scene with generate_video
Review: use analyze_video to check each scene for quality
Chain: extract last frames with capture_video_frame, generate next scenes
Stitch: combine all scenes with stitch_videos
Narrate: generate voiceover with text_to_speech + add_audio_to_video
Caption: add TikTok-style captions with burn_highlighted_captions
Overlay: add titles / CTAs with overlay_text
Final review: use analyze_video on the final video for quality check

Example: Narrated UGC Video

generate_video(prompt="...", model="veo-3.1-generate-preview")

audio = text_to_speech(text="Your narration script here...", voice="nova")

add_audio_to_video(video_file_id="generated_video_id", audio_file_id=audio.file_id)

burn_highlighted_captions(video_file_id="narrated_video_id", style="tiktok")

Example: Podcast to Short-Form Clips

transcript = transcribe_video(file_id="podcast_video_id")

analysis = analyze_video(
    file_id="podcast_video_id",
    question="Identify the 3 most memorable / quotable moments with timestamps",
    analysis_type="scene_breakdown",
)

clip_video(video_file_id="podcast_video_id", start_time=120.0, end_time=150.0)
clip_video(video_file_id="podcast_video_id", start_time=340.0, end_time=365.0)

stitch_videos(video_file_ids=["clip1_id", "clip2_id"])

burn_highlighted_captions(video_file_id="stitched_id", style="tiktok")

Prompt Structure

Each scene prompt should include:

"Continuation of previous scene" (for scenes 2+)
Consistent character / setting descriptions
Specific action for this segment
Camera movement direction

Control Principles (most important)

Treat API params as the container and prompt text as the content:
- model, size / aspect_ratio, and duration_seconds must be set explicitly in the tool call.
- Do not expect prose like "make it longer" or "make it vertical" to override API parameters.
Use detail for control, brevity for exploration:
- Short prompts give more creative variation.
- Detailed prompts improve consistency and shot control.
Iterate in small steps:
- Change one major variable at a time (camera, lighting, action, or palette).
- Keep what works fixed and only modify the target dimension.

Key Parameters

Parameter	Description
`image_file_id`	Use for image-to-video (scene continuity)
`capture_video_frame`	Extract frames with `position="last"` \| `"first"` \| `"middle"`
`size`	For Sora only. One of: `"720x1280"`, `"1280x720"`, `"1024x1792"`, `"1792x1024"`
`aspect_ratio`	For Veo only. One of: `"16:9"` or `"9:16"`
`duration_seconds`	Sora: 4, 8, or 12 seconds only. Veo: 4-8 seconds

Important Input Rules

Use exact values accepted by the tool schema. Do not send aliases like landscape, portrait, 720p, or 1080p.
For Sora, prefer size and do not send aspect_ratio.
For Veo, prefer aspect_ratio and do not send size.
For Seedance, use aspect_ratio and optionally resolution. Do not pass size.
Keep the same size / aspect_ratio across chained scenes for continuity.

Model Selection Guide

When the user mentions a specific model name, always use that model. Map user requests to the correct model parameter:

User says	`model` parameter
"use seedance", "seedance video"	`"seedance-2"`
"fast seedance"	`"seedance-2-fast"`
"use sora", "sora video"	`"sora-2"`
"sora pro"	`"sora-2-pro"`
"use veo", "veo video"	`"veo-3.1-generate-preview"`
"fast veo"	`"veo-3.1-fast-generate-preview"`

Model-Specific Parameter Matrix

Sora models (sora-2, sora-2-pro)
- Allowed sizing parameter: size
- Allowed size values: "720x1280", "1280x720", "1024x1792", "1792x1024"
- Do not pass aspect_ratio
- Practical default pair: "1280x720" or "720x1280"
Veo models (veo-3.1-generate-preview, veo-3.1-fast-generate-preview)
- Allowed sizing parameter: aspect_ratio
- Allowed aspect_ratio values: "16:9", "9:16"
- Do not pass size
Seedance models (seedance-2, seedance-2-fast)
- Allowed sizing parameters: aspect_ratio and resolution
- Allowed aspect_ratio values: "16:9", "9:16", "1:1", "4:3", "3:4"
- Allowed resolution values: "480p", "720p"
- Supports generate_audio=true for native audio with lip-sync
- Do not pass size

Duration Limits

Sora: 4, 8, or 12 seconds per generation. Use scene chaining + stitch_videos for longer videos.
Veo: 4, 5, 6, 7, or 8 seconds per generation.
Seedance: 4-15 seconds per generation (most flexible). Supports generate_audio=true for native audio with lip-sync.

High-Control Prompt Template

Use this structure when you want predictable output:

Style/Tone: [realistic, cinematic, animation, documentary, etc.]
Subject/World: [who/what is in frame, key visual anchors]
Camera: [shot size + angle + movement]
Lighting/Palette: [light direction + 3-5 color anchors]
Action Beats:
- [beat 1 with timing/count]
- [beat 2 with timing/count]
- [beat 3 with timing/count]
Audio/Dialogue: [short lines or ambient cues]
Constraints: [no logos/brands, no text overlays, etc.]

Best Practices

Review before delivering: always use analyze_video to check your output.
Maintain visual consistency: use the same character descriptions, lighting, and style across all scenes.
Plan transitions: design the end of each scene to flow into the next.
Batch similar scenes: generate scenes with similar settings together.
Review before chaining: check each scene before using its last frame for the next.
Use single-variable iteration: remix / regenerate by changing one variable at a time.
Add captions for accessibility: use the subtitle pipeline for all UGC content.

Capabilities

skillsource-hyperfx-aiskill-video-generationtopic-agent-skillstopic-ai-agenttopic-claudetopic-claude-codetopic-claude-skillstopic-codextopic-cursortopic-google-adstopic-hermestopic-marketingtopic-mcptopic-meta-ads

Install

Installnpx skills add hyperfx-ai/marketing-skills

Sourcehttps://github.com/hyperfx-ai/marketing-skills/tree/main/skills/video-generation

skills.shhttps://skills.sh/hyperfx-ai/marketing-skills/video-generation

Transportskills-sh

Protocolskill

Quality

0.46/ 1.00

deterministic score 0.46 from registry signals: · indexed on github topic:agent-skills · 24 github stars · SKILL.md body (13,193 chars)

Provenance

Indexed fromgithub

Enriched2026-05-18 19:05:17Z · deterministic:skill-github:v1 · v1

First seen2026-05-07

Last seen2026-05-18

Agent access

JSONhttps://clawmart.sh/api/listings/Rtk3Hp