# @happyvertical/smrt-video
AI video production pipeline with characters, performers, scenes, shots, sequences, compositions, and ComfyUI workflow integration.
v0.20.44 · Video Pipeline · ComfyUI · Frame-Based
## Overview
smrt-video models the full video production pipeline for AI-powered video generation. Characters define virtual personas with voice and branding, Performers provide physical likeness via IP-Adapter FaceID, and the Composition-Sequence-Shot hierarchy organizes generated content. ComfyUI workflows enable dynamic parameter injection for rendering.
## Installation

```bash
npm install @happyvertical/smrt-video
```

## Quick Start

```typescript
import {
Character, Performer, Scene,
VideoShot, VideoSequence, VideoComposition,
VideoShotCharacter, VideoWorkflow,
} from '@happyvertical/smrt-video';
// Character = virtual persona (outfit, voice, branding)
const anchor = new Character({
name: 'Bentley News Anchor',
imageAssetId: 'seed-img-001',
voiceProfileId: 'voice-123',
brandingKit: {
logoAssetId: 'logo-asset',
primaryColor: '#1a73e8',
lowerThirdTemplate: 'news-standard',
tickerEnabled: true,
},
});
await anchor.save();
// Performer = physical likeness for IP-Adapter face consistency
const performer = new Performer({
name: 'Alex',
ipAdapterWeight: 0.85,
});
// Scene = virtual background
const studio = new Scene({
name: 'News Studio',
sourceType: 'image',
projection: 'flat',
});
// Hierarchy: Composition -> Sequence -> Shot
const composition = new VideoComposition({
title: 'Evening News - March 2, 2026',
fps: 30,
width: 1920,
height: 1080,
});
await composition.save();
const shot = new VideoShot({
scriptText: 'Welcome to the evening news broadcast.',
targetDuration: 30,
});
await shot.save();
// Estimated speech: scriptWordCount / 2.7 words per second
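// Back-of-envelope check of that estimate for the shot above
// (hypothetical arithmetic shown inline, not a package API):
const wordCount = 'Welcome to the evening news broadcast.'.trim().split(/\s+/).length; // 6
const estimatedSeconds = wordCount / 2.7;                  // ≈ 2.22 s of speech
const estimatedFrames = Math.round(estimatedSeconds * 30); // 67 frames at 30 fps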
// ComfyUI workflow with parameter injection
const workflow = new VideoWorkflow({
name: 'Wan 2.6 + EchoMimic',
workflowType: 'broadcast',
workflowJson: comfyuiApiJson,
nodeMapping: { seedImage: '1', audioFile: '5', outputVideo: '12' },
requiredModels: ['wan_2.6_t2v_14b_fp8', 'echomimic_v2'],
});
await workflow.save();
// Inject runtime parameters into a deep-cloned workflow
const injected = workflow.injectParameters({
seedImage: '/path/to/anchor.png',
audioFile: '/path/to/tts.wav',
});
```

## Core Models

### Character

```typescript
class Character extends SmrtObject {
name: string
imageAssetId?: string // Seed image FK
voiceProfileId?: string // FK to smrt-voice
brandingKit?: BrandingConfig // Logo, colors, fonts, lower-thirds
status: 'pending' | 'ready'
}
```

### VideoShot (extends Content)

```typescript
class VideoShot extends Content {
scriptText?: string
scriptWordCount: number
durationInFrames: number
videoMetadata?: VideoMetadata // Includes wordTimings for lip-sync
status: 'draft' | 'queued' | 'processing' | 'ready' | 'failed' | 'published'
get estimatedDuration(): number // scriptWordCount / 2.7 (words/sec)
}
```

### VideoComposition (extends Content)

```typescript
class VideoComposition extends Content {
fps: number
width: number
height: number
durationInFrames: number
renderStatus: 'draft' | 'rendering' | 'ready' | 'failed'
renderProgress: number
}
```

### VideoWorkflow (ComfyUI)

```typescript
class VideoWorkflow extends SmrtObject {
name: string
workflowType: 'prebake' | 'broadcast' | 'lipsync' | 'postprod' | 'custom'
workflowJson: string // Full ComfyUI API JSON
nodeMapping: NodeMapping // Maps semantic names -> node IDs
requiredModels?: string[]
// Deep-clones workflow and overwrites node.inputs
injectParameters(params: Record<string, any>): object
}
```

## Best Practices

### DOs
- Store durations as frames; compute seconds with `frames / fps`
- Use `nodeMapping` to map semantic names to ComfyUI node IDs
- Use `injectParameters()` for safe workflow parameter injection (it deep-clones)
- Estimate speech duration at 2.7 words/second (with 15% tolerance)
- Link Characters to VoiceProfiles from smrt-voice for TTS integration
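The `nodeMapping` and `injectParameters()` points above can be sketched as follows. This is a minimal illustration, not the package's actual implementation: it assumes ComfyUI API-format JSON where each top-level key is a node ID with an `inputs` object, and it assumes the semantic name doubles as the input field name (the real mapping may differ).

```typescript
interface NodeMapping {
  [semanticName: string]: string; // semantic name -> ComfyUI node ID
}

// Sketch of deep-clone parameter injection for a ComfyUI API-format
// workflow. Parsing a fresh copy from the stored JSON string guarantees
// the original workflow is never mutated.
function injectParameters(
  workflowJson: string,
  nodeMapping: NodeMapping,
  params: Record<string, unknown>,
): object {
  const workflow = JSON.parse(workflowJson);
  for (const [name, value] of Object.entries(params)) {
    const nodeId = nodeMapping[name];
    if (!nodeId || !workflow[nodeId]) {
      throw new Error(`No node mapped for parameter "${name}"`);
    }
    // Assumption: the semantic name is also the input field name.
    workflow[nodeId].inputs[name] = value;
  }
  return workflow;
}
```

With `nodeMapping: { seedImage: '1' }`, calling `injectParameters(json, mapping, { seedImage: '/path/to/anchor.png' })` overwrites node 1's `seedImage` input in the clone while leaving the stored workflow untouched.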
### DON'Ts

- Don't store durations as seconds (use `durationInFrames` everywhere)
- Don't assume `wordTimings` is auto-generated (it requires an external TTS provider)
- Don't mutate workflow JSON directly (use `injectParameters()` for safe cloning)
- Don't forget `trimBeforeFrames`/`trimAfterFrames` in effective frame calculations
- Don't upload face embeddings through the framework (weight is metadata-only)
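The frame-based rules above can be sketched with small helpers. `framesToSeconds`, `estimateSpeechFrames`, and `effectiveFrames` are hypothetical names for illustration, not part of the package API:

```typescript
// Hypothetical helpers illustrating the frame-based conventions above.
const WORDS_PER_SECOND = 2.7;
const TOLERANCE = 0.15; // 15% tolerance on speech estimates

function framesToSeconds(frames: number, fps: number): number {
  return frames / fps;
}

function secondsToFrames(seconds: number, fps: number): number {
  return Math.round(seconds * fps);
}

// Estimated speech duration in frames, with a +/-15% tolerance band.
function estimateSpeechFrames(wordCount: number, fps: number) {
  const seconds = wordCount / WORDS_PER_SECOND;
  return {
    frames: secondsToFrames(seconds, fps),
    minFrames: secondsToFrames(seconds * (1 - TOLERANCE), fps),
    maxFrames: secondsToFrames(seconds * (1 + TOLERANCE), fps),
  };
}

// Effective length of a shot once leading/trailing trims are applied.
function effectiveFrames(
  durationInFrames: number,
  trimBeforeFrames = 0,
  trimAfterFrames = 0,
): number {
  return durationInFrames - trimBeforeFrames - trimAfterFrames;
}

// An 81-word script at 30 fps: 81 / 2.7 = 30 s -> 900 frames.
estimateSpeechFrames(81, 30); // { frames: 900, minFrames: 765, maxFrames: 1035 }
effectiveFrames(900, 15, 15); // 870
```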