Interface SceneDescription

A single scene detected within a video, with timestamps, description, and optional transcript.

Scenes are contiguous segments of video bounded by visual discontinuities (hard cuts, dissolves, fades). The SceneDetector identifies boundaries, and a vision LLM describes the content of each scene.

This is a richer version of the base VideoScene type: it also carries cut-type classification, a confidence score, an optional transcript, and optional key frame data.

interface SceneDescription {
    index: number;
    startSec: number;
    endSec: number;
    durationSec: number;
    cutType: "hard-cut" | "dissolve" | "fade" | "wipe" | "gradual" | "start";
    description: string;
    transcript?: string;
    keyFrame?: string;
    confidence: number;
}
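
As a rough illustration of the shape above, the following sketch builds two scene objects by hand. The interface is repeated locally so the snippet compiles on its own, and every value (timestamps, descriptions, confidence scores) is hypothetical sample data, not output from a real detector run.

```typescript
// Local copy of the documented interface so the sketch compiles on its own.
interface SceneDescription {
  index: number;
  startSec: number;
  endSec: number;
  durationSec: number;
  cutType: "hard-cut" | "dissolve" | "fade" | "wipe" | "gradual" | "start";
  description: string;
  transcript?: string;
  keyFrame?: string;
  confidence: number;
}

// Hypothetical sample data for illustration only.
const exampleScenes: SceneDescription[] = [
  {
    index: 0,
    startSec: 0,
    endSec: 4,
    durationSec: 4,
    cutType: "start", // first scene: no preceding transition
    description: "Wide shot of a city skyline at dawn.",
    confidence: 1, // assumed value; the source does not say what "start" scenes report
  },
  {
    index: 1,
    startSec: 4,
    endSec: 10,
    durationSec: 6,
    cutType: "hard-cut",
    description: "Close-up of a presenter speaking to the camera.",
    transcript: "Welcome back to the show.", // present only if transcription is enabled
    confidence: 0.9,
  },
];
```

Both optional fields are simply omitted when absent, which is why the first scene has no transcript or keyFrame key at all.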

Index

Properties

index startSec endSec durationSec cutType description transcript? keyFrame? confidence

Properties

index

index: number

0-based scene index within the video.

startSec

startSec: number

Start time of the scene in seconds from video start.

endSec

endSec: number

End time of the scene in seconds from video start.

durationSec

durationSec: number

Duration of the scene in seconds (endSec - startSec).
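
The relationship durationSec = endSec - startSec can be checked on received data. The helper below is a minimal sketch: the name hasConsistentDuration and the default tolerance are assumptions made for illustration and are not part of the documented API. A tolerance is used because floating-point subtraction of timestamps is rarely exact.

```typescript
// Only the timing fields of SceneDescription are needed for this check.
type SceneTiming = { startSec: number; endSec: number; durationSec: number };

// Returns true when durationSec matches endSec - startSec within a tolerance
// and the scene does not end before it starts.
// The default tolerance (1 microsecond) is an assumption, not a documented value.
function hasConsistentDuration(scene: SceneTiming, toleranceSec = 1e-6): boolean {
  const expected = scene.endSec - scene.startSec;
  return expected >= 0 && Math.abs(scene.durationSec - expected) <= toleranceSec;
}
```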

cutType

cutType: "hard-cut" | "dissolve" | "fade" | "wipe" | "gradual" | "start"

Type of visual transition that marks the beginning of this scene.

'hard-cut' — Abrupt frame-to-frame change
'dissolve' — Cross-dissolve / superimposition transition
'fade' — Fade from/to black or white
'wipe' — Directional wipe transition
'gradual' — Other gradual transition not fitting the above
'start' — First scene in the video (no preceding transition)
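
Because cutType is a closed string-literal union, a lookup table typed as a Record makes the compiler flag any missing value. The sketch below reuses the descriptions listed above; the names CutType, CUT_TYPE_LABELS, and isGradualTransition are hypothetical, and grouping dissolve, fade, and wipe with "gradual" is an interpretation of the list, not something the interface states.

```typescript
// The union of values documented for SceneDescription.cutType.
type CutType = "hard-cut" | "dissolve" | "fade" | "wipe" | "gradual" | "start";

// Record<CutType, string> fails to compile if any cut type is left out.
const CUT_TYPE_LABELS: Record<CutType, string> = {
  "hard-cut": "Abrupt frame-to-frame change",
  dissolve: "Cross-dissolve / superimposition transition",
  fade: "Fade from/to black or white",
  wipe: "Directional wipe transition",
  gradual: "Other gradual transition",
  start: "First scene in the video",
};

// Assumption: every transition other than a hard cut or the video start
// is treated as gradual.
function isGradualTransition(cutType: CutType): boolean {
  return cutType !== "hard-cut" && cutType !== "start";
}
```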

description

description: string

Natural-language description of the scene content, generated by a vision LLM from the key frame.

`Optional` transcript

transcript?: string

Transcript of speech/narration during this scene's time range. Only populated when audio transcription is enabled.
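
Since transcript is absent whenever audio transcription is disabled, consumers need to handle the undefined case. A minimal sketch, assuming a hypothetical formatSceneText helper that is not part of the documented API:

```typescript
// Only the fields used for text output are required here.
type SceneText = {
  startSec: number;
  endSec: number;
  description: string;
  transcript?: string;
};

// Renders a one- or two-line summary; the transcript line is added only
// when a non-empty transcript is present.
function formatSceneText(scene: SceneText): string {
  const header = `[${scene.startSec.toFixed(1)}s - ${scene.endSec.toFixed(1)}s] ${scene.description}`;
  const transcript = scene.transcript?.trim();
  return transcript ? `${header}\nTranscript: ${transcript}` : header;
}
```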

`Optional` keyFrame

keyFrame?: string

Base64-encoded key frame image (JPEG) representative of the scene. Typically the frame closest to the scene midpoint.
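
One common way to display a Base64-encoded JPEG is to wrap it in a data URI. The sketch below assumes keyFrame holds raw Base64 without a "data:" prefix, which the source does not confirm, so it passes the value through unchanged if a prefix is already present. The helper name keyFrameToDataUri is hypothetical.

```typescript
// Converts an optional Base64 key frame into a data URI usable as an <img> src.
// Returns undefined when the scene has no key frame.
function keyFrameToDataUri(scene: { keyFrame?: string }): string | undefined {
  if (!scene.keyFrame) {
    return undefined;
  }
  // Assumption: the value is raw Base64; tolerate an existing data URI just in case.
  return scene.keyFrame.startsWith("data:")
    ? scene.keyFrame
    : `data:image/jpeg;base64,${scene.keyFrame}`;
}
```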

confidence

confidence: number

Confidence score (0-1) for the scene boundary detection. Higher values indicate a more definitive visual discontinuity.
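
A typical use of the score is to drop weakly detected boundaries. The following is a minimal sketch; filterByConfidence is a hypothetical helper, and any particular threshold is a choice for the caller rather than a documented default.

```typescript
// Keeps only scenes whose boundary confidence is at least minConfidence.
// Generic over T so the full SceneDescription objects are returned unchanged.
function filterByConfidence<T extends { confidence: number }>(
  scenes: readonly T[],
  minConfidence: number,
): T[] {
  if (minConfidence < 0 || minConfidence > 1) {
    throw new RangeError("minConfidence must be between 0 and 1");
  }
  return scenes.filter((scene) => scene.confidence >= minConfidence);
}
```

Note that filtering removes scenes without merging their time ranges into neighbors, so the remaining scenes may no longer be contiguous.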
