Free-form textual description / answer from the analyser.
Optional scenes: Detected scene segments with timestamps.
Optional objects: Detected objects / entities across the video.
Optional text: Detected on-screen or spoken text (OCR / ASR).
Optional duration: Overall duration of the analysed video in seconds.
Optional model: Model that produced the analysis.
Optional provider: Provider that produced the analysis.
Optional provider: Provider-specific metadata.
Structured result from video analysis.
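
As a rough illustration, the properties listed here could be modelled in TypeScript as follows. This is a minimal sketch, not the library's actual declaration: the type names `VideoAnalysisResult` and `SceneSegment`, the helper `summarize`, every field type, and the field names `description` and `providerMetadata` are assumptions, since the source gives only descriptions for those two entries.

```typescript
// Hypothetical scene segment; the source only says "with timestamps".
interface SceneSegment {
  start: number; // segment start in seconds (assumed unit)
  end: number;   // segment end in seconds (assumed unit)
  label?: string;
}

// Hypothetical shape of the structured result. All types are assumptions.
interface VideoAnalysisResult {
  description: string;                        // free-form answer (field name assumed)
  scenes?: SceneSegment[];                    // detected scene segments
  objects?: string[];                         // detected objects / entities
  text?: string;                              // on-screen or spoken text (OCR / ASR)
  duration?: number;                          // overall duration in seconds
  model?: string;                             // model that produced the analysis
  provider?: string;                          // provider that produced the analysis
  providerMetadata?: Record<string, unknown>; // provider-specific metadata (name assumed)
}

// Builds a one-line summary, skipping any optional field that is absent.
function summarize(result: VideoAnalysisResult): string {
  const parts: string[] = [result.description];
  if (result.duration !== undefined) parts.push(`${result.duration}s`);
  if (result.scenes !== undefined) parts.push(`${result.scenes.length} scene(s)`);
  if (result.provider !== undefined) parts.push(`via ${result.provider}`);
  return parts.join(" | ");
}

const example: VideoAnalysisResult = {
  description: "A cat walks across a kitchen.",
  scenes: [{ start: 0, end: 4.5 }],
  duration: 4.5,
};
const summary = summarize(example);
```

Because every property except the description is optional, consumers should check for `undefined` before reading a field, as `summarize` does, rather than assuming a given provider populates it.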