Unique document ID in the vector store.
The text content that was embedded and matched. For images: the vision LLM description. For audio: the STT transcript. For text: the original text chunk.
Cosine similarity score between the query and this result. Higher is more relevant (typically 0.0 to 1.0).
The content modality of this result. Indicates whether the match came from text, image description, or audio transcript.
Optional metadata attached during indexing. May include source URLs, file names, timestamps, etc.
A single result from a multimodal search query.
Extends the base vector store result with modality-specific fields so the caller knows what kind of content matched and can render it appropriately.
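To illustrate how a caller might use the modality field to decide how to render a match, here is a minimal sketch. The class name `MultimodalSearchResult`, its field names (`id`, `content`, `score`, `modality`, `metadata`), the modality values, and the `render` helper are all assumptions made for illustration; they are not taken from the library's actual API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Hypothetical modality values, assumed from the field descriptions above.
MODALITY_LABELS = {
    "text": "Text",
    "image": "Image description",
    "audio": "Audio transcript",
}


@dataclass
class MultimodalSearchResult:
    """Illustrative shape only; names are assumptions, not the real type."""
    id: str            # unique document ID in the vector store
    content: str       # text that was embedded and matched
    score: float       # cosine similarity between the query and this result
    modality: str      # "text", "image", or "audio" (assumed values)
    metadata: Dict[str, Any] = field(default_factory=dict)  # optional


def render(result: MultimodalSearchResult) -> str:
    """Format a result as one line, labelled by where the match came from."""
    label = MODALITY_LABELS.get(result.modality, "Unknown")
    return f"[{label}] ({result.score:.2f}) {result.content}"


example = MultimodalSearchResult(
    id="doc-1",
    content="A cat on a sofa",
    score=0.8731,
    modality="image",
)
print(render(example))
```

Because `content` always holds text (the original chunk, the vision LLM description, or the STT transcript), a caller can display every result the same way and use the modality only to choose a label or icon.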