Create a new LLM vision provider.
Provider configuration specifying which LLM to use.
Throws if config.provider is not specified.
const provider = new LLMVisionProvider({
  provider: 'anthropic',
  model: 'claude-sonnet-4-20250514',
});
Generate a text description of the provided image using a cloud vision LLM.
The image is sent as a base64 data URL in a multimodal message to the configured provider. The LLM's response is returned as-is.
Image as a URL string (https://...) or base64 data URL (data:image/png;base64,...).
Detailed text description of the image content.
Throws if the LLM call fails.
Throws if the LLM returns an empty response.
const description = await provider.describeImage(
  'data:image/png;base64,iVBORw0KGgoAAAA...'
);
console.log(description);
// "A golden retriever playing fetch on a sandy beach..."
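Because describeImage accepts either an https URL or a base64 data URL, it can be useful to check which shape a string has before passing it on. The following is a minimal sketch of such a check. The classifyImageInput helper and the ImageInput type are hypothetical names introduced for illustration and are not part of the documented API.

```typescript
// Hypothetical helper, not part of the documented API: classifies an image
// argument as a remote URL or a base64 data URL.
type ImageInput =
  | { kind: 'url'; href: string }
  | { kind: 'dataUrl'; mimeType: string; base64: string };

function classifyImageInput(image: string): ImageInput {
  // Expected data URL shape: data:<mime type>;base64,<payload>
  const match =
    /^data:([a-z]+\/[a-z0-9.+-]+);base64,([A-Za-z0-9+/]+={0,2})$/i.exec(image);
  if (match) {
    return {
      kind: 'dataUrl',
      mimeType: match[1] ?? '',
      base64: match[2] ?? '',
    };
  }
  // Anything else must be an http(s) URL to be accepted by this sketch.
  if (/^https?:\/\//i.test(image)) {
    return { kind: 'url', href: image };
  }
  throw new Error('Expected an http(s) URL or a base64 data URL');
}
```

A caller could run this check first and pass only recognized strings to describeImage, which surfaces malformed input before any network request is made.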
Vision provider that delegates to a cloud LLM via generateText(). Satisfies the narrow IVisionProvider contract used by the MultimodalIndexer, allowing any vision-capable LLM to serve as the image description backend.
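To illustrate why a narrow contract makes the backend swappable, here is a minimal sketch. The interface shape is an assumption inferred from the describeImage documentation (one method taking a string and resolving to a string); the real IVisionProvider may declare more. StubVisionProvider and indexImage are hypothetical names used only for this example.

```typescript
// Assumed shape of the contract; the real IVisionProvider may differ.
interface IVisionProvider {
  describeImage(image: string): Promise<string>;
}

// Hypothetical stand-in that returns a canned description instead of
// calling a cloud LLM. Useful in tests of code that consumes the contract.
class StubVisionProvider implements IVisionProvider {
  private readonly reply: string;

  constructor(reply: string) {
    this.reply = reply;
  }

  async describeImage(image: string): Promise<string> {
    if (!image) {
      throw new Error('image is required');
    }
    // Mirrors the documented behaviour of rejecting an empty response.
    if (!this.reply) {
      throw new Error('Provider returned an empty response');
    }
    return this.reply;
  }
}

// Hypothetical consumer: depends only on the contract, so any
// implementation (stub or LLM-backed) can be passed in.
async function indexImage(
  provider: IVisionProvider,
  image: string,
): Promise<string> {
  return provider.describeImage(image);
}
```

Code written against the interface rather than the concrete class can accept an LLM-backed provider in production and a stub like this one in tests.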
Example