Interface IVisionProvider

Minimal interface for a vision LLM that can describe images.

This is kept intentionally narrow to avoid coupling the multimodal indexer to a specific LLM provider. Any service that can take an image and return a text description satisfies this contract.
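
Because the contract is a single method, a provider does not need to call a real LLM at all. The sketch below illustrates this with a canned-response stub, which can be useful in tests. `StubVisionProvider` and `describeAll` are hypothetical names introduced here for illustration and are not part of the library. The interface is redeclared locally only so the snippet compiles on its own.

```typescript
// Local copy of the contract so this sketch is self-contained.
interface IVisionProvider {
  describeImage(image: string): Promise<string>;
}

// Hypothetical stub: returns a fixed description instead of calling an LLM
// and records every image string it was asked to describe.
class StubVisionProvider implements IVisionProvider {
  readonly seen: string[] = [];
  private readonly description: string;

  constructor(description: string) {
    this.description = description;
  }

  async describeImage(image: string): Promise<string> {
    this.seen.push(image);
    return this.description;
  }
}

// Hypothetical consumer: depends only on the interface, so any provider
// (stub or real) can be passed in.
async function describeAll(
  provider: IVisionProvider,
  images: string[],
): Promise<string[]> {
  return Promise.all(images.map((image) => provider.describeImage(image)));
}
```

Since `describeAll` only sees `IVisionProvider`, swapping the stub for a provider backed by a hosted model should require no change on the consumer side.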

Example

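// Assumes the official `openai` package; the client reads OPENAI_API_KEY from the environment.
import OpenAI from 'openai';

const openai = new OpenAI();

const visionProvider: IVisionProvider = {
  describeImage: async (image) => {
    const response = await openai.chat.completions.create({
      model: 'gpt-4o',
      messages: [{ role: 'user', content: [
        { type: 'text', text: 'Describe this image in detail.' },
        // `image` is the URL or base64 data URL passed to describeImage.
        { type: 'image_url', image_url: { url: image } },
      ]}],
    });
    // The API may return null content; fall back to an empty description.
    return response.choices[0].message.content ?? '';
  },
};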

interface IVisionProvider {
  describeImage(image: string): Promise<string>;
}

Methods

describeImage

describeImage(image: string): Promise<string>
Generate a text description of the provided image.
Parameters
- image: string
  Image as a URL string or base64 data URL.
Returns Promise<string>
A detailed text description of the image content.
- Defined in src/rag/multimodal/types.ts:284
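
The `image` parameter accepts either a URL string or a base64 data URL. As a minimal sketch, assuming a Node.js runtime, raw image bytes can be turned into a data URL before being passed to `describeImage`. The helpers `toDataUrl` and `isImageInput` are hypothetical and not part of the library. The source does not say whether providers validate their input, so the check is illustrative only.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical helper: encode raw image bytes as a base64 data URL.
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  const base64 = Buffer.from(bytes).toString('base64');
  return `data:${mimeType};base64,${base64}`;
}

// Hypothetical guard: accepts http(s) URLs and base64 image data URLs,
// the two input forms described for the `image` parameter.
function isImageInput(image: string): boolean {
  return (
    /^https?:\/\//i.test(image) ||
    /^data:image\/[a-z0-9.+-]+;base64,/i.test(image)
  );
}
```

A caller holding a file's bytes could then write `await provider.describeImage(toDataUrl(bytes, 'image/png'))`, where `provider` is any `IVisionProvider`.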
