Class PipelineVisionProvider

Adapts the full VisionPipeline to the narrow IVisionProvider interface used by the multimodal indexer.

The pipeline's process() method runs all configured tiers and returns a rich VisionResult. This adapter extracts just the text field that the indexer needs for embedding generation.

Callers that need the full pipeline result (embeddings, layout, confidence, regions) should use processWithFullResult() instead.
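The adapter relationship described above can be sketched roughly as follows. The type shapes here (`VisionResult`, `IVisionProvider`, `VisionPipelineLike`) are simplified assumptions made for illustration, not the library's actual declarations, and `SketchVisionProvider` is a hypothetical stand-in for the real class. `Uint8Array` is used in place of `Buffer` only to keep the sketch free of Node-specific types.

```typescript
// Hypothetical, simplified shapes; the real types carry more fields.
interface VisionResult {
  text: string;
  embedding?: number[];
  confidence?: number;
}

interface IVisionProvider {
  describeImage(image: string): Promise<string>;
}

// Assumed minimal view of the pipeline: a single process() entry point.
interface VisionPipelineLike {
  process(image: string | Uint8Array): Promise<VisionResult>;
}

class SketchVisionProvider implements IVisionProvider {
  private readonly pipeline: VisionPipelineLike;

  constructor(pipeline: VisionPipelineLike) {
    this.pipeline = pipeline;
  }

  // Narrow view: return only the text the indexer needs.
  async describeImage(image: string): Promise<string> {
    const result = await this.pipeline.process(image);
    return result.text;
  }

  // Wide view: hand back the pipeline result untouched.
  async processWithFullResult(image: string | Uint8Array): Promise<VisionResult> {
    return this.pipeline.process(image);
  }
}

// Stub pipeline standing in for a real one, so the sketch runs on its own.
const stubPipeline: VisionPipelineLike = {
  async process() {
    return { text: "a scanned invoice", embedding: [0.1, 0.2], confidence: 0.9 };
  },
};

const sketchProvider = new SketchVisionProvider(stubPipeline);
```

The point of the narrow method is that the indexer can depend on `IVisionProvider` alone, while code that wants embeddings or layout calls the wider method on the same object.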

Example

const provider = new PipelineVisionProvider(pipeline);

// Simple: just the description text
const text = await provider.describeImage(imageUrl);

// Advanced: full pipeline result
const result = await provider.processWithFullResult(imageBuffer);
console.log(result.embedding); // CLIP vector
console.log(result.layout); // Florence-2 layout

Implements

  • IVisionProvider

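Constructors

  • new PipelineVisionProvider(pipeline)

    Wraps the VisionPipeline instance that the adapter delegates to.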

Methods

  • describeImage(image): Promise<string>

    Generates a text description of the provided image by running it through the full vision pipeline.

    This satisfies the IVisionProvider contract. The image passes through all configured tiers (OCR, handwriting, document-ai, cloud) and the best extracted text is returned.

    Parameters

    • image: string

      Image as a URL string (https://... or data:image/...).

    Returns Promise<string>

    Text description or extracted content from the image.

    Throws

    If all pipeline tiers fail to produce output.

    Throws

    If the pipeline has been disposed.

    Example

    const description = await provider.describeImage(imageUrl);
    console.log(description);

  • processWithFullResult(image): Promise<VisionResult>

    Processes an image through the full pipeline and returns the complete VisionResult, including embeddings, layout, confidence scores, and per-tier breakdowns.

    Use this when you need more than just the text description (e.g. to store the CLIP embedding alongside the text embedding in the vector store).

    Parameters

    • image: string | Buffer<ArrayBufferLike>

      Image data as a Buffer or URL string.

    Returns Promise<VisionResult>

    Full vision pipeline result.

    Throws

    If all pipeline tiers fail.

    Throws

    If the pipeline has been disposed.

    Example

    const result = await provider.processWithFullResult(imageBuffer);

    // Use both text embedding (via indexer) and image embedding (via CLIP)
    if (result.embedding) {
      await imageVectorStore.upsert('images', [{
        id: docId,
        embedding: result.embedding,
        metadata: { text: result.text },
      }]);
    }
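
Both methods reject when every tier fails or when the pipeline has been disposed, so bulk indexing code may want a guard around each call. The following is a minimal sketch that assumes only the `describeImage` signature documented above. The helper `describeOrFallback`, the `TextDescriber` type, and the rejecting stub are hypothetical names. The error class thrown by the real pipeline is not specified here, so the catch is deliberately generic.

```typescript
// Minimal structural type matching the documented describeImage signature.
type TextDescriber = { describeImage(image: string): Promise<string> };

// Hypothetical helper: return a fallback string instead of propagating a failure.
async function describeOrFallback(
  provider: TextDescriber,
  imageUrl: string,
  fallback = "",
  onError?: (err: unknown) => void,
): Promise<string> {
  try {
    return await provider.describeImage(imageUrl);
  } catch (err) {
    // The concrete error type is not documented, so it is passed through as unknown.
    if (onError) onError(err);
    return fallback;
  }
}

// Stub that always rejects, standing in for a disposed pipeline.
const disposedStub: TextDescriber = {
  async describeImage() {
    throw new Error("pipeline disposed");
  },
};
```

With this guard, one failing image yields the fallback text and the batch continues; whether that is preferable to failing fast depends on how the indexer treats empty descriptions.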