Interface ExtractedImage

An image extracted from a document during ingestion.

interface ExtractedImage {
    data: Buffer<ArrayBufferLike>;
    mimeType: string;
    caption?: string;
    pageNumber?: number;
    embedding?: number[];
}

Properties

data: Buffer<ArrayBufferLike>

Raw image bytes (PNG, JPEG, WebP, etc.).
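
Because data holds the raw bytes and mimeType names their format, the two fields together are enough to build a data URL, for example to inline an extracted image in HTML. The following is a minimal sketch, not part of the library: toDataUrl and ImageBytes are hypothetical names, and the sample bytes are only the first four bytes of the PNG signature rather than a complete image.

```typescript
// Structural stand-in for the two ExtractedImage fields used here.
// Buffer extends Uint8Array, so an ExtractedImage satisfies this shape.
interface ImageBytes {
  data: Uint8Array;
  mimeType: string;
}

// Hypothetical helper (an assumption, not a library function):
// base64-encodes the bytes and prefixes them with the MIME type.
function toDataUrl(image: ImageBytes): string {
  const base64 = Buffer.from(image.data).toString("base64");
  return `data:${image.mimeType};base64,${base64}`;
}

// Stand-in image data: the first four bytes of the PNG signature.
const sample: ImageBytes = {
  data: Buffer.from([0x89, 0x50, 0x4e, 0x47]),
  mimeType: "image/png",
};

const dataUrl = toDataUrl(sample);
console.log(dataUrl);
```

Typing the parameter as Uint8Array rather than Buffer keeps the helper usable with bytes that did not originate from Node's Buffer.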

mimeType: string

MIME type of data.

Example

'image/png' | 'image/jpeg'

caption?: string

Auto-generated or OCR-derived caption. Present when a vision LLM is configured and extractImages is set to true.

pageNumber?: number

1-based number of the page the image appears on (PDF and DOCX sources).
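
Since pageNumber is optional, code that organizes images by page needs a fallback for images that lack it. The sketch below shows one way to do that. groupByPage and PagedImage are hypothetical names, and using key 0 for images without a page is an assumption made here (0 can never collide with a 1-based page number).

```typescript
// Structural stand-in for the ExtractedImage fields used here.
interface PagedImage {
  mimeType: string;
  pageNumber?: number;
}

// Hypothetical helper (an assumption, not a library function):
// groups images by 1-based page number. Images without a pageNumber
// are collected under key 0, which is never a valid page.
function groupByPage<T extends PagedImage>(images: T[]): Map<number, T[]> {
  const byPage = new Map<number, T[]>();
  for (const image of images) {
    const key = image.pageNumber ?? 0;
    const bucket = byPage.get(key);
    if (bucket) {
      bucket.push(image);
    } else {
      byPage.set(key, [image]);
    }
  }
  return byPage;
}

const samples: PagedImage[] = [
  { mimeType: "image/png", pageNumber: 1 },
  { mimeType: "image/jpeg", pageNumber: 3 },
  { mimeType: "image/png", pageNumber: 1 },
  { mimeType: "image/webp" }, // no page information available
];

const grouped = groupByPage(samples);
```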

embedding?: number[]

Dense embedding of the image caption or visual content. Only present when embeddings were computed during extraction.
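
Because embedding is present only when it was computed during extraction, consumers should filter out images without one before comparing vectors. The following sketch ranks images against a query vector by cosine similarity. rankByQuery, cosineSimilarity, and EmbeddedImage are hypothetical names, and the sketch assumes the query vector comes from the same embedding model, with the same dimensionality, as the stored embeddings.

```typescript
// Structural stand-in for the ExtractedImage fields used here.
interface EmbeddedImage {
  caption?: string;
  embedding?: number[];
}

// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper (an assumption, not a library function):
// drops images without an embedding, then sorts the rest by
// descending similarity to the query vector.
function rankByQuery<T extends EmbeddedImage>(query: number[], images: T[]): T[] {
  return images
    .filter((img): img is T & { embedding: number[] } => img.embedding !== undefined)
    .map((img) => ({ img, score: cosineSimilarity(query, img.embedding) }))
    .sort((x, y) => y.score - x.score)
    .map((entry) => entry.img);
}

const ranked = rankByQuery([1, 0], [
  { caption: "a", embedding: [0, 1] },
  { caption: "b", embedding: [1, 0] },
  { caption: "c" }, // no embedding computed, so it is skipped
]);
```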