Create a new adapter wrapping a voice-pipeline STT provider.
A configured SpeechToTextProvider instance
(e.g. Whisper, Deepgram, AssemblyAI, Azure Speech).
MIME type to assume for raw audio buffers.
Defaults to 'audio/wav', which all major STT providers
accept. Override to 'audio/mpeg' or 'audio/ogg' when
indexing MP3 or OGG files.
If provider is null or undefined.
const adapter = new SpeechProviderAdapter(whisperProvider);
const mp3Adapter = new SpeechProviderAdapter(whisperProvider, 'audio/mpeg');
Transcribe audio data to text.
Wraps the raw buffer in a SpeechAudioInput and delegates to the
underlying voice-pipeline provider. The rich transcription result
is reduced to the plain text string that the multimodal indexer
needs for embedding generation.
Raw audio data as a Buffer (WAV, MP3, OGG, etc.).
Optional BCP-47 language code hint for improved
transcription accuracy (e.g. 'en', 'es', 'ja').
The transcribed text content.
If the underlying STT provider fails.
const transcript = await adapter.transcribe(wavBuffer);
const spanishTranscript = await adapter.transcribe(audioBuffer, 'es');
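The delegation described above can be sketched as a standalone function. The field names on the voice-pipeline types (data, mimeType, text) are assumptions for illustration, not confirmed by the source:

```typescript
// Hypothetical voice-pipeline shapes; field names are assumed for illustration.
interface SpeechAudioInput { data: Buffer; mimeType: string; }
interface SpeechTranscriptionOptions { language?: string; }
interface SpeechTranscriptionResult { text: string; }
interface SpeechToTextProvider {
  transcribe(
    input: SpeechAudioInput,
    options?: SpeechTranscriptionOptions,
  ): Promise<SpeechTranscriptionResult>;
}

// Wrap the raw buffer, forward the optional language hint, and reduce
// the rich transcription result to the plain text the indexer needs.
async function transcribeBuffer(
  provider: SpeechToTextProvider,
  audio: Buffer,
  mimeType: string,
  language?: string,
): Promise<string> {
  const input: SpeechAudioInput = { data: audio, mimeType };
  const options = language ? { language } : undefined;
  const result = await provider.transcribe(input, options);
  return result.text;
}
```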
Get the display name of the underlying STT provider.
Useful for logging and diagnostics — lets callers identify which voice-pipeline provider is actually handling transcription.
The provider's display name or ID string.
console.log(`STT via: ${adapter.getProviderName()}`); // "openai-whisper"
Bridges the voice-pipeline's SpeechToTextProvider to the multimodal
indexer's ISpeechToTextProvider interface. Converts raw Buffer audio
into the SpeechAudioInput shape expected by voice providers, forwards
the language hint through SpeechTranscriptionOptions, and extracts the
plain transcript text from the rich SpeechTranscriptionResult.
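A minimal sketch of the whole adapter under stated assumptions: the voice-pipeline field names (data, mimeType, text, name) are hypothetical, chosen only to make the bridging behavior concrete:

```typescript
// Hypothetical voice-pipeline shapes; field names are assumed for illustration.
interface SpeechAudioInput { data: Buffer; mimeType: string; }
interface SpeechTranscriptionOptions { language?: string; }
interface SpeechTranscriptionResult { text: string; }
interface SpeechToTextProvider {
  readonly name: string;
  transcribe(
    input: SpeechAudioInput,
    options?: SpeechTranscriptionOptions,
  ): Promise<SpeechTranscriptionResult>;
}

class SpeechProviderAdapter {
  constructor(
    private readonly provider: SpeechToTextProvider,
    private readonly mimeType: string = "audio/wav",
  ) {
    // Guard matching the documented error: provider must be present.
    if (!provider) {
      throw new Error("SpeechProviderAdapter requires a provider");
    }
  }

  // Wrap the buffer, forward the language hint, return plain text.
  async transcribe(audio: Buffer, language?: string): Promise<string> {
    const result = await this.provider.transcribe(
      { data: audio, mimeType: this.mimeType },
      language ? { language } : undefined,
    );
    return result.text;
  }

  getProviderName(): string {
    return this.provider.name;
  }
}
```

With a stubbed Whisper-like provider, new SpeechProviderAdapter(provider) indexes WAV buffers, while new SpeechProviderAdapter(provider, 'audio/mpeg') handles MP3 input, mirroring the constructor examples above.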