Create a new adapter wrapping a voice-pipeline STT provider.
A configured SpeechToTextProvider instance
(e.g. Whisper, Deepgram, AssemblyAI, Azure Speech).
MIME type to assume for raw audio buffers.
Defaults to 'audio/wav', which all major STT providers
accept. Override to 'audio/mpeg' or 'audio/ogg' when
indexing MP3 or OGG files.
If provider is null or undefined.
const adapter = new SpeechProviderAdapter(whisperProvider);
const mp3Adapter = new SpeechProviderAdapter(whisperProvider, 'audio/mpeg');
Transcribe audio data to text.
Wraps the raw buffer in a SpeechAudioInput and delegates to the
underlying voice-pipeline provider. The rich transcription result
is reduced to the plain text string that the multimodal indexer
needs for embedding generation.
Raw audio data as a Buffer (WAV, MP3, OGG, etc.).
Optional BCP-47 language code hint for improved
transcription accuracy (e.g. 'en', 'es', 'ja').
The transcribed text content.
If the underlying STT provider fails.
const transcript = await adapter.transcribe(wavBuffer);
const spanishTranscript = await adapter.transcribe(audioBuffer, 'es');
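The delegation described above can be sketched as a standalone function. The field names on the voice-pipeline types (data, mimeType, text) are assumptions for illustration, not confirmed by the source:

```typescript
// Hypothetical voice-pipeline shapes; field names are assumed for illustration.
interface SpeechAudioInput { data: Buffer; mimeType: string; }
interface SpeechTranscriptionOptions { language?: string; }
interface SpeechTranscriptionResult { text: string; }
interface SpeechToTextProvider {
  transcribe(
    input: SpeechAudioInput,
    options?: SpeechTranscriptionOptions,
  ): Promise<SpeechTranscriptionResult>;
}

// Wrap the raw buffer, forward the optional language hint, and reduce
// the rich transcription result to the plain text the indexer needs.
async function transcribeBuffer(
  provider: SpeechToTextProvider,
  audio: Buffer,
  mimeType: string,
  language?: string,
): Promise<string> {
  const input: SpeechAudioInput = { data: audio, mimeType };
  const options = language ? { language } : undefined;
  const result = await provider.transcribe(input, options);
  return result.text;
}
```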
Get the display name of the underlying STT provider.
Useful for logging and diagnostics — lets callers identify which voice-pipeline provider is actually handling transcription.
The provider's display name or ID string.
console.log(`STT via: ${adapter.getProviderName()}`); // "openai-whisper"
Bridges the voice-pipeline's SpeechToTextProvider to the multimodal
indexer's ISpeechToTextProvider interface. Converts raw Buffer audio
into the SpeechAudioInput shape expected by voice providers, forwards
the language hint through SpeechTranscriptionOptions, and extracts the
plain transcript text from the rich SpeechTranscriptionResult.
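A minimal sketch of the whole adapter under stated assumptions: the voice-pipeline field names (data, mimeType, text, name) are hypothetical, chosen only to make the bridging behavior concrete:

```typescript
// Hypothetical voice-pipeline shapes; field names are assumed for illustration.
interface SpeechAudioInput { data: Buffer; mimeType: string; }
interface SpeechTranscriptionOptions { language?: string; }
interface SpeechTranscriptionResult { text: string; }
interface SpeechToTextProvider {
  readonly name: string;
  transcribe(
    input: SpeechAudioInput,
    options?: SpeechTranscriptionOptions,
  ): Promise<SpeechTranscriptionResult>;
}

class SpeechProviderAdapter {
  constructor(
    private readonly provider: SpeechToTextProvider,
    private readonly mimeType: string = "audio/wav",
  ) {
    // Guard matching the documented error: provider must be present.
    if (!provider) {
      throw new Error("SpeechProviderAdapter requires a provider");
    }
  }

  // Wrap the buffer, forward the language hint, return plain text.
  async transcribe(audio: Buffer, language?: string): Promise<string> {
    const result = await this.provider.transcribe(
      { data: audio, mimeType: this.mimeType },
      language ? { language } : undefined,
    );
    return result.text;
  }

  getProviderName(): string {
    return this.provider.name;
  }
}
```

With a stubbed Whisper-like provider, new SpeechProviderAdapter(provider) indexes WAV buffers, while new SpeechProviderAdapter(provider, 'audio/mpeg') handles MP3 input, mirroring the constructor examples above.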