Class OpenAIWhisperSpeechToTextProvider

Speech-to-text provider that uses the OpenAI Whisper transcription API.

API Contract

  • Endpoint: POST {baseUrl}/audio/transcriptions
  • Authentication: Authorization: Bearer <apiKey>
  • Content-Type: multipart/form-data (FormData with file blob)
  • Response format: Controlled by the response_format field; defaults to verbose_json which includes segments, language detection, and duration.
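The contract above can be sketched as a request builder. This is an illustrative sketch, not the provider's internal code; buildTranscriptionRequest and its parameter shape are hypothetical names chosen for the example.

```typescript
// Hypothetical helper illustrating the API contract above; not part of the
// provider's public API.
interface TranscriptionRequestInit {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: FormData;
}

function buildTranscriptionRequest(
  baseUrl: string,
  apiKey: string,
  audio: { data: ArrayBuffer; mimeType: string; fileName: string },
  model: string,
): TranscriptionRequestInit {
  const form = new FormData();
  // The raw audio buffer is wrapped in a Blob and attached as the `file` field.
  form.append(
    "file",
    new Blob([audio.data], { type: audio.mimeType }),
    audio.fileName,
  );
  form.append("model", model);
  form.append("response_format", "verbose_json"); // default per the contract above
  return {
    url: `${baseUrl}/audio/transcriptions`,
    method: "POST",
    // Note: Content-Type (with the multipart boundary) is set automatically
    // by fetch when the body is a FormData instance.
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  };
}
```

Passing the resulting object to fetch (spreading method, headers, and body into the init) issues the documented POST.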

Supported Response Formats

  • verbose_json — Full JSON with segments, duration, and language (default)
  • json — Minimal JSON with just the text
  • text — Plain text response (no JSON)
  • srt — SubRip subtitle format
  • vtt — WebVTT subtitle format

When the text, srt, or vtt format is used, the response is returned as plain text and segments are not available.
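A caller consuming the raw response might branch on the format as follows. This is a minimal sketch assuming the documented behavior; parseTranscription is a hypothetical helper, and the parsed JSON shape (a text field, plus segments on verbose_json) follows the format list above.

```typescript
type ResponseFormat = "verbose_json" | "json" | "text" | "srt" | "vtt";

// Hypothetical helper: branch on the response format described above.
function parseTranscription(
  raw: string,
  format: ResponseFormat,
): { text: string; segments?: unknown[] } {
  if (format === "verbose_json" || format === "json") {
    const parsed = JSON.parse(raw);
    // segments are only present on verbose_json responses
    return { text: parsed.text, segments: parsed.segments };
  }
  // text, srt, and vtt are returned verbatim; segments are unavailable
  return { text: raw };
}
```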

See

See OpenAIWhisperSpeechToTextProviderConfig for configuration options and normalizeSegments() for the segment normalization logic.

Example

const provider = new OpenAIWhisperSpeechToTextProvider({
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'whisper-1',
});
const result = await provider.transcribe(
  { data: audioBuffer, mimeType: 'audio/wav', fileName: 'recording.wav' },
  { language: 'en', responseFormat: 'verbose_json' },
);

Methods

  • Transcribes an audio buffer using the OpenAI Whisper API.

    The audio is sent as a multipart form upload with the file, model, and optional parameters (language, prompt, temperature, response_format).

    Parameters

    • audio: SpeechAudioInput

      Raw audio data and metadata. The data buffer is wrapped in a Blob and sent as a form file field. If fileName is not provided, a default name is generated from the format field.

    • options: SpeechTranscriptionOptions = {}

      Optional transcription settings including language hint, context prompt, temperature for sampling, and response format.

    Returns Promise<SpeechTranscriptionResult>

    A promise resolving to the normalized transcription result.

    Throws

    When the OpenAI API returns a non-2xx status code.

    Example

    const result = await provider.transcribe(
      { data: mp3Buffer, mimeType: 'audio/mpeg', fileName: 'voice.mp3' },
      { language: 'fr', prompt: 'Discussion about AI' },
    );

Properties

id: "openai-whisper" = 'openai-whisper'

Unique provider identifier used for registration and resolution.

displayName: "OpenAI Whisper" = 'OpenAI Whisper'

Human-readable display name for UI and logging.

supportsStreaming: false = false

The Whisper API is batch-only; streaming support would require a separate WebSocket adapter.