Class OpenAITextToSpeechProvider

Text-to-speech provider that uses the OpenAI TTS API.

API Contract

  • Endpoint: POST {baseUrl}/audio/speech
  • Authentication: Authorization: Bearer <apiKey>
  • Content-Type: application/json
  • Request body: { model, voice, input, response_format, speed }
  • Response: Raw audio bytes in the requested format
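The contract above can be sketched as a plain request builder. This is a minimal illustration, not the provider's actual code: `SpeechRequestBody` and `buildSpeechRequest` are hypothetical names, and the base URL and key in the usage lines are placeholders.

```typescript
// Hypothetical type mirroring the documented request body fields.
interface SpeechRequestBody {
  model: string;
  voice: string;
  input: string;
  response_format: string;
  speed: number;
}

// Builds the URL and fetch options for POST {baseUrl}/audio/speech.
// Nothing is sent here; the caller decides when to call fetch().
function buildSpeechRequest(baseUrl: string, apiKey: string, body: SpeechRequestBody) {
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, '')}/audio/speech`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    },
  };
}

// Placeholder values for illustration only.
const request = buildSpeechRequest('https://api.openai.com/v1', 'sk-example', {
  model: 'tts-1',
  voice: 'nova',
  input: 'Hello!',
  response_format: 'mp3',
  speed: 1.0,
});
// request.url is 'https://api.openai.com/v1/audio/speech'
```

Sending the request would then be `await fetch(request.url, request.init)`, with the raw audio bytes read from the response body.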

Models

  • tts-1 — Optimized for real-time, lower latency, slightly lower quality
  • tts-1-hd — Higher quality at the cost of additional latency

Voice Listing

OpenAI's voice catalog is static (6 voices), so listAvailableVoices() returns a hardcoded list from OPENAI_VOICES without making an API call.
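A static catalog of this kind might look like the sketch below. The field names (`id`, `name`, `isDefault`) and the identifiers `VoiceSketch`, `VOICES_SKETCH`, `copyVoices`, and `listVoicesSketch` are assumptions made for illustration; the real `SpeechVoice` type and `OPENAI_VOICES` constant may differ. The six voice names are the ones OpenAI documents for its original TTS models.

```typescript
// Hypothetical voice shape; the real SpeechVoice type may carry more fields.
interface VoiceSketch {
  id: string;
  name: string;
  isDefault?: boolean;
}

// Assumed stand-in for OPENAI_VOICES.
const VOICES_SKETCH: readonly VoiceSketch[] = [
  { id: 'alloy', name: 'Alloy' },
  { id: 'echo', name: 'Echo' },
  { id: 'fable', name: 'Fable' },
  { id: 'onyx', name: 'Onyx' },
  { id: 'nova', name: 'Nova', isDefault: true },
  { id: 'shimmer', name: 'Shimmer' },
];

// Returns a new array each time, so callers cannot add or remove
// entries in the shared list. The voice objects themselves are shared.
function copyVoices(): VoiceSketch[] {
  return [...VOICES_SKETCH];
}

// Promise-returning wrapper matching the documented method shape; no network call.
async function listVoicesSketch(): Promise<VoiceSketch[]> {
  return copyVoices();
}
```

Returning a promise even though no I/O happens keeps the signature consistent with providers that must fetch their catalog.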

Example

const provider = new OpenAITextToSpeechProvider({
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'tts-1',
  voice: 'nova',
});
const result = await provider.synthesize('Hello!', { speed: 1.1 });

Methods

  • synthesize(text, options?): Promise<SpeechSynthesisResult>

    Synthesizes speech from text using the OpenAI TTS API.

    Parameters

    • text: string

      The text to convert to audio. Maximum 4096 characters.

    • options: SpeechSynthesisOptions = {}

      Optional synthesis settings including voice, model, output format, and speed (0.25–4.0 range).

    Returns Promise<SpeechSynthesisResult>

    A promise resolving to the audio buffer and metadata.

    Throws

    An error when the OpenAI API returns a non-2xx status code. Common causes are an invalid API key (401), an exceeded rate limit (429), and input text that is too long (400).

    Example

    const result = await provider.synthesize('Hello world', {
      voice: 'alloy',
      speed: 1.2,
      outputFormat: 'opus',
    });
  • listAvailableVoices(): Promise<SpeechVoice[]>

    Returns the static list of available OpenAI TTS voices.

    Unlike providers such as ElevenLabs and Azure, which require an API call to list voices, OpenAI's voice catalog is fixed and hardcoded. This method returns a shallow copy of the list, so callers cannot mutate the provider's internal array.

    Returns Promise<SpeechVoice[]>

    A promise resolving to the 6 built-in OpenAI voice options.

    Example

    const voices = await provider.listAvailableVoices();
    const defaultVoice = voices.find(v => v.isDefault); // the entry flagged as default ('nova')
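The limits documented for synthesize() (at most 4096 characters of text, speed between 0.25 and 4.0) and the status codes listed under Throws can be checked or interpreted on the caller's side. The sketch below is an illustration only: `checkSynthesisInput` and `describeStatus` are hypothetical helpers, not part of the provider, and the default speed of 1.0 is an assumption.

```typescript
const MAX_INPUT_CHARS = 4096;
const MIN_SPEED = 0.25;
const MAX_SPEED = 4.0;

// Returns a list of problems; an empty list means the input is within the documented limits.
// Note: text.length counts UTF-16 code units, which may differ from how the API counts characters.
function checkSynthesisInput(text: string, speed: number = 1.0): string[] {
  const problems: string[] = [];
  if (text.length > MAX_INPUT_CHARS) {
    problems.push(`text has ${text.length} characters, limit is ${MAX_INPUT_CHARS}`);
  }
  // Written as a negated range check so that NaN is rejected as well.
  if (!(speed >= MIN_SPEED && speed <= MAX_SPEED)) {
    problems.push(`speed ${speed} is outside ${MIN_SPEED}-${MAX_SPEED}`);
  }
  return problems;
}

// Maps the status codes named in the documentation to their common causes.
function describeStatus(status: number): string {
  switch (status) {
    case 400:
      return 'bad request (for example, text too long)';
    case 401:
      return 'invalid API key';
    case 429:
      return 'rate limit exceeded';
    default:
      return `unexpected status ${status}`;
  }
}
```

Running the check before calling synthesize() avoids a round trip for requests that would be rejected with a 400 anyway.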

Properties

id: "openai-tts" = 'openai-tts'

Unique provider identifier used for registration and resolution.

displayName: "OpenAI TTS" = 'OpenAI TTS'

Human-readable display name for UI and logging.

supportsStreaming: true = true

Indicates that streaming is supported: the OpenAI API sends audio bytes as they are generated, which enables low-latency playback pipelines.
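When audio arrives in chunks, a consumer can hand each chunk to a player as it comes in and, if a single buffer is also needed, join the chunks afterwards. The helper below sketches only the joining step; `concatChunks` is a hypothetical name, and how the provider exposes chunks is not specified here.

```typescript
// Joins byte chunks, in arrival order, into one contiguous buffer.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  // First pass: compute the total size so the output is allocated once.
  let total = 0;
  for (const chunk of chunks) {
    total += chunk.length;
  }
  // Second pass: copy each chunk at its running offset.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```

Allocating once and copying at offsets avoids the repeated reallocation that appending chunk by chunk would cause.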