Class AssemblyAISTTProvider

Speech-to-text provider that uses the AssemblyAI async transcription API.

Three-Step Workflow

AssemblyAI uses an asynchronous transcription pipeline that requires three sequential HTTP requests:

1. Upload: POST /v2/upload sends the raw audio bytes to AssemblyAI's CDN and returns an upload_url. This step is necessary because the transcript endpoint accepts URLs, not raw audio.
2. Submit: POST /v2/transcript creates a transcription job referencing the upload URL. Returns a transcript id used for polling. Optional features like speaker_labels are enabled in this request's JSON body.
3. Poll: GET /v2/transcript/:id is called every POLL_INTERVAL_MS (1 second) until the transcript status transitions to 'completed' or 'error'. The polling loop is bounded by DEFAULT_TIMEOUT_MS (120 seconds) to prevent indefinite waiting.
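
The upload, submit, and poll sequence can be sketched as follows. This is a minimal illustration under stated assumptions, not the provider's actual source: runPipeline, FetchLike, HttpResponseLike, and the option names are hypothetical, and the API host and authorization header are assumed from AssemblyAI's public REST conventions. The HTTP function is injected so the flow can be exercised without network access.

```typescript
// Hypothetical sketch of the upload -> submit -> poll sequence.
// None of these names are part of AssemblyAISTTProvider.
interface HttpResponseLike {
  ok: boolean;
  status: number;
  text(): Promise<string>;
  json(): Promise<any>;
}
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: unknown },
) => Promise<HttpResponseLike>;

interface PipelineOptions {
  apiKey: string;
  fetchFn: FetchLike;      // e.g. a thin wrapper around the global fetch
  pollIntervalMs?: number; // documented default: 1 second
  timeoutMs?: number;      // documented default: 120 seconds
  speakerLabels?: boolean;
}

const BASE_URL = "https://api.assemblyai.com"; // assumed API host

// Parses a JSON body, throwing with status and body text on non-2xx responses.
async function readJson(res: HttpResponseLike, step: string): Promise<any> {
  if (!res.ok) {
    throw new Error(`AssemblyAI ${step} failed (${res.status}): ${await res.text()}`);
  }
  return res.json();
}

async function runPipeline(audio: Uint8Array, opts: PipelineOptions): Promise<string> {
  const { fetchFn } = opts;
  const pollIntervalMs = opts.pollIntervalMs ?? 1_000;
  const timeoutMs = opts.timeoutMs ?? 120_000;
  const headers = { authorization: opts.apiKey };

  // Step 1: upload the raw bytes and receive an upload_url.
  const upload = await readJson(
    await fetchFn(`${BASE_URL}/v2/upload`, { method: "POST", headers, body: audio }),
    "upload",
  );

  // Step 2: create the transcription job that references the uploaded audio.
  const job = await readJson(
    await fetchFn(`${BASE_URL}/v2/transcript`, {
      method: "POST",
      headers: { ...headers, "content-type": "application/json" },
      body: JSON.stringify({
        audio_url: upload.upload_url,
        speaker_labels: opts.speakerLabels ?? false,
      }),
    }),
    "submit",
  );

  // Step 3: poll until the job reaches a terminal status or the deadline passes.
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const transcript = await readJson(
      await fetchFn(`${BASE_URL}/v2/transcript/${job.id}`, { headers }),
      "poll",
    );
    if (transcript.status === "completed") return transcript.text;
    if (transcript.status === "error") {
      throw new Error(`Transcription failed: ${transcript.error}`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, pollIntervalMs));
  }
  throw new Error(`Transcription timed out (transcript id: ${job.id})`);
}
```

Injecting the HTTP function is a design choice made for this sketch only; it keeps the three steps visible and lets a test substitute canned responses.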

AbortController Usage

An optional AbortSignal can be passed via options.providerSpecificOptions.signal to cancel the transcription at any point. The signal is forwarded to all three fetch calls and also checked at the top of each polling iteration. When aborted, an error is thrown immediately without waiting for the current fetch to complete.
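
One way a polling loop can honor an AbortSignal is sketched below. The helpers abortableSleep and pollUntil, and the error message text, are hypothetical and only illustrate the pattern described above: check the signal at the top of each iteration and abandon the wait between polls as soon as the signal fires.

```typescript
// Illustrative helpers; the names and message text are assumptions,
// not part of AssemblyAISTTProvider.

// Sleeps for `ms`, rejecting early if the signal aborts during the wait.
function abortableSleep(ms: number, signal?: AbortSignal): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    if (signal?.aborted) {
      reject(new Error("Transcription cancelled"));
      return;
    }
    const timer = setTimeout(() => {
      signal?.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    function onAbort(): void {
      clearTimeout(timer);
      reject(new Error("Transcription cancelled"));
    }
    signal?.addEventListener("abort", onAbort, { once: true });
  });
}

// Repeats `check` until it yields a value, testing the signal before every iteration.
async function pollUntil<T>(
  check: (signal?: AbortSignal) => Promise<T | undefined>,
  intervalMs: number,
  signal?: AbortSignal,
): Promise<T> {
  for (;;) {
    if (signal?.aborted) throw new Error("Transcription cancelled");
    const value = await check(signal); // a real check would forward the signal to fetch
    if (value !== undefined) return value;
    await abortableSleep(intervalMs, signal);
  }
}
```

A caller would typically create an AbortController, pass controller.signal in, and call controller.abort() from a timer or a UI event.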

Error Handling

- Non-2xx responses at any step throw an Error with the HTTP status and body.
- A transcript with status === 'error' throws with AssemblyAI's error message.
- Timeout expiry throws with the transcript ID for manual inspection.
- Aborted signals throw with a descriptive cancellation message.
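
At the call site, these failure modes all surface as a rejected promise. The sketch below shows one hedged way to separate a cancellation from other failures without parsing message text, by inspecting the caller's own signal. TranscriberLike, Outcome, and transcribeSafely are hypothetical names, and the minimal interface only stands in for the real provider types.

```typescript
// Hypothetical minimal shapes standing in for the real provider types.
interface TranscriberLike {
  transcribe(
    audio: { data: Uint8Array },
    options?: { providerSpecificOptions?: { signal?: AbortSignal } },
  ): Promise<{ text: string }>;
}

type Outcome =
  | { kind: "ok"; text: string }
  | { kind: "cancelled" }
  | { kind: "failed"; message: string };

async function transcribeSafely(
  provider: TranscriberLike,
  data: Uint8Array,
  signal?: AbortSignal,
): Promise<Outcome> {
  try {
    const result = await provider.transcribe(
      { data },
      { providerSpecificOptions: { signal } },
    );
    return { kind: "ok", text: result.text };
  } catch (err) {
    // The caller owns the signal, so its state identifies a cancellation
    // without depending on the wording of the thrown message.
    if (signal?.aborted) return { kind: "cancelled" };
    return { kind: "failed", message: err instanceof Error ? err.message : String(err) };
  }
}
```

HTTP failures, a transcript status of 'error', and timeouts all land in the "failed" branch here; the message is passed through unchanged so the transcript ID, when present, remains available for inspection.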

See

- AssemblyAISTTProviderConfig for configuration options.
- AssemblyAITranscript for the polling response shape.

Example

```
const provider = new AssemblyAISTTProvider({
  apiKey: process.env.ASSEMBLYAI_API_KEY!,
});

// Basic transcription
const result = await provider.transcribe({ data: audioBuffer });

// With diarization and cancellation support
const controller = new AbortController();
// Named differently from `result` above to avoid redeclaring a const.
const diarizedResult = await provider.transcribe(
  { data: audioBuffer },
  {
    enableSpeakerDiarization: true,
    providerSpecificOptions: { signal: controller.signal },
  },
);
```

Implements

SpeechToTextProvider

Constructors

constructor

new AssemblyAISTTProvider(config): AssemblyAISTTProvider
Creates a new AssemblyAISTTProvider.
Parameters
- config: AssemblyAISTTProviderConfig
  Provider configuration including the API key.
Returns AssemblyAISTTProvider
Example
```
const provider = new AssemblyAISTTProvider({
  apiKey: 'your-assemblyai-api-key',
});
```
- Defined in src/hearing/providers/AssemblyAISTTProvider.ts:231

Methods

getProviderName

getProviderName(): string
Returns the human-readable provider name.

Returns string
The display name string 'AssemblyAI'.
Example
```
provider.getProviderName(); // 'AssemblyAI'
```
Implementation of SpeechToTextProvider.getProviderName
- Defined in src/hearing/providers/AssemblyAISTTProvider.ts:245

transcribe

transcribe(audio, options?): Promise<SpeechTranscriptionResult>
Transcribes an audio buffer via the AssemblyAI three-step async pipeline: upload, submit, and poll.
Parameters
- audio: SpeechAudioInput
  Raw audio data and associated metadata. The data buffer is uploaded to AssemblyAI's CDN in step 1.
- options: SpeechTranscriptionOptions = {}
  Optional transcription settings. Pass providerSpecificOptions.signal (an AbortSignal) to cancel at any point in the pipeline.
Returns Promise<SpeechTranscriptionResult>
A promise resolving to the normalized transcription result.
Throws
- When the upload API returns a non-2xx status.
- When the transcript submit API returns a non-2xx status.
- When the polling API returns a non-2xx status.
- When the transcript status becomes 'error' (includes AssemblyAI's error message, e.g. "Audio file could not be decoded").
- When the 120-second timeout is exceeded (includes the transcript ID for manual inspection via the AssemblyAI dashboard).
- When the caller's AbortSignal is triggered.

Example
```
const result = await provider.transcribe(
  { data: wavBuffer, mimeType: 'audio/wav' },
  { enableSpeakerDiarization: true, language: 'en' },
);
console.log(result.text);
console.log(result.segments?.map(s => `[${s.speaker}] ${s.text}`));
```
Implementation of SpeechToTextProvider.transcribe
- Defined in src/hearing/providers/AssemblyAISTTProvider.ts:278
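
When diarization is enabled, consecutive segments from the same speaker are often easier to read once merged. The sketch below is one possible post-processing step; SegmentLike and formatBySpeaker are hypothetical, and the segment shape (an optional speaker plus text) is inferred only from the example above, so the real result type may carry different or additional fields.

```typescript
// Hypothetical segment shape inferred from the transcribe example; real fields may differ.
interface SegmentLike {
  speaker?: string;
  text: string;
}

// Merges consecutive segments spoken by the same speaker into one line each.
function formatBySpeaker(segments: SegmentLike[]): string[] {
  const groups: { speaker: string; parts: string[] }[] = [];
  for (const segment of segments) {
    const speaker = segment.speaker ?? "unknown"; // placeholder label when no speaker is set
    const last = groups[groups.length - 1];
    if (last && last.speaker === speaker) {
      last.parts.push(segment.text);
    } else {
      groups.push({ speaker, parts: [segment.text] });
    }
  }
  return groups.map((g) => `[${g.speaker}] ${g.parts.join(" ")}`);
}
```

With a result in hand, this could be called as formatBySpeaker(result.segments ?? []).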

Properties

`Readonly` id

id: "assemblyai" = 'assemblyai'

Unique provider identifier used for registration and resolution.

`Readonly` displayName

displayName: "AssemblyAI" = 'AssemblyAI'

Human-readable display name for UI and logging.

`Readonly` supportsStreaming

supportsStreaming: false = false

Streaming is not supported by this provider's async pipeline. AssemblyAI does offer a separate real-time streaming API via WebSocket, but that would be a different provider implementation.
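
Code that needs live transcription can consult these readonly properties before choosing a provider. The sketch below is a hypothetical selection helper: ProviderLike and selectProvider are illustrative names, and the second provider in the usage comment is invented purely for the example.

```typescript
// Hypothetical subset of the provider surface used for selection.
interface ProviderLike {
  readonly id: string;
  readonly displayName: string;
  readonly supportsStreaming: boolean;
}

// Returns the first provider that satisfies the streaming requirement, if any.
function selectProvider(
  providers: readonly ProviderLike[],
  needsStreaming: boolean,
): ProviderLike | undefined {
  return providers.find((p) => !needsStreaming || p.supportsStreaming);
}

// Usage with an invented streaming-capable provider alongside this one:
// selectProvider([assemblyAiProvider, someStreamingProvider], true)
// skips AssemblyAISTTProvider because its supportsStreaming is false.
```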
