OpenAI - GPT-4o Transcribe

Model Information

Speech-to-text model powered by GPT-4o. Compared with the original Whisper models, it offers a lower word error rate, better language recognition, and higher accuracy.

Model ID

openai.gpt-4o-transcribe

Use this ID in API calls to reference this model.

Provider

OpenAI

Model Type

general

Accuracy Tier

premium

Release Date

March 20, 2025

Supported Languages

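af, ar, hy, az, be, bs, bg, ca, zh, hr, cs, da, nl, en, et, fi, fr, gl, de, el, he, hi, hu, is, id, it, ja, kn, kk, ko, lv, lt, mk, ms, mr, mi, ne, no, fa, pl, pt, ro, ru, sr, sk, sl, es, sw, sv, tl, ta, th, tr, uk, ur, vi, cy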

Automatic Language Detection: Yes

Streaming Transcription Languages

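af, ar, hy, az, be, bs, bg, ca, zh, hr, cs, da, nl, en, et, fi, fr, gl, de, el, he, hi, hu, is, id, it, ja, kn, kk, ko, lv, lt, mk, ms, mr, mi, ne, no, fa, pl, pt, ro, ru, sr, sk, sl, es, sw, sv, tl, ta, th, tr, uk, ur, vi, cy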

Performance & Cost

Cost

$0.36000/hour

$0.00010/second
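
As a rough illustration of how the listed rates translate into a bill, the sketch below estimates the cost of a recording from its duration. It assumes billing is linear in audio length with no minimum charge or rounding, which this page does not state. The function name estimate_cost is hypothetical.

```python
from decimal import Decimal

# Rates copied from the listing above.
RATE_PER_SECOND = Decimal("0.00010")  # USD per second of audio
RATE_PER_HOUR = Decimal("0.36000")    # USD per hour of audio


def estimate_cost(duration_seconds):
    """Return the estimated USD cost for audio of the given length.

    Assumption: cost scales linearly with duration, with no minimum
    charge and no rounding to a billing increment.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Convert via str() so float inputs such as 1.5 stay exact.
    return RATE_PER_SECOND * Decimal(str(duration_seconds))


# The two listed rates agree: 3600 seconds at the per-second rate
# equals the per-hour rate.
assert estimate_cost(3600) == RATE_PER_HOUR

print(estimate_cost(90))       # a 90-second clip
print(estimate_cost(45 * 60))  # a 45-minute recording
```

Decimal is used instead of float so that small per-second amounts do not pick up binary rounding noise when summed over many files.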

Maximum File Size

25.00 MB

Features

Supported capabilities and functionalities

Core Features

Punctuation

Diarization

Streaming

Speaker Labels

Word Timestamps

Confidence Scores

Custom Vocabulary

Profanity Filtering

Noise Reduction

Voice Activity Detection

Subtitle Formats

SRT Support

VTT Support

Technical Specifications

Input/output formats and technical details

Subtitle Format Support

SRT, VTT

Supported Audio Encodings

M4A, MP3, WebM, MP4, MPGA, WAV, MPEG
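
A local pre-check can catch files that would likely be rejected before they are uploaded. The sketch below tests a file against the 25.00 MB limit and the audio encodings listed on this page. It rests on two assumptions: that "MB" can be treated conservatively as 10^6 bytes, and that each listed encoding corresponds to a file extension of the same name. The function name check_audio_file is hypothetical, and the check looks only at the extension and size, not at the actual audio data.

```python
from pathlib import Path

# Assumption: treat "25.00 MB" as 25 * 10**6 bytes. The page does not say
# whether MB is decimal or binary; the decimal reading is the stricter one.
MAX_FILE_BYTES = 25 * 10**6

# Assumption: each listed encoding maps to an extension of the same name.
SUPPORTED_EXTENSIONS = {
    ".m4a", ".mp3", ".webm", ".mp4", ".mpga", ".wav", ".mpeg",
}


def check_audio_file(path):
    """Return a list of problems found; an empty list means the file passes."""
    p = Path(path)
    problems = []

    # Extension check is case-insensitive, so "clip.MP3" is accepted.
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")

    if not p.is_file():
        problems.append("file does not exist")
    else:
        size = p.stat().st_size
        if size > MAX_FILE_BYTES:
            problems.append(
                f"file is {size} bytes, over the {MAX_FILE_BYTES}-byte limit"
            )

    return problems
```

Calling check_audio_file("meeting.wav") returns an empty list when the file exists, has a listed extension, and fits under the size limit; otherwise each entry in the returned list describes one problem.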