Amazon Transcribe
A fully managed automatic speech recognition (ASR) service that converts speech ...
Amazon $1.44000/hr SRT VTT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection Confidence
AssemblyAI Best
The Best tier model is optimized for accuracy, low latency, and ease of use. It ...
AssemblyAI $0.37000/hr SRT VTT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection Confidence
AssemblyAI Nano
The Nano tier model is a lightweight, lower cost model for a wide range of use c...
AssemblyAI $0.12000/hr SRT VTT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection Confidence
AssemblyAI Slam-1
Slam-1 is a Speech Language Model that combines LLM architecture with ASR encode...
v1
AssemblyAI $0.37000/hr SRT VTT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection Confidence
AssemblyAI Universal-2
Universal-2 is a state-of-the-art model built on Universal-1, offering enhanced ...
v2
AssemblyAI $0.37000/hr SRT VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection Confidence
Azure AI Speech-to-Text
Azure's default, general-purpose speech-to-text model, trained on a vast amount ...
Azure $0.18000/hr SRT VTT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection Confidence
Cloudflare - Whisper
A general-purpose speech recognition model based on OpenAI's Whisper, trained on...
Cloudflare $0.02700/hr VTT
Punctuation Language Detection
Cloudflare - Whisper Large V3 Turbo
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech...
Cloudflare $0.03060/hr VTT
Punctuation Word Timestamps Language Detection
Cloudflare - Whisper Tiny (EN)
This is the English-only version of the Whisper Tiny model which was trained on ...
Cloudflare $0.02700/hr VTT
Punctuation Word Timestamps
Deepgram - Base
Standard base model for speech recognition
v2024-01-26.8851
Deepgram $0.75000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Base ConversationalAI
Base model optimized for conversational AI applications
Deepgram $0.75000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Base Finance
Base model optimized for finance terminology
Deepgram $0.75000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Base General
Base model for general-purpose transcription
Deepgram $0.75000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Base Meeting
Base model optimized for meetings and conferences
Deepgram $0.75000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Base Phonecall
Base model optimized for phone conversations
Deepgram $0.75000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Base Video
Base model optimized for video content
Deepgram $0.75000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Base Voicemail
Base model optimized for voicemail transcription
Deepgram $0.75000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Enhanced
Improved accuracy model for speech recognition
Deepgram $0.87000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Enhanced Finance
Enhanced model optimized for finance terminology
Deepgram $0.87000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Enhanced General
Enhanced model for general-purpose transcription
Deepgram $0.87000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Enhanced Meeting
Enhanced model optimized for meetings and conferences
Deepgram $0.87000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Enhanced Phonecall
Enhanced model optimized for phone conversations
Deepgram $0.87000/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Nova
Advanced, high-performance speech recognition model
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Nova 2
High-accuracy, next-generation speech recognition model
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 ATC
Nova 2 model optimized for air traffic control
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 Automotive
Nova 2 model optimized for the automotive industry
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 ConversationalAI
Nova 2 model optimized for conversational AI
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 Drivethru
Nova 2 model optimized for drive-through scenarios
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 Finance
Nova 2 model optimized for finance terminology
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 General
Nova 2 model for general-purpose transcription
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 Medical
Nova 2 model optimized for medical terminology
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 Meeting
Nova 2 model optimized for meetings and conferences
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 Phonecall
Nova 2 model optimized for phone conversations
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 Video
Nova 2 model optimized for video content
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova 2 Voicemail
Nova 2 model optimized for voicemail transcription
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram - Nova General
Nova model for general-purpose transcription
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Nova Phonecall
Nova model optimized for phone conversations
Deepgram $0.25800/hr VTT
Punctuation Diarization Streaming Word Timestamps Language Detection
Deepgram - Whisper
OpenAI Whisper model hosted by Deepgram
Deepgram $0.28800/hr VTT
Punctuation Diarization Word Timestamps Language Detection
Deepgram - Whisper Base
Base Whisper model hosted by Deepgram
Deepgram $0.21000/hr VTT
Punctuation Diarization Word Timestamps Language Detection
Deepgram - Whisper Large
Large Whisper model hosted by Deepgram
Deepgram $0.28800/hr VTT
Punctuation Diarization Word Timestamps Language Detection
Deepgram - Whisper Medium
Medium Whisper model hosted by Deepgram
Deepgram $0.25200/hr VTT
Punctuation Diarization Word Timestamps Language Detection
Deepgram - Whisper Small
Small Whisper model hosted by Deepgram
Deepgram $0.22800/hr VTT
Punctuation Diarization Word Timestamps Language Detection
Deepgram - Whisper Tiny
Tiny Whisper model hosted by Deepgram
Deepgram $0.19800/hr VTT
Punctuation Diarization Word Timestamps Language Detection
Deepgram Nova 3
Great accuracy in a broader range of real-world enterprise use cases and challen...
Deepgram $0.31200/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram Nova 3 General
Nova 3 model for general-purpose transcription
Deepgram $0.31200/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
Deepgram Nova 3 Medical
Nova 3 model optimized for medical terminology
Deepgram $0.31200/hr VTT
Punctuation Diarization Streaming Speaker Labels Word Timestamps Language Detection
FalAI - Whisper
Whisper model hosted on FalAI platform
v3
FalAI $0.06900/hr None
Punctuation Speaker Labels Word Timestamps
FalAI - Wizper
Optimized version of Whisper for improved performance
v3
FalAI $0.03000/hr None
Punctuation Speaker Labels
FireworksAI - Whisper Turbo V3
Accelerated Whisper V3 model by FireworksAI
FireworksAI $0.05400/hr SRT VTT
Punctuation Diarization Language Detection
FireworksAI - Whisper V3
Whisper V3 model hosted by FireworksAI
FireworksAI $0.09000/hr SRT VTT
Punctuation Diarization Language Detection
Gemini 2.0 Flash
Next generation features, speed, thinking, and realtime streaming.
Gemini $0.08532/hr None
Punctuation Language Detection
Gemini 2.0 Flash-Lite
Cost efficiency and low latency
Gemini $0.01215/hr None
Punctuation Language Detection
Gemini 2.5 Flash Preview 05-20
Adaptive thinking, cost efficiency
Gemini $0.12222/hr None
Punctuation Language Detection
Gemini 2.5 Pro Preview
Enhanced thinking and reasoning, general understanding, advanced coding, and mor...
Gemini $0.26100/hr None
Punctuation Language Detection
Gladia Solaria
Gladia's cutting-edge, next-generation ASR model, launched in April 2025. Design...
v1
Gladia $0.61200/hr VTT
Punctuation Diarization Streaming Speaker Labels Language Detection
Google Cloud - Enhanced
Enhanced speech recognition model by Google
Google $0.96000/hr SRT VTT
Punctuation Diarization Word Timestamps Language Detection
Google Cloud - Standard
Standard speech recognition model by Google
Google $0.96000/hr SRT VTT
Punctuation Diarization Word Timestamps Language Detection
Groq - Whisper Large V3
A multilingual ASR model offering high accuracy and speed for transcription and ...
v3
Groq $0.11100/hr None
Punctuation Word Timestamps Language Detection
Groq - Whisper Turbo Large V3
A pruned and fine-tuned version of Whisper Large v3, designed for faster and les...
v3 Turbo
Groq $0.04000/hr None
Punctuation Word Timestamps Language Detection
IBM Watson Speech to Text
A cloud-based speech recognition service from IBM Watson that converts audio int...
IBM $1.20000/hr None
Punctuation Diarization Speaker Labels Word Timestamps Confidence
OpenAI - GPT-4o mini Transcribe
Speech-to-text model powered by GPT-4o mini. Offers improvements in word error r...
OpenAI $0.18000/hr SRT VTT
Punctuation Streaming Language Detection
OpenAI - GPT-4o Transcribe
Speech-to-text model powered by GPT-4o. Offers improvements in word error rate, ...
OpenAI $0.36000/hr SRT VTT
Punctuation Streaming Language Detection
OpenAI - Whisper
General-purpose speech recognition model. Based on the open-source Whisper large...
large-v2
OpenAI $0.36000/hr SRT VTT
Punctuation Streaming Word Timestamps Language Detection
Rev AI Enhanced
Rev AI's high-accuracy general-purpose speech-to-text model, trained on a divers...
v2.0
RevAI $0.30000/hr SRT VTT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection Confidence
Rev AI Reverb ASR
Rev AI's open-source derived English Automatic Speech Recognition (ASR) model. T...
v1.0
RevAI $0.30000/hr SRT VTT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection Confidence
Speechmatics Enhanced
Speechmatics' Enhanced ASR model offers very good accuracy, though processing is...
Speechmatics $0.40000/hr SRT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection Confidence
Speechmatics Standard
Speechmatics' Standard ASR model offers faster results with good accuracy.
Speechmatics $0.24000/hr SRT
Punctuation Diarization Speaker Labels Word Timestamps Language Detection Confidence