Cloudflare - Whisper

A general-purpose speech recognition model based on OpenAI's Whisper, trained on a large dataset of diverse audio. It can perform multilingual speech recognition, speech translation, and language identification. Cloudflare Workers AI provides access to this model.
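As a rough sketch of what that access might look like from Python, the snippet below builds (but does not send) an HTTP request carrying raw audio bytes. The endpoint path, the model identifier `@cf/openai/whisper`, and the helper name `build_whisper_request` are assumptions that do not appear on this page; check them against Cloudflare's API reference before relying on them.

```python
import urllib.request

# Assumed values: neither the endpoint path nor the model identifier
# is stated on this page, so verify both against Cloudflare's API docs.
API_BASE = "https://api.cloudflare.com/client/v4"
MODEL_ID = "@cf/openai/whisper"


def build_whisper_request(account_id: str, api_token: str, audio: bytes) -> urllib.request.Request:
    """Build a POST request with the audio as the body (hypothetical helper)."""
    url = f"{API_BASE}/accounts/{account_id}/ai/run/{MODEL_ID}"
    return urllib.request.Request(
        url,
        data=audio,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/octet-stream",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The response is expected to be JSON, but its exact shape is not described here.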

Provider

Cloudflare

Model Type

general

Supported Languages

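af, ar, hy, az, be, bs, bg, ca, zh, hr, cs, da, nl, en, et, fi, fr, gl, de, el, he, hi, hu, is, id, it, ja, kn, kk, ko, lv, lt, mk, ms, mr, mi, ne, no, fa, pl, pt, ro, ru, sr, sk, sl, es, sw, sv, tl, ta, th, tr, uk, ur, vi, cy

Performance & Cost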

Cost

$0.02700/hour

$0.00001/second
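The two figures agree only after rounding: $0.027 per hour divided by 3,600 seconds is $0.0000075 per second, which displays as $0.00001 at five decimal places. A minimal sketch of a cost estimate follows, assuming the hourly figure is the authoritative one and that billing scales linearly with audio duration (neither is confirmed on this page). The function name `estimate_cost_usd` is illustrative.

```python
from decimal import Decimal

# Assumption: the per-hour price is authoritative and the per-second
# price shown on the page is a rounded display value.
HOURLY_RATE_USD = Decimal("0.027")
SECONDS_PER_HOUR = Decimal(3600)


def estimate_cost_usd(duration_seconds: float) -> Decimal:
    """Estimate the price of transcribing audio of the given length."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Convert through str() so float inputs do not carry binary noise.
    return HOURLY_RATE_USD * Decimal(str(duration_seconds)) / SECONDS_PER_HOUR
```

Using `Decimal` rather than `float` keeps very small per-second amounts exact when they are summed across many files.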

Maximum File Size

25.00 MB

Features

Supported capabilities and functionalities

Core Features

Punctuation
Diarization
Streaming
Speaker Labels
Word Timestamps
Confidence Scores
Language Detection
Custom Vocabulary
Profanity Filtering
Noise Reduction

Technical Specifications

Input/output formats and technical details

Supported Output Formats

text, vtt, json

Supported Audio Encodings

MP3, MP4, WAV, WebM, MPEG, MPGA, M4A
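The encodings above and the 25 MB file limit suggest a simple local check before uploading. The sketch below is illustrative only. It assumes the file extension reflects the actual encoding and that "MB" means decimal megabytes (1,000,000 bytes), which this page does not specify. The name `preflight_problems` is hypothetical.

```python
from pathlib import Path

# Assumption: decimal megabytes. This is the stricter reading of "25.00 MB".
MAX_FILE_BYTES = 25 * 1000 * 1000

# Extensions derived from the supported encodings listed above.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".wav", ".webm", ".mpeg", ".mpga", ".m4a"}


def preflight_problems(filename: str, size_bytes: int) -> list[str]:
    """Return a list of reasons the file would likely be rejected (empty if none)."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES}")
    return problems
```

For a file on disk, `Path(filename).stat().st_size` supplies `size_bytes`. Returning every problem at once, rather than raising on the first, lets a caller report all issues in one pass.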