Amazon Transcribe

Model Information

Amazon Transcribe is a fully managed automatic speech recognition (ASR) service that uses deep learning to convert speech into text. It supports both batch and streaming transcription, with features such as punctuation, speaker diarization, and language identification.

Model ID

amazon.transcribe

Use this ID to reference this model when making API calls.

Provider

Amazon

Model Type

general

Release Date

November 26, 2023

Supported Languages

ab-GE ast-ES az-AZ ba-RU be-BY bg-BG bn-IN bs-BA ca-ES ckb-IQ ckb-IR cs-CZ cy-WL el-GR et-ET eu-ES fi-FI gl-ES gu-IN ha-NG hr-HR hu-HU hy-AM is-IS ka-GE kab-DZ kk-KZ kn-IN ky-KG lg-IN lt-LT lv-LV mhr-RU mi-NZ mk-MK ml-IN mn-MN mr-IN mt-MT no-NO or-IN pa-IN pl-PL ps-AF ro-RO rw-RW si-LK sk-SK sl-SI so-SO sr-RS su-ID sw-BI sw-KE sw-RW sw-TZ sw-UG tl-PH tt-RU ug-CN uk-UA uz-UZ wo-SN zu-ZA

Automatic Language Detection: Yes

Performance & Cost

Cost

$1.44000/hour

$0.00040/second
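
The two listed rates agree with each other: $0.00040 per second times 3,600 seconds is $1.44 per hour. As a rough sketch, the cost of one file could be estimated as follows. The function name `estimate_cost` is hypothetical, and the sketch assumes billing is strictly linear in duration with no minimum charge or rounding, which the listing does not confirm.

```python
from decimal import Decimal

# Per-second rate taken from the listing above.
RATE_PER_SECOND = Decimal("0.00040")


def estimate_cost(duration_seconds: int) -> Decimal:
    """Estimate cost in USD, assuming purely linear per-second billing."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return RATE_PER_SECOND * duration_seconds


if __name__ == "__main__":
    # One hour of audio at the per-second rate.
    print(estimate_cost(3600))  # 1.44000
```

`Decimal` is used instead of `float` so that small per-second amounts do not pick up binary rounding error when summed over many files.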

Maximum Duration

4h 0m

Maximum File Size

2.00 GB
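
A pre-flight check against these two limits might look like the sketch below. The name `check_limits` is hypothetical, and the sketch assumes "2.00 GB" means 2 GiB (2 × 1024³ bytes); if the limit is decimal gigabytes, the constant would be 2 × 10⁹ instead.

```python
MAX_DURATION_SECONDS = 4 * 60 * 60   # "4h 0m" from the listing
MAX_FILE_SIZE_BYTES = 2 * 1024 ** 3  # assumption: "2.00 GB" read as 2 GiB


def check_limits(duration_seconds: float, size_bytes: int) -> list[str]:
    """Return human-readable limit violations (an empty list if none)."""
    problems = []
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("duration exceeds 4 hours")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file size exceeds 2 GB")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report both problems at once for a file that is too long and too large.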

Features

Supported capabilities and functionalities

Core Features

Punctuation

Diarization

Streaming

Speaker Labels

Word Timestamps

Confidence Scores

Custom Vocabulary

Profanity Filtering

Noise Reduction

Voice Activity Detection

Subtitle Formats

SRT Support

VTT Support
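
Several of these features map onto request options when the service is called directly. The sketch below assumes direct use of the AWS API through boto3's `start_transcription_job`, which is not necessarily the API that the Model ID in this listing refers to. The helper name `build_job_request` and the example values are hypothetical; the helper only builds the keyword arguments and makes no network call.

```python
def build_job_request(job_name: str, media_uri: str, language_code: str) -> dict:
    """Build keyword arguments for boto3's Transcribe start_transcription_job."""
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "LanguageCode": language_code,
        "Settings": {
            "ShowSpeakerLabels": True,  # speaker labels / diarization
            "MaxSpeakerLabels": 2,      # must be set when speaker labels are on
        },
        "Subtitles": {"Formats": ["srt", "vtt"]},  # SRT and VTT output
    }
```

With AWS credentials configured, the result could be passed along as `boto3.client("transcribe").start_transcription_job(**build_job_request("demo-job", "s3://example-bucket/audio.wav", "uk-UA"))`, where the job name and bucket are placeholders.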

Technical Specifications

Input/output formats and technical details

Subtitle Format Support

SRT VTT

Supported Audio Encodings

AMR FLAC M4A MP3 MP4 Ogg WebM WAV PCM

Supported Sample Rates

8000 Hz, 16000 Hz, 48000 Hz
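
The encodings and sample rates above can be checked before upload. This is a minimal sketch that assumes the listed values are the only ones accepted, which the listing does not state outright; `is_supported` is a hypothetical helper name.

```python
# Values copied from the listing, lowercased for comparison.
SUPPORTED_ENCODINGS = {"amr", "flac", "m4a", "mp3", "mp4", "ogg", "webm", "wav", "pcm"}
SUPPORTED_SAMPLE_RATES_HZ = {8000, 16000, 48000}


def is_supported(encoding: str, sample_rate_hz: int) -> bool:
    """Return True if both the encoding and the sample rate appear in the listing."""
    return (
        encoding.lower() in SUPPORTED_ENCODINGS
        and sample_rate_hz in SUPPORTED_SAMPLE_RATES_HZ
    )
```

For example, `is_supported("FLAC", 16000)` returns `True`, while `is_supported("wav", 44100)` returns `False` because 44100 Hz is not among the listed rates.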