Speechall API Overview
Welcome to the Speechall API documentation!
The Speechall REST API provides powerful and flexible speech-to-text capabilities, allowing you to accurately transcribe audio using a variety of leading STT providers and models. Beyond core transcription, the API offers features like custom text replacement rules and compatibility endpoints for easier integration.
Our goal is to provide a unified interface to access diverse STT technologies, simplifying your workflow and giving you the flexibility to choose the best model for your specific needs.
Base URL: https://api.speechall.com/v1
(Use https://api.speechall.com for the latest stable version, currently v1)
API Version: 0.0.1
Key Capabilities:
- Flexible Transcription: Convert audio files or streams into text.
- Multi-Provider/Model Support: Access transcription capabilities from various providers like OpenAI, Deepgram, AssemblyAI, Google, and more, each with their own strengths and models.
- Customizable Output: Get transcriptions as plain text, detailed JSON (with timestamps and segments), or subtitle formats (SRT, VTT).
- Advanced Features: Leverage model-specific or Speechall-added features like:
  - Automatic Punctuation
  - Speaker Diarization (identifying different speakers)
  - Word and Segment Timestamps
  - Custom Vocabulary (improving recognition for specific terms)
  - Initial Prompts (guiding the model)
- Text Replacement Rules: Define and apply reusable rulesets to automatically find and replace text in your transcriptions (e.g., redacting sensitive information, correcting common misspellings).
- OpenAI Compatibility: Integrate seamlessly with applications or libraries designed for the OpenAI audio API endpoints (/audio/transcriptions, /audio/translations) using Speechall’s models.
- Model Discovery: Programmatically list and understand the capabilities of all available speech-to-text models via the API.
Core Endpoints:
- /transcribe (POST): Directly upload an audio file (raw binary data) for transcription with detailed control over parameters via query strings.
- /transcribe-remote (POST): Provide a publicly accessible URL to an audio file for transcription, specifying options in a JSON request body.
- /openai-compatible/audio/transcriptions (POST): Transcribe audio using a multipart/form-data request structure compatible with the OpenAI /audio/transcriptions endpoint.
- /openai-compatible/audio/translations (POST): Translate non-English audio into English text using a multipart/form-data request structure compatible with the OpenAI /audio/translations endpoint.
- /replacement-rulesets (POST): Create custom, reusable sets of text replacement rules.
- /speech-to-text-models (GET): Retrieve a list of all available STT models and their supported features.
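To illustrate how the two transcription endpoints differ, here is a minimal Python sketch that builds, but does not send, a request for each one using only the standard library. The paths, HTTP methods, and Bearer scheme come from this page. The query parameter `model`, the JSON field `file_url`, and the `application/octet-stream` content type are assumptions made for illustration, so check the endpoint reference for the actual names.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.speechall.com/v1"


def build_transcribe_request(api_key, audio_bytes, **options):
    """Build a POST /transcribe request: raw audio in the body,
    options in the query string (option names are hypothetical)."""
    url = f"{BASE_URL}/transcribe"
    if options:
        url += "?" + urlencode(options)
    return Request(
        url,
        data=audio_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumed content type for raw binary audio.
            "Content-Type": "application/octet-stream",
        },
    )


def build_transcribe_remote_request(api_key, file_url, **options):
    """Build a POST /transcribe-remote request: audio URL and options
    in a JSON body (the 'file_url' field name is hypothetical)."""
    body = json.dumps({"file_url": file_url, **options}).encode("utf-8")
    return Request(
        f"{BASE_URL}/transcribe-remote",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Example: construct both requests without touching the network.
upload = build_transcribe_request("your-api-key", b"\x00\x01", model="example-model")
remote = build_transcribe_remote_request("your-api-key", "https://example.com/audio.mp3")
print(upload.get_method(), upload.full_url)
print(remote.get_method(), remote.full_url)
```

Either request could then be passed to `urllib.request.urlopen` once the parameter names have been confirmed against the reference.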
Authentication:
All requests to the Speechall API must be authenticated. We use a Bearer token scheme, where your API key is sent in the Authorization header.
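As a minimal sketch of the Bearer scheme described above, the helper below builds the Authorization header in Python. The function name `auth_headers` and the environment variable `SPEECHALL_API_KEY` are hypothetical names chosen for this example, not part of the API.

```python
import os


def auth_headers(api_key: str) -> dict:
    """Return the header dict carrying the API key as a Bearer token."""
    if not api_key:
        raise ValueError("An API key is required")
    return {"Authorization": f"Bearer {api_key}"}


# SPEECHALL_API_KEY is an assumed variable name; a placeholder is used if unset.
headers = auth_headers(os.environ.get("SPEECHALL_API_KEY", "your-api-key"))
print(sorted(headers))
```

Reading the key from the environment rather than hard-coding it keeps it out of source control.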
Policies & Contact:
- Terms of Use
- Contact Support
- License: MIT
Ready to get started? Proceed to the Getting Started guide to make your first API call.