
Speechall API Overview

Welcome to the Speechall API documentation!

The Speechall REST API provides powerful and flexible speech-to-text capabilities, allowing you to accurately transcribe audio using a variety of leading STT providers and models. Beyond core transcription, the API offers features like custom text replacement rules and compatibility endpoints for easier integration.

Our goal is to provide a unified interface to access diverse STT technologies, simplifying your workflow and giving you the flexibility to choose the best model for your specific needs.

Base URL: https://api.speechall.com/v1 (https://api.speechall.com without a version suffix resolves to the latest stable version, currently v1)

API Version: 0.0.1

Key Capabilities:

Flexible Transcription: Convert audio files or streams into text.
Multi-Provider/Model Support: Access transcription capabilities from various providers like OpenAI, Deepgram, AssemblyAI, Google, and more, each with their own strengths and models.
Customizable Output: Get transcriptions as plain text, detailed JSON (with timestamps and segments), or subtitle formats (SRT, VTT).
Advanced Features: Leverage model-specific or Speechall-added features like:
- Automatic Punctuation
- Speaker Diarization (identifying different speakers)
- Word and Segment Timestamps
- Custom Vocabulary (improving recognition for specific terms)
- Initial Prompts (guiding the model)
Text Replacement Rules: Define and apply reusable rulesets to automatically find and replace text in your transcriptions (e.g., redacting sensitive information, correcting common misspellings).
OpenAI Compatibility: Integrate seamlessly with applications or libraries designed for the OpenAI audio API endpoints (/audio/transcriptions, /audio/translations) using Speechall’s models.
Model Discovery: Programmatically list and understand the capabilities of all available speech-to-text models via the API.
SDKs & Client Libraries: Simplify your integration with our official TypeScript SDK, designed for easy use in JavaScript and Node.js environments. More SDKs are planned for other popular languages.
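
To make the idea of a text replacement ruleset concrete, here is a minimal local sketch in Python of the kind of find-and-replace pass described above. It is only an illustration of the concept: the Rule class, the apply_ruleset function, and the regex-based matching are assumptions made for this example and do not reflect the actual ruleset schema or matching behavior of the Speechall API.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule shape, not the Speechall ruleset schema."""
    pattern: str      # regular expression to search for
    replacement: str  # text substituted for each match


def apply_ruleset(text: str, rules: list[Rule]) -> str:
    """Apply each rule in order, so later rules see earlier replacements."""
    for rule in rules:
        text = re.sub(rule.pattern, rule.replacement, text)
    return text


ruleset = [
    Rule(r"\b\d{3}-\d{2}-\d{4}\b", "[REDACTED]"),  # redact numbers shaped like 123-45-6789
    Rule(r"\bspeech all\b", "Speechall"),          # correct a common misspelling
]

print(apply_ruleset("call speech all about 123-45-6789", ruleset))
# prints: call Speechall about [REDACTED]
```

Rules run in list order, which matters when one replacement could create or destroy a match for a later rule.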

Core Endpoints:

/transcribe (POST): Directly upload an audio file (raw binary data) for transcription with detailed control over parameters via query strings.
/transcribe-remote (POST): Provide a publicly accessible URL to an audio file for transcription, specifying options in a JSON request body.
/openai-compatible/audio/transcriptions (POST): Transcribe audio using a multipart/form-data request structure compatible with the OpenAI /audio/transcriptions endpoint.
/openai-compatible/audio/translations (POST): Translate non-English audio into English text using a multipart/form-data request structure compatible with the OpenAI /audio/translations endpoint.
/replacement-rulesets (POST): Create custom, reusable sets of text replacement rules.
/speech-to-text-models (GET): Retrieve a list of all available STT models and their supported features.
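
The difference between the two main transcription endpoints can be sketched as follows, using only the Python standard library. The code builds the requests without sending them. The endpoint paths and the split between query-string options and a JSON body come from the list above; the helper names, the application/octet-stream content type, and any option or field names you pass in (for example model or file_url) are assumptions for illustration, so check the endpoint reference for the real parameter names.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.speechall.com/v1"  # base URL from this overview


def build_transcribe_request(audio: bytes, api_key: str, **options: str) -> urllib.request.Request:
    """POST raw audio bytes to /transcribe; options become query-string parameters."""
    query = urllib.parse.urlencode(options)
    url = f"{BASE_URL}/transcribe" + (f"?{query}" if query else "")
    return urllib.request.Request(
        url,
        data=audio,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumed content type for a raw binary upload.
            "Content-Type": "application/octet-stream",
        },
    )


def build_transcribe_remote_request(payload: dict, api_key: str) -> urllib.request.Request:
    """POST a JSON body to /transcribe-remote."""
    return urllib.request.Request(
        f"{BASE_URL}/transcribe-remote",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# "model" and "file_url" are hypothetical names used only for this sketch.
upload = build_transcribe_request(b"<audio bytes>", "YOUR_API_KEY", model="example-model")
remote = build_transcribe_remote_request(
    {"file_url": "https://example.com/audio.mp3", "model": "example-model"},
    "YOUR_API_KEY",
)
# To send either request: urllib.request.urlopen(upload)
```

Keeping request construction separate from sending makes it easy to inspect the URL, headers, and body before any audio leaves your machine.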

Authentication:

All requests to the Speechall API must be authenticated. The API uses the Bearer token scheme: send your API key in the Authorization header of every request, in the form "Authorization: Bearer YOUR_API_KEY".
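
As a minimal sketch of the Bearer scheme, the following Python snippet attaches an API key to a request for the model list. The build_authed_request helper and the SPEECHALL_API_KEY environment variable name are assumptions chosen for this example, not official conventions; the request is built but not sent.

```python
import os
import urllib.request

BASE_URL = "https://api.speechall.com/v1"  # base URL from this overview


def build_authed_request(path: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Return a Request that carries the API key as a Bearer token."""
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={"Authorization": f"Bearer {api_key}"},
    )


# SPEECHALL_API_KEY is an assumed variable name; fall back to a placeholder.
api_key = os.environ.get("SPEECHALL_API_KEY", "YOUR_API_KEY")
request = build_authed_request("/speech-to-text-models", api_key)
# To send it: urllib.request.urlopen(request)
```

Reading the key from the environment rather than hard-coding it keeps credentials out of source control.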

Policies & Contact:

Terms of Use
Contact Support
License: MIT

Ready to get started? Proceed to the Getting Started guide to make your first API call.