Speechall API Overview

Welcome to the Speechall API documentation!

The Speechall REST API provides speech-to-text (STT) transcription through a single interface to a variety of STT providers and models. Beyond core transcription, the API offers custom text replacement rules and OpenAI-compatible endpoints for easier integration.

Our goal is to provide a unified interface to access diverse STT technologies, simplifying your workflow and giving you the flexibility to choose the best model for your specific needs.

Base URL: https://api.speechall.com/v1 (use https://api.speechall.com, without a version suffix, for the latest stable version, currently v1)

API Version: 0.0.1

Key Capabilities:

  • Flexible Transcription: Convert audio files or streams into text.
  • Multi-Provider/Model Support: Access transcription models from providers such as OpenAI, Deepgram, AssemblyAI, Google, and more, each with its own strengths.
  • Customizable Output: Get transcriptions as plain text, detailed JSON (with timestamps and segments), or subtitle formats (SRT, VTT).
  • Advanced Features: Leverage model-specific or Speechall-added features like:
    • Automatic Punctuation
    • Speaker Diarization (identifying different speakers)
    • Word and Segment Timestamps
    • Custom Vocabulary (improving recognition for specific terms)
    • Initial Prompts (guiding the model)
  • Text Replacement Rules: Define and apply reusable rulesets to automatically find and replace text in your transcriptions (e.g., redacting sensitive information, correcting common misspellings).
  • OpenAI Compatibility: Integrate seamlessly with applications or libraries designed for the OpenAI audio API endpoints (/audio/transcriptions, /audio/translations) using Speechall’s models.
  • Model Discovery: Programmatically list and understand the capabilities of all available speech-to-text models via the API.

Core Endpoints:

  • /transcribe (POST): Directly upload an audio file (raw binary data) for transcription with detailed control over parameters via query strings.
  • /transcribe-remote (POST): Provide a publicly accessible URL to an audio file for transcription, specifying options in a JSON request body.
  • /openai-compatible/audio/transcriptions (POST): Transcribe audio using a multipart/form-data request structure compatible with the OpenAI /audio/transcriptions endpoint.
  • /openai-compatible/audio/translations (POST): Translate non-English audio into English text using a multipart/form-data request structure compatible with the OpenAI /audio/translations endpoint.
  • /replacement-rulesets (POST): Create custom, reusable sets of text replacement rules.
  • /speech-to-text-models (GET): Retrieve a list of all available STT models and their supported features.
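
As a rough illustration of how the first two endpoints differ, the sketch below builds (without sending) one request for each, using only the Python standard library. The option names model and output_format, the file_url body field, the application/octet-stream content type, and the model identifier are placeholders assumed for illustration, not confirmed parts of the API; check the endpoint reference for the actual names.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.speechall.com/v1"


def build_transcribe_request(audio: bytes, api_key: str, **options: str) -> urllib.request.Request:
    """Build a POST /transcribe request: raw audio bytes in the body, options in the query string."""
    query = urllib.parse.urlencode(options)
    url = f"{BASE_URL}/transcribe" + (f"?{query}" if query else "")
    headers = {
        "Authorization": f"Bearer {api_key}",
        # Assumed content type for raw binary audio.
        "Content-Type": "application/octet-stream",
    }
    return urllib.request.Request(url, data=audio, headers=headers, method="POST")


def build_transcribe_remote_request(file_url: str, api_key: str, **options: str) -> urllib.request.Request:
    """Build a POST /transcribe-remote request: audio URL and options in a JSON body."""
    # "file_url" is a hypothetical field name for the publicly accessible audio URL.
    body = json.dumps({"file_url": file_url, **options}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(f"{BASE_URL}/transcribe-remote", data=body, headers=headers, method="POST")


# Placeholder audio bytes, API key, and model identifier.
upload = build_transcribe_request(
    b"\x00\x01",
    "YOUR_API_KEY",
    model="example-provider.example-model",
    output_format="json",
)
remote = build_transcribe_remote_request(
    "https://example.com/audio.mp3",
    "YOUR_API_KEY",
    model="example-provider.example-model",
)
```

Passing either object to urllib.request.urlopen would send it. The requests are only constructed here so that the two shapes, query string plus binary body versus JSON body, can be compared side by side.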

Authentication:

All requests to the Speechall API must be authenticated. We use a Bearer token scheme: send your API key in the Authorization header of every request.
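
A minimal sketch of the Bearer scheme described above, using Python's standard library. The helper name build_authenticated_request and the YOUR_API_KEY placeholder are illustrative only; the request is built but not sent.

```python
import urllib.request

BASE_URL = "https://api.speechall.com/v1"


def build_authenticated_request(path: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request that carries the API key as a Bearer token."""
    request = urllib.request.Request(f"{BASE_URL}{path}", method=method)
    # Bearer scheme: the literal word "Bearer", a space, then the API key.
    request.add_header("Authorization", f"Bearer {api_key}")
    return request


# Example: a request for the model listing endpoint.
models_request = build_authenticated_request("/speech-to-text-models", "YOUR_API_KEY")
```

In real code the key would typically be read from an environment variable or secret store rather than written into the source.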

Policies & Contact:

Ready to get started? Proceed to the Getting Started guide to make your first API call.