Speechall Python SDK

The Speechall Python SDK provides a convenient way to integrate our powerful and flexible speech-to-text API into your Python applications. It offers a clean interface and type hints, and it is generated from our OpenAPI specification to ensure consistency and completeness.

Note: Comprehensive Documentation on GitHub

This page provides an overview and quick start for the Speechall Python SDK. For the most comprehensive and up-to-date documentation, including detailed API explanations, advanced usage examples, and troubleshooting, please refer to the official SDK README on GitHub.

Features

  • Unified Access: Access various speech-to-text (STT) providers and models through a single, consistent API.
  • Type Hinted: Full type hints for all API models and methods, enabling static analysis and better autocompletion.
  • Synchronous & Asynchronous Support: Provides both synchronous and asynchronous clients to fit your application’s needs.
  • File Uploads: Support for transcribing local audio files.
  • Custom Rules: Define and manage custom text replacement rulesets to improve transcription accuracy for specific vocabularies.
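
To make the custom rules idea concrete: a text replacement ruleset is, conceptually, an ordered list of pattern-to-text substitutions applied to a transcript. The sketch below illustrates that idea locally with Python's `re` module. It is not the SDK's ruleset API (see the README on GitHub for that); the rule format and the `apply_rules` helper are hypothetical.

```python
import re

# Hypothetical rules for illustration: (regular expression, replacement text).
RULES = [
    (r"\bspeech all\b", "Speechall"),
    (r"\bs d k\b", "SDK"),
]


def apply_rules(text, rules=RULES):
    """Apply each case-insensitive replacement rule to the text, in order."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return text


print(apply_rules("the speech all python s d k"))
# prints: the Speechall python SDK
```

Rules run in order, so a later rule sees the output of earlier ones.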

Installation

Install the SDK using pip:

pip install speechall

Configuration

First, get an API token from the Speechall Console. It’s recommended to set it as an environment variable.

export SPEECHALL_API_TOKEN='your-token-here'

Then, import and configure the API client in your Python script:

import os
from speechall import ApiClient, Configuration
from speechall.api.speech_to_text_api import SpeechToTextApi

# Get API token from environment variable
api_token = os.getenv("SPEECHALL_API_TOKEN")

if not api_token:
    raise ValueError("SPEECHALL_API_TOKEN environment variable not set.")

# Configure the API client
configuration = Configuration()
configuration.access_token = api_token
# The host is configured by default to 'https://api.speechall.com/v1'

# Create API client
api_client = ApiClient(configuration)
speech_api = SpeechToTextApi(api_client)

Quick Start: Transcribe from URL

Here’s a simple example of how to transcribe an audio file from a URL:

from speechall.models import (
    RemoteTranscriptionConfiguration,
    TranscriptionModelIdentifier,
    TranscriptLanguageCode
)
from speechall.exceptions import ApiException

try:
    config = RemoteTranscriptionConfiguration(
        file_url='https://example.com/audio.mp3', # Replace with your audio URL
        model=TranscriptionModelIdentifier('openai.whisper-1'), # Specify the model
        language=TranscriptLanguageCode.EN, # Optional: specify language
    )

    response = speech_api.transcribe_remote(config)
    print('Transcription:', response.text)

except ApiException as e:
    print('Transcription failed:', e.body)

More Examples & Advanced Usage

Transcribing a Local File

To transcribe a local file, read it in binary mode and pass the data to the transcribe method.

from speechall.models import TranscriptionModelIdentifier
from speechall.exceptions import ApiException
from pathlib import Path

file_path = Path('path/to/your/audio.mp3')

if not file_path.exists():
    print(f"File not found: {file_path}")
else:
    try:
        with open(file_path, "rb") as audio_file:
            # Make transcription request
            result = speech_api.transcribe(
                model=TranscriptionModelIdentifier('assemblyai.best'),
                body=audio_file.read(),
            )
            print("Transcription:", result.text)

    except ApiException as e:
        print(f"API Error: {e.body}")
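
Before uploading, it can help to fail early on obviously unsuitable input. The sketch below is a hypothetical helper, not part of the SDK: `load_audio_bytes` and the extension list are assumptions for illustration, and the API may well accept other formats. It checks the path and returns the bytes you would pass as `body`.

```python
from pathlib import Path

# Assumed set of extensions for illustration; check the API documentation
# for the formats your chosen model actually accepts.
ASSUMED_AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}


def load_audio_bytes(path):
    """Return the file's bytes, raising early on missing, unexpected, or empty files."""
    file_path = Path(path)
    if not file_path.is_file():
        raise FileNotFoundError(f"File not found: {file_path}")
    if file_path.suffix.lower() not in ASSUMED_AUDIO_EXTENSIONS:
        raise ValueError(f"Unexpected file extension: {file_path.suffix!r}")
    data = file_path.read_bytes()
    if not data:
        raise ValueError(f"File is empty: {file_path}")
    return data
```

With this helper, the call above could pass `body=load_audio_bytes('path/to/your/audio.mp3')` instead of opening the file inline.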

Advanced Transcription Options

The SDK supports advanced transcription features such as diarization, custom vocabularies, punctuation control, and timestamp granularity.

import json

from speechall.models import TranscriptionModelIdentifier, TranscriptLanguageCode

# Use a model that supports advanced features, like 'gladia.standard'.
# This reuses speech_api and file_path from the earlier examples; in real code,
# wrap the call in try/except ApiException as shown above.
with open(file_path, "rb") as audio_file:
    result = speech_api.transcribe(
        model=TranscriptionModelIdentifier.GLADIA_DOT_STANDARD,
        body=audio_file.read(),
        language=TranscriptLanguageCode.EN,
        diarization=True,  # Speaker identification
        smart_format=True, # Smart formatting for numbers, dates, etc.
        custom_vocabulary=["Speechall", "Python", "SDK"],
        speakers_expected=2,
        timestamp_granularity="word"
    )

# The result for advanced requests is a detailed object
print(json.dumps(result.to_dict(), indent=2))

Listing Available Models

You can fetch a list of all available models, along with their capabilities and pricing.

try:
    models = speech_api.list_speech_to_text_models()
    for model in models:
        print(f"- {model.id} ({model.display_name})")
except ApiException as e:
    print(f"Error listing models: {e.body}")
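
The model identifiers used on this page follow a `provider.model` pattern (`openai.whisper-1`, `assemblyai.best`, `gladia.standard`). Assuming the listed ids follow the same pattern, a small helper can group them by provider. `group_by_provider` is a hypothetical helper that works on plain strings, not an SDK function.

```python
from collections import defaultdict


def group_by_provider(model_ids):
    """Group 'provider.model' identifier strings by the text before the first dot."""
    grouped = defaultdict(list)
    for model_id in model_ids:
        provider, _, name = model_id.partition(".")
        grouped[provider].append(name)
    return dict(grouped)


# Identifiers taken from the examples on this page.
example_ids = ["openai.whisper-1", "assemblyai.best", "gladia.standard"]
print(group_by_provider(example_ids))
# prints: {'openai': ['whisper-1'], 'assemblyai': ['best'], 'gladia': ['standard']}
```

The helper expects plain strings. Whether `model.id` is a string or an enum-like object depends on the generated models, so convert the ids as needed before passing them in.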

Error Handling

API errors raise an ApiException. Catch it and inspect its status, reason, and body attributes for the specific error details returned by the API.

from speechall.exceptions import ApiException

try:
    # ... your API call here
    pass
except ApiException as e:
    print(f"An API error occurred: {e.status} {e.reason}")
    print(f"Details: {e.body}")
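
Because the exception exposes a `status`, one common pattern is to retry requests that fail with a transient status. The following is a minimal sketch under stated assumptions: `call_with_retries`, the set of retryable statuses, and the delays are illustrative choices, not SDK behavior, and `FakeApiError` is a stand-in so the sketch runs without the SDK installed.

```python
import time


class FakeApiError(Exception):
    """Stand-in for speechall.exceptions.ApiException so the sketch runs without the SDK."""

    def __init__(self, status):
        super().__init__(f"status {status}")
        self.status = status


# Assumed set of transient statuses; adjust to what the API actually returns.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def call_with_retries(func, error_type=FakeApiError, max_attempts=3,
                      base_delay=1.0, sleep=time.sleep):
    """Call func(); retry transient errors with exponential backoff, re-raise others."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except error_type as e:
            status = getattr(e, "status", None)
            # Give up on non-transient errors or when attempts are exhausted.
            if status not in RETRYABLE_STATUSES or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

In real code you would pass the SDK's exception type, for example `call_with_retries(lambda: speech_api.transcribe_remote(config), error_type=ApiException)`.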

Models

The SDK includes Pydantic models for all API requests and responses. These give you type hints for static analysis and autocompletion, and they validate request data on the client side so that what you send to the API is well-formed. You can import models directly from speechall.models.

Next Steps

For the most detailed information, advanced use cases, and information on contributing, see the official SDK README on GitHub.

If you have any questions or encounter issues, feel free to open an issue on the GitHub repository or contact our support.