Speechall Python SDK
The Speechall Python SDK provides a convenient way to integrate our speech-to-text API into your Python applications. It offers a clean interface and type hints, and is generated from our OpenAPI specification to ensure consistency and completeness.
Note: Comprehensive Documentation on GitHub
This page provides an overview and quick start for the Speechall Python SDK. For the most comprehensive and up-to-date documentation, including detailed API explanations, advanced usage examples, and troubleshooting, please refer to the official SDK README on GitHub.
Features
- Unified Access: Access various speech-to-text (STT) providers and models through a single, consistent API.
- Type Hinted: Full support for type hints for all API models and methods, enabling static analysis and better autocompletion.
- Synchronous & Asynchronous Support: Provides both synchronous and asynchronous clients to fit your application’s needs.
- File Uploads: Support for transcribing local audio files.
- Custom Rules: Define and manage custom text replacement rulesets to improve transcription accuracy for specific vocabularies.
Installation
Install the SDK using pip:
pip install speechall
Configuration
First, get an API token from the Speechall Console. It’s recommended to set it as an environment variable.
export SPEECHALL_API_TOKEN='your-token-here'
Then, import and configure the API client in your Python script:
import os
from speechall import ApiClient, Configuration
from speechall.api.speech_to_text_api import SpeechToTextApi
# Get API token from environment variable
api_token = os.getenv("SPEECHALL_API_TOKEN")
if not api_token:
    raise ValueError("SPEECHALL_API_TOKEN environment variable not set.")
# Configure the API client
configuration = Configuration()
configuration.access_token = api_token
# The host is configured by default to 'https://api.speechall.com/v1'
# Create API client
api_client = ApiClient(configuration)
speech_api = SpeechToTextApi(api_client)
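The token check above can also be wrapped in a small helper, which makes it easier to reuse and to test without touching the real environment. The following is a minimal sketch using only the standard library; `load_api_token` and `mask_token` are hypothetical names for illustration and are not part of the Speechall SDK.

```python
import os
from typing import Mapping, Optional


def load_api_token(
    env: Optional[Mapping[str, str]] = None,
    var_name: str = "SPEECHALL_API_TOKEN",
) -> str:
    """Return the API token from the environment, or raise if it is missing or blank."""
    # Hypothetical helper: accepts any mapping so tests can pass a plain dict.
    source = os.environ if env is None else env
    token = source.get(var_name, "").strip()
    if not token:
        raise ValueError(f"{var_name} environment variable not set.")
    return token


def mask_token(token: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the token can appear in logs."""
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]
```

The returned value can then be assigned to `configuration.access_token` exactly as in the snippet above, and `mask_token` can be used if the token ever needs to appear in log output.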
Quick Start: Transcribe from URL
Here’s a simple example of how to transcribe an audio file from a URL:
from speechall.models import (
    RemoteTranscriptionConfiguration,
    TranscriptionModelIdentifier,
    TranscriptLanguageCode,
)
from speechall.exceptions import ApiException
try:
    config = RemoteTranscriptionConfiguration(
        file_url='https://example.com/audio.mp3',  # Replace with your audio URL
        model=TranscriptionModelIdentifier('openai.whisper-1'),  # Specify the model
        language=TranscriptLanguageCode.EN,  # Optional: specify language
    )
    response = speech_api.transcribe_remote(config)
    print('Transcription:', response.text)
except ApiException as e:
    print('Transcription failed:', e.body)
More Examples & Advanced Usage
Transcribing a Local File
To transcribe a local file, read it in binary mode and pass the data to the transcribe method.
from speechall.models import TranscriptionModelIdentifier
from speechall.exceptions import ApiException
from pathlib import Path
file_path = Path('path/to/your/audio.mp3')
if not file_path.exists():
    print(f"File not found: {file_path}")
else:
    try:
        with open(file_path, "rb") as audio_file:
            # Make transcription request
            result = speech_api.transcribe(
                model=TranscriptionModelIdentifier('assemblyai.best'),
                body=audio_file.read(),
            )
        print("Transcription:", result.text)
    except ApiException as e:
        print(f"API Error: {e.body}")
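Reading the file can be separated from the API call so that problems with the file itself (missing, empty, unexpectedly large) surface before any request is made. The sketch below uses only the standard library; `read_audio_bytes` and its `max_bytes` parameter are hypothetical and do not reflect any documented Speechall upload limit.

```python
from pathlib import Path
from typing import Optional, Union


def read_audio_bytes(path: Union[str, Path], max_bytes: Optional[int] = None) -> bytes:
    """Read an audio file in binary mode after a few basic sanity checks."""
    file_path = Path(path)
    if not file_path.is_file():
        raise FileNotFoundError(f"Audio file not found: {file_path}")
    size = file_path.stat().st_size
    if size == 0:
        raise ValueError(f"Audio file is empty: {file_path}")
    # max_bytes is an optional caller-chosen cap, not a limit taken from the API.
    if max_bytes is not None and size > max_bytes:
        raise ValueError(
            f"Audio file is {size} bytes, above the {max_bytes}-byte limit"
        )
    return file_path.read_bytes()
```

The returned bytes can be passed as the `body` argument of `speech_api.transcribe`, in place of `audio_file.read()` in the example above.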
Advanced Transcription Options
The SDK supports various advanced transcription features like diarization, custom vocabularies, punctuation control, and timestamp granularity.
import json
# Use a model that supports advanced features, like 'gladia.standard'
# ... (inside a try block)
with open(file_path, "rb") as audio_file:
    result = speech_api.transcribe(
        model=TranscriptionModelIdentifier.GLADIA_DOT_STANDARD,
        body=audio_file.read(),
        language=TranscriptLanguageCode.EN,
        diarization=True,  # Speaker identification
        smart_format=True,  # Smart formatting for numbers, dates, etc.
        custom_vocabulary=["Speechall", "Python", "SDK"],
        speakers_expected=2,
        timestamp_granularity="word"
    )
# The result for advanced requests is a detailed object
print(json.dumps(result.to_dict(), indent=2))
Listing Available Models
You can fetch a list of all available models, their capabilities, and pricing.
try:
    models = speech_api.list_speech_to_text_models()
    for model in models:
        print(f"- {model.id} ({model.display_name})")
except ApiException as e:
    print(f"Error listing models: {e.body}")
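The model identifiers used on this page ('openai.whisper-1', 'assemblyai.best', 'gladia.standard') follow a provider.model pattern, so a listing can be grouped by provider once the identifiers are available as plain strings. The sketch below assumes that pattern holds for every identifier; `group_by_provider` is a hypothetical helper, not an SDK function.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_provider(model_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Group 'provider.model' identifier strings by their provider prefix."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for model_id in model_ids:
        # Split on the first dot only, so model names may contain dots themselves.
        provider, separator, name = model_id.partition(".")
        if not separator:
            # Identifier without a provider prefix: keep it under a placeholder key.
            provider, name = "(no provider)", model_id
        grouped[provider].append(name)
    return dict(grouped)


# Identifiers taken from the examples on this page.
example_ids = ["openai.whisper-1", "assemblyai.best", "gladia.standard"]
print(group_by_provider(example_ids))
# → {'openai': ['whisper-1'], 'assemblyai': ['best'], 'gladia': ['standard']}
```

How the identifier string is obtained from each entry returned by `list_speech_to_text_models()` depends on the type of `model.id`, which this page does not specify.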
Error Handling
API errors raise an ApiException. You can catch this exception and inspect its body attribute for specific error messages from the API.
from speechall.exceptions import ApiException
try:
    # ... your API call here
    pass
except ApiException as e:
    print(f"An API error occurred: {e.status} {e.reason}")
    print(f"Details: {e.body}")
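Because the exception exposes a `status`, one common pattern is to retry transient failures with exponential backoff and re-raise everything else. The following is a sketch only: the `ApiException` class here is a local stand-in so the block runs on its own (in real code, import it from `speechall.exceptions`), `call_with_retries` is a hypothetical helper, and the set of retryable status codes is a generic HTTP assumption rather than documented Speechall behavior.

```python
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


class ApiException(Exception):
    """Local stand-in exposing the status, reason and body attributes shown above."""

    def __init__(self, status: int, reason: str = "", body: str = ""):
        super().__init__(f"({status}) {reason}")
        self.status = status
        self.reason = reason
        self.body = body


def call_with_retries(
    func: Callable[[], T],
    retry_statuses: Iterable[int] = (429, 500, 502, 503, 504),  # assumed transient
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying with exponential backoff on retryable API errors."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    statuses = set(retry_statuses)
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except ApiException as exc:
            # Give up on non-retryable statuses or once attempts are exhausted.
            if exc.status not in statuses or attempt == max_attempts:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A call such as `speech_api.transcribe_remote(config)` could be wrapped as `call_with_retries(lambda: speech_api.transcribe_remote(config))`. The `sleep` parameter exists so that tests can record delays instead of actually waiting.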
Models
The SDK includes Pydantic models for all API requests and responses. This provides type hinting for better developer experience with static analysis and autocompletion, and ensures that all data sent to the API is valid. You can import models directly from speechall.models.
Next Steps
For the most detailed information, advanced use cases, and contributions:
- Visit the Official SDK GitHub Repository.
- Explore the Speechall Python Example Project.
- Refer to the Main API Reference (OpenAPI) for underlying API details.
If you have any questions or encounter issues, feel free to open an issue on the GitHub repository or contact our support.