Speech-to-Text (STT)

Convert audio files to text with support for multiple languages.

Basic STT Conversion

curl -X POST https://api.mondialspeech.com/api/v1/media/stt \
  -H "Authorization: Bearer <ACCESS_TOKEN>" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@sample.mp3" \
  -F "language=en"
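
For callers who prefer a script to the command line, the same request can be sketched in Python with only the standard library. The endpoint, field names, and header come from the curl example; the helper name `build_stt_request` and the `application/octet-stream` part type are assumptions made for illustration, not part of the documented API.

```python
import io
import urllib.request
import uuid

# Endpoint taken from the curl example above.
STT_URL = "https://api.mondialspeech.com/api/v1/media/stt"


def build_stt_request(audio: bytes, filename: str, token: str,
                      language: str = "en") -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST for the STT endpoint.

    Hypothetical helper: the function name and the generic part content type
    are illustrative choices, not something the API documentation specifies.
    """
    boundary = uuid.uuid4().hex  # random separator between form parts
    body = io.BytesIO()

    # Text field: language
    body.write(f"--{boundary}\r\n".encode())
    body.write(b'Content-Disposition: form-data; name="language"\r\n\r\n')
    body.write(language.encode() + b"\r\n")

    # File field: file
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode()
    )
    body.write(b"Content-Type: application/octet-stream\r\n\r\n")
    body.write(audio + b"\r\n")

    # Closing boundary marks the end of the form
    body.write(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        STT_URL,
        data=body.getvalue(),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, read the audio file in binary mode, pass its bytes to `build_stt_request`, and hand the result to `urllib.request.urlopen`. Building the request separately from sending it keeps the upload logic easy to inspect and test offline.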

Request Parameters

Parameter	Type	Required	Description
`file`	file	Yes	Audio file to convert, uploaded as a multipart form field
`language`	string	No	Language code (default: "en")

Supported Languages

en: English (default)
fa: Persian/Farsi
ar: Arabic
es: Spanish
fr: French
de: German
it: Italian
pt: Portuguese
ru: Russian
zh: Chinese
ja: Japanese
ko: Korean
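
A client can check the language code against the list above before uploading, which avoids spending a request on a value the service does not list. This is a minimal sketch: the name `normalize_language` is hypothetical, and lowercasing the input is a client-side convenience, not documented server behavior.

```python
from typing import Optional

# Codes and names copied from the Supported Languages list above.
SUPPORTED_LANGUAGES = {
    "en": "English",
    "fa": "Persian/Farsi",
    "ar": "Arabic",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
    "ru": "Russian",
    "zh": "Chinese",
    "ja": "Japanese",
    "ko": "Korean",
}

DEFAULT_LANGUAGE = "en"  # documented default for the `language` parameter


def normalize_language(code: Optional[str]) -> str:
    """Return a supported language code, falling back to the documented default.

    Hypothetical helper: raises ValueError for codes not in the list above.
    """
    if code is None or not code.strip():
        return DEFAULT_LANGUAGE
    cleaned = code.strip().lower()  # client-side tidy-up (assumption)
    if cleaned not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {code!r}")
    return cleaned
```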

File Requirements

Supported formats: MP3, WAV, M4A, OGG
Maximum file size: 25 MB
Audio quality: Clear speech recommended
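
These requirements can be checked locally before an upload is attempted. The sketch below assumes the format is identifiable from the file extension and reads the size limit as 25,000,000 bytes, the stricter interpretation, since the page does not say whether decimal or binary megabytes are meant. The name `check_audio_file` is hypothetical.

```python
import os
from typing import List

# Formats listed under File Requirements, matched by extension (assumption).
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg"}

# Assumption: the limit is read as 25 decimal megabytes, the conservative
# interpretation; the documentation does not define the unit precisely.
MAX_FILE_SIZE_BYTES = 25 * 1000 * 1000


def check_audio_file(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"Unsupported format: {extension or '(no extension)'}")
    if size_bytes <= 0:
        problems.append("File is empty")
    elif size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(f"File is {size_bytes} bytes, over the {MAX_FILE_SIZE_BYTES} byte limit")
    return problems
```

In practice the size would come from `os.path.getsize(path)`. Returning a list of problems instead of raising on the first one lets a caller report every issue at once.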

Cost Estimation

curl -X GET "https://api.mondialspeech.com/api/v1/media/estimate/stt?fileSizeBytes=1024000" \
  -H "Authorization: Bearer <ACCESS_TOKEN>"
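
The estimate call can be prepared the same way in Python. The URL and the `fileSizeBytes` query parameter come from the curl example; the helper name `build_estimate_request` is hypothetical. The page does not show the estimate response body, so this sketch stops at building the request and does not parse the result.

```python
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
ESTIMATE_URL = "https://api.mondialspeech.com/api/v1/media/estimate/stt"


def build_estimate_request(file_size_bytes: int, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for an STT cost estimate."""
    if file_size_bytes <= 0:
        raise ValueError("file_size_bytes must be positive")
    # urlencode handles escaping of the query string
    query = urllib.parse.urlencode({"fileSizeBytes": file_size_bytes})
    return urllib.request.Request(
        f"{ESTIMATE_URL}?{query}",
        method="GET",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Passing `os.path.getsize(path)` as `file_size_bytes` gives an estimate for the exact file that is about to be uploaded.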

Response

A successful conversion request (`POST /api/v1/media/stt`) returns the transcription as JSON:

{
  "text": "Transcribed text from audio file",
  "confidence": 0.95,
  "language": "en",
  "duration": 30.5
}
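
The fields in this response can be read into a small typed structure so that later code does not depend on raw dictionary keys. The sketch assumes all four fields are always present, as in the sample. The names `Transcription`, `parse_transcription`, and `needs_review`, and the 0.8 confidence threshold, are illustrative choices and not part of the API.

```python
import json
from dataclasses import dataclass


@dataclass
class Transcription:
    """Mirror of the four fields shown in the sample response."""
    text: str
    confidence: float
    language: str
    duration: float


def parse_transcription(payload: str) -> Transcription:
    """Parse the JSON body of a successful STT response.

    Raises KeyError if a field from the sample response is missing.
    """
    data = json.loads(payload)
    return Transcription(
        text=data["text"],
        confidence=float(data["confidence"]),
        language=data["language"],
        duration=float(data["duration"]),
    )


def needs_review(result: Transcription, threshold: float = 0.8) -> bool:
    """Flag low-confidence results; the 0.8 default is an arbitrary example."""
    return result.confidence < threshold
```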

Error Handling

402 Payment Required (Insufficient Tokens)

{ "error": "Insufficient tokens", "tokens_needed": 500 }

400 Bad Request (Invalid File)

{ "error": "Invalid audio file format" }

413 Payload Too Large

{ "error": "File size exceeds limit" }
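
A client can turn these three documented errors into actionable messages. This is a sketch under the assumption that error bodies always follow the `{"error": ...}` shape shown above; the function name `explain_error` and the wording of the advice are hypothetical, and any other status code is passed through generically.

```python
import json


def explain_error(status: int, body: str) -> str:
    """Map a failed STT response to a short, human-readable message."""
    try:
        data = json.loads(body)
    except ValueError:  # body was not valid JSON
        data = {}
    if not isinstance(data, dict):
        data = {}
    message = data.get("error", "Unknown error")

    if status == 402:
        # tokens_needed appears in the documented 402 body
        needed = data.get("tokens_needed")
        detail = f" (tokens_needed={needed})" if needed is not None else ""
        return f"{message}{detail}: add tokens before retrying."
    if status == 400:
        return f"{message}: use one of MP3, WAV, M4A, OGG."
    if status == 413:
        return f"{message}: reduce the file below the 25MB limit."
    return f"HTTP {status}: {message}"
```

With `urllib.request`, the status and body of a failed call are available on the raised `urllib.error.HTTPError` as `err.code` and `err.read().decode()`.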

Best Practices

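Use recordings with clear speech; transcription accuracy depends on audio quality
Set `language` to the language spoken in the audio; it defaults to "en"
Check the file size against the 25MB limit before uploading
Call the cost estimation endpoint before uploading large files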
