Mondial AI Docs

Speech-to-Text (STT)

Convert audio files to text with support for multiple languages.

Basic STT Conversion

curl -X POST https://api.mondialspeech.com/api/v1/media/stt \
  -H "Authorization: Bearer <ACCESS_TOKEN>" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@sample.mp3" \
  -F "language=en"
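For callers who prefer a script to curl, the same request can be sketched in Python with only the standard library. The helper names `build_multipart` and `transcribe` are illustrative and not part of any Mondial SDK, and the sketch assumes the endpoint returns a JSON body on success.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

STT_URL = "https://api.mondialspeech.com/api/v1/media/stt"


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    Assumes the filename contains no double quotes or line breaks.
    """
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n".encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, access_token, language="en"):
    """POST an audio file to the STT endpoint and return the decoded JSON body."""
    audio = Path(path)
    body, content_type = build_multipart(
        {"language": language}, "file", audio.name, audio.read_bytes()
    )
    request = urllib.request.Request(
        STT_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": content_type,
        },
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `transcribe("sample.mp3", "<ACCESS_TOKEN>", language="en")` mirrors the curl command above. The Content-Type header carries the generated boundary, which the server needs in order to split the parts.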

Request Parameters

Parameter   Type     Required   Description
file        file     Yes        Audio file to convert (IFormFile)
language    string   No         Language code (default: "en")

Supported Languages

  • en: English (default)
  • fa: Persian/Farsi
  • ar: Arabic
  • es: Spanish
  • fr: French
  • de: German
  • it: Italian
  • pt: Portuguese
  • ru: Russian
  • zh: Chinese
  • ja: Japanese
  • ko: Korean
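A client can check the language code against the list above before sending a request. The following is a minimal sketch: `resolve_language` is a hypothetical helper, and trimming and lowercasing the input is a client-side convenience, since the documentation does not say how the server treats other casings or unknown codes.

```python
# Codes and names copied from the Supported Languages list.
SUPPORTED_LANGUAGES = {
    "en": "English",
    "fa": "Persian/Farsi",
    "ar": "Arabic",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
    "ru": "Russian",
    "zh": "Chinese",
    "ja": "Japanese",
    "ko": "Korean",
}
DEFAULT_LANGUAGE = "en"


def resolve_language(code=None):
    """Return a supported code, using the documented default when none is given."""
    if code is None or not code.strip():
        return DEFAULT_LANGUAGE
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        supported = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(f"Unsupported language {code!r}; expected one of: {supported}")
    return normalized
```

Rejecting an unknown code locally avoids spending an upload on a request that may be refused or transcribed in the wrong language.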

File Requirements

  • Supported formats: MP3, WAV, M4A, OGG
  • Maximum file size: 25MB
  • Audio quality: Clear speech recommended
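The format and size requirements above can be checked locally before an upload. This sketch makes two assumptions that the documentation does not confirm: the format is judged by file extension only, and "25MB" is read as 25 × 1024 × 1024 bytes. `check_audio_file` is a hypothetical helper name.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg"}
# Assumption: the 25MB limit is interpreted as 25 MiB.
MAX_FILE_SIZE_BYTES = 25 * 1024 * 1024


def check_audio_file(path):
    """Return a list of problems; an empty list means the local checks passed."""
    audio = Path(path)
    if not audio.is_file():
        return [f"{audio} does not exist or is not a regular file"]

    problems = []
    if audio.suffix.lower() not in ALLOWED_EXTENSIONS:
        allowed = ", ".join(sorted(ALLOWED_EXTENSIONS))
        problems.append(f"extension {audio.suffix!r} is not one of: {allowed}")

    size = audio.stat().st_size
    if size > MAX_FILE_SIZE_BYTES:
        problems.append(f"size {size} bytes exceeds {MAX_FILE_SIZE_BYTES} bytes")
    return problems
```

An extension check cannot tell whether the contents are valid audio, so the server may still answer with 400 for a file that passes here.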

Cost Estimation

curl -X GET "https://api.mondialspeech.com/api/v1/media/estimate/stt?fileSizeBytes=1024000" \
  -H "Authorization: Bearer <ACCESS_TOKEN>"
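The `fileSizeBytes` value in the request above can be taken from a local file instead of being typed by hand. The sketch below uses hypothetical helper names and returns the response body as raw text, because the shape of the estimate response is not documented here.

```python
import os
import urllib.parse
import urllib.request

ESTIMATE_URL = "https://api.mondialspeech.com/api/v1/media/estimate/stt"


def build_estimate_url(file_size_bytes):
    """Build the estimate URL for a file of the given size in bytes."""
    if file_size_bytes < 0:
        raise ValueError("file_size_bytes must not be negative")
    query = urllib.parse.urlencode({"fileSizeBytes": file_size_bytes})
    return f"{ESTIMATE_URL}?{query}"


def fetch_estimate(path, access_token):
    """GET the estimate for a local file and return the raw response text."""
    url = build_estimate_url(os.path.getsize(path))
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

`build_estimate_url(1024000)` produces the same URL as the curl example.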

Response (STT Conversion)

{
  "text": "Transcribed text from audio file",
  "confidence": 0.95,
  "language": "en",
  "duration": 30.5
}
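The fields in the sample response above can be loaded into a small typed record so that a missing field fails early. This is a sketch under stated assumptions: `Transcription` and `needs_review` are hypothetical names, `confidence` is assumed to lie between 0 and 1 as in the sample, and the 0.8 review threshold is an arbitrary illustration rather than a documented value.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcription:
    text: str
    confidence: float
    language: str
    duration: float


def parse_transcription(payload):
    """Parse a JSON response body; raises KeyError if a field is missing."""
    data = json.loads(payload)
    return Transcription(
        text=str(data["text"]),
        confidence=float(data["confidence"]),
        language=str(data["language"]),
        duration=float(data["duration"]),
    )


def needs_review(result, threshold=0.8):
    """Flag results whose confidence is below an (assumed) review threshold."""
    return result.confidence < threshold
```

Keeping the parse step separate from the HTTP call makes it easy to exercise against saved responses.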

Error Handling

402 Payment Required (Insufficient Tokens)

{ "error": "Insufficient tokens", "tokens_needed": 500 }

400 Bad Request (Invalid File)

{ "error": "Invalid audio file format" }

413 Payload Too Large

{ "error": "File size exceeds limit" }
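The three error responses above share an `error` field, so a client can turn them into one exception type. The sketch below assumes that error bodies are JSON objects shaped like these samples; `SttError`, `error_from_response`, and `describe` are hypothetical names.

```python
import json


class SttError(Exception):
    """Error returned by the STT API, with the HTTP status attached."""

    def __init__(self, status, message, tokens_needed=None):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message
        self.tokens_needed = tokens_needed


def error_from_response(status, body):
    """Build an SttError from a status code and a raw response body."""
    try:
        data = json.loads(body)
    except (ValueError, TypeError):
        data = {}
    if not isinstance(data, dict):
        data = {}
    message = data.get("error") or f"HTTP {status}"
    return SttError(status, message, data.get("tokens_needed"))


def describe(error):
    """Suggest a next step for the documented status codes."""
    if error.status == 402:
        return f"Add tokens before retrying (needed: {error.tokens_needed})."
    if error.status == 400:
        return "Convert the audio to MP3, WAV, M4A, or OGG and retry."
    if error.status == 413:
        return "Reduce the file to 25MB or less and retry."
    return "Unexpected error; inspect the status and message."
```

With `urllib`, the inputs come from a caught `urllib.error.HTTPError`: pass `exc.code` and `exc.read()` to `error_from_response`.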

Best Practices

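  1. Use audio with clear speech for better accuracy
  2. Set the language parameter to match the spoken language (it defaults to "en")
  3. Check that the file is within the 25MB limit before uploading to avoid a 413 response
  4. Call the estimate endpoint before converting large files to confirm the cost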