OpenAI Text-to-Speech (legacy)

Generate speech from text using the legacy TTS models: tts-1 for lower latency or tts-1-hd for higher quality. Supports 9 built-in voices with adjustable speed and output format. Natural-language delivery instructions are not supported. Maximum 4,096 characters per request.

Overview

OpenAI Text-to-Speech (legacy) turns up to 4,096 characters of text into speech audio using tts-1 or tts-1-hd, a built-in OpenAI voice, speed control, and audio-format selection. Use it for straightforward voiceover drafts, narration, ad reads, and product-demo scripts when you do not need natural-language delivery instructions.

Use cases

  • Generate a quick voiceover draft for a product demo, ad concept, explainer, or internal review.
  • Create MP3, WAV, FLAC, AAC, Opus, or PCM audio for different editing or handoff needs.
  • Compare built-in OpenAI voices before choosing a direction for a campaign asset.
  • Use tts-1 for lower-latency drafts or tts-1-hd when the higher-quality legacy model matters.

Input tips

  • Keep input to 4,096 characters or fewer.
  • Choose one built-in voice: alloy, ash, coral, echo, fable, onyx, nova, sage, or shimmer.
  • Select tts-1 for lower latency or tts-1-hd for the higher-quality legacy model.
  • Set speed (any value from 0.25 to 4.0) only when pacing needs to change.
  • Choose response_format when a specific audio format matters; mp3 is the default.
  • Use OpenAI GPT-4o mini Text-to-Speech when tone, accent, emotion, or delivery instructions matter.
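
The input tips above can be checked before a run is submitted. The following is a minimal sketch, not the AI Tool's actual interface: the helper name build_tts_inputs is hypothetical, and the field names input, voice, and model are assumptions (only speed and response_format are named on this page). The limits and allowed values come from the tips above.

```python
# Allowed values taken from the input tips; field names are assumed, not confirmed.
VOICES = {"alloy", "ash", "coral", "echo", "fable", "onyx", "nova", "sage", "shimmer"}
MODELS = {"tts-1", "tts-1-hd"}
FORMATS = {"mp3", "wav", "flac", "aac", "opus", "pcm"}
MAX_INPUT_CHARS = 4096
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def build_tts_inputs(text, voice, model, speed=None, response_format=None):
    """Validate the inputs and return them as a dict (hypothetical helper)."""
    if not text:
        raise ValueError("input text is empty")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input has {len(text)} characters; limit is {MAX_INPUT_CHARS}")
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")

    inputs = {"input": text, "voice": voice, "model": model}

    # Only include optional fields when the caller sets them,
    # so the tool's own defaults (mp3 for the format) apply otherwise.
    if speed is not None:
        if not MIN_SPEED <= speed <= MAX_SPEED:
            raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
        inputs["speed"] = speed
    if response_format is not None:
        if response_format not in FORMATS:
            raise ValueError(f"unknown response_format: {response_format!r}")
        inputs["response_format"] = response_format
    return inputs
```

For example, build_tts_inputs("Welcome to the demo.", voice="nova", model="tts-1-hd", response_format="wav") returns a dict with four keys, while a speed of 5.0 or a voice outside the built-in list raises ValueError before anything is sent.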

Expected output

The AI Tool returns one generated speech audio file with a downloadable URL, content type, file size, and cost metadata. The output view renders an audio player and shows the OpenAI tts-1 / tts-1-hd provider label, format, and file size.

Caveats

  • This legacy AI Tool does not support natural-language voice instructions.
  • Generated speech should be reviewed for pronunciation, pacing, tone, and brand fit.
  • This AI Tool returns audio only; it does not return timing JSON, transcripts, or voice IDs.
  • Scripts longer than 4,096 characters must be split into separate runs because input is capped at that length.
  • Built-in voices are fixed options; use another TTS AI Tool when custom or cloned voices are needed.
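
For the character cap noted in the caveats, one way to prepare a long script is to split it at sentence boundaries so each run stays within 4,096 characters. This is a rough sketch under stated assumptions: split_script is a hypothetical helper, sentence endings are detected with a simple punctuation rule, and length is measured with Python's len(), which may not match exactly how the provider counts characters.

```python
import re

MAX_CHARS = 4096  # per-request cap stated in this document


def split_script(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters,
    breaking between sentences where possible."""
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this sentence would exceed the cap; start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be submitted as its own run. Because every run produces a separate audio file, the clips still need to be joined in an editor afterwards, and pacing across chunk boundaries should be reviewed.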