The Audio API provides two speech-to-text endpoints: transcriptions and translations
Derived from: EndPoint
Operations
Generates audio from the input text
function : Speak(model:String, input:String, voice:String, token:String) ~ Pair<String,ByteArrayRef>
Name | Type | Description |
---|---|---|
model | String | TTS model to use, either 'tts-1' or 'tts-1-hd' |
input | String | text to generate audio for; the maximum length is 4096 characters |
voice | String | voice to use when generating the audio. Supported voices are 'alloy', 'echo', 'fable', 'onyx', 'nova', and 'shimmer'. |
token | String | API token |
Type | Description |
---|---|
Pair<String,ByteArrayRef> | response with type and content, Nil if unsuccessful |
Generates audio from the input text in the requested output format
function : Speak(model:String, input:String, voice:String, response_format:String, token:String) ~ Pair<String,ByteArrayRef>
Name | Type | Description |
---|---|---|
model | String | TTS model to use, either 'tts-1' or 'tts-1-hd' |
input | String | text to generate audio for; the maximum length is 4096 characters |
voice | String | voice to use when generating the audio. Supported voices are 'alloy', 'echo', 'fable', 'onyx', 'nova', and 'shimmer'. |
response_format | String | format of the generated audio. Supported formats are 'mp3', 'opus', 'aac', 'flac', 'wav', and 'pcm'. |
token | String | API token |
Type | Description |
---|---|
Pair<String,ByteArrayRef> | response with type and content, Nil if unsuccessful |
Generates audio from the input text in the requested output format and at the requested speed
function : Speak(model:String, input:String, voice:String, response_format:String, speed:Float, token:String) ~ Pair<String,ByteArrayRef>
Name | Type | Description |
---|---|---|
model | String | TTS model to use, either 'tts-1' or 'tts-1-hd' |
input | String | text to generate audio for; the maximum length is 4096 characters |
voice | String | voice to use when generating the audio. Supported voices are 'alloy', 'echo', 'fable', 'onyx', 'nova', and 'shimmer'. |
response_format | String | format of the generated audio. Supported formats are 'mp3', 'opus', 'aac', 'flac', 'wav', and 'pcm'. |
speed | Float | speed of the generated audio, from 0.25 to 4.0; the default is 1.0 |
token | String | API token |
Type | Description |
---|---|
Pair<String,ByteArrayRef> | response with type and content, Nil if unsuccessful |
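
The parameter constraints listed in the tables above can be checked before a request is sent, so that a bad argument is caught locally instead of producing a Nil response. The following is a minimal sketch in Python and is not part of this API: `check_speak_args` and its constants are hypothetical names chosen for illustration, and the allowed values are copied from the tables above. Whether the speed bounds are inclusive is an assumption.

```python
from typing import List, Optional

# Allowed values, copied from the parameter tables above.
MODELS = {"tts-1", "tts-1-hd"}
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MAX_INPUT_CHARS = 4096
MIN_SPEED, MAX_SPEED = 0.25, 4.0  # assumed inclusive


def check_speak_args(model: str, text: str, voice: str,
                     response_format: Optional[str] = None,
                     speed: Optional[float] = None) -> List[str]:
    """Hypothetical helper: return a list of problems, empty if none found.

    response_format and speed are optional, mirroring the three
    Speak overloads documented above.
    """
    problems = []
    if model not in MODELS:
        problems.append(f"unknown model: {model!r}")
    if len(text) > MAX_INPUT_CHARS:
        problems.append(
            f"input has {len(text)} characters, limit is {MAX_INPUT_CHARS}")
    if voice not in VOICES:
        problems.append(f"unknown voice: {voice!r}")
    if response_format is not None and response_format not in FORMATS:
        problems.append(f"unknown response_format: {response_format!r}")
    if speed is not None and not (MIN_SPEED <= speed <= MAX_SPEED):
        problems.append(
            f"speed {speed} is outside {MIN_SPEED} to {MAX_SPEED}")
    return problems


if __name__ == "__main__":
    # Every argument here violates a documented constraint.
    for problem in check_speak_args("tts-2", "hello", "bob", "ogg", 5.0):
        print(problem)
```

Returning a list of problems rather than raising on the first one lets a caller report every invalid argument at once.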