Text-to-Speech
POST
https://www.kkiai.com/ent/v2/audio-tts
Official Documentation: https://platform.vidu.cn/docs/text-to-speech
Request Parameters
Authorization
Add an Authorization header to the request. Its value is the word Bearer, followed by a space and your token.
Example: Authorization: Bearer ********************
Header Parameters
| Parameter Name | Type | Required | Description | Example |
|---|---|---|---|---|
| Authorization | string | Optional | Bearer {{YOUR_API_KEY}} | |
| Content-Type | string | Optional | application/json | |
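As a minimal sketch of the two headers above, the following Python helper builds them from a token. The function name `build_headers` and the environment variable `VIDU_API_TOKEN` are hypothetical names chosen for illustration; they are not part of the API.

```python
import os


def build_headers(token: str) -> dict[str, str]:
    """Return the two headers listed in the table above."""
    token = token.strip()
    if not token:
        raise ValueError("token must not be empty")
    return {
        # The value is the word "Bearer", a space, then the token.
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


# VIDU_API_TOKEN is an assumed variable name; "demo-token" is a placeholder.
headers = build_headers(os.environ.get("VIDU_API_TOKEN", "demo-token"))
```

Reading the token from the environment keeps it out of source files; any other secret store would serve equally well.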
Body Parameters (application/json)
| Parameter Name | Type | Required | Description |
|---|---|---|---|
| text | string | Required | Text to synthesize into speech. (1) Length limit: fewer than 10000 characters. (2) Paragraph breaks are marked with line breaks. (3) Pause control: insert a <#x#> marker between two pronounceable text segments, where x is the pause duration in seconds, in the range [0.01, 99.99] with at most two decimal places. Pause markers cannot be used consecutively. Example: Hello<#2#>I am vidu<#2#>nice to meet you! |
| voice_setting_voice_id | string | Required | Voice ID for the synthesized audio. See the voice list for all available voices: https://shengshu.feishu.cn/sheets/EgFvs6DShhiEBStmjzccr5gonOg |
| voice_setting_speed | string | Optional | Speech speed. Range [0.5, 2], default 1.0 (normal speed). 0.5 is the slowest and 2 is the fastest. |
| voice_setting_volume | string | Optional | Volume level. Range [0, 10], default 0 (normal volume). Higher values are louder. |
| voice_setting_pitch | string | Optional | Pitch of the synthesized audio. Range [-12, 12], default 0 (the original voice output). |
| voice_setting_emotion | string | Optional | Emotion of the synthesized speech. (1) Allowed values: ["happy", "sad", "angry", "fearful", "disgusted", "surprised", "calm"], corresponding to seven emotions: happy, sad, angry, fearful, disgusted, surprised, and neutral. (2) The model automatically matches an appropriate emotion to the input text, so manual specification is generally not needed. |
| pronunciation_dict_tone | string | Optional | Pronunciation rules for polyphonic characters: phonetic annotations or replacement rules for specific characters or symbols that need special marking. For polyphonic characters in Chinese text, tones are written as numbers: first tone is 1, second tone is 2, third tone is 3, fourth tone is 4, and neutral tone is 5. Examples: ["燕少飞/(yan4)(shao3)(fei1)", "达菲/(da2)(fei1)", "omg/oh my god"] |
| payload | string | Optional | Pass-through parameter. It is transmitted as-is without any processing. Maximum length: 1048576 characters. |
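The pause syntax and the value ranges in the table can be checked on the client before sending. The sketch below is one way to do that in Python. The helper names (`pause`, `join_with_pauses`, `build_body`) are hypothetical. Sending the numeric settings as strings follows the "string" type listed in the table and is an assumption about what the server expects.

```python
from typing import Optional

# Allowed values for voice_setting_emotion, as listed in the table.
EMOTIONS = {"happy", "sad", "angry", "fearful", "disgusted", "surprised", "calm"}


def pause(seconds: float) -> str:
    """Format a pause marker such as <#2#> or <#0.5#>."""
    if not 0.01 <= seconds <= 99.99:
        raise ValueError("pause must be within [0.01, 99.99] seconds")
    # Keep at most two decimal places, then drop trailing zeros.
    value = f"{seconds:.2f}".rstrip("0").rstrip(".")
    return f"<#{value}#>"


def join_with_pauses(segments: list[str], seconds: float) -> str:
    """Join non-empty segments with exactly one marker between each pair,
    so markers never appear consecutively or at the ends of the text."""
    cleaned = [s for s in segments if s.strip()]
    return pause(seconds).join(cleaned)


def build_body(text: str, voice_id: str, speed: Optional[float] = None,
               volume: Optional[float] = None, pitch: Optional[float] = None,
               emotion: Optional[str] = None) -> dict:
    """Validate the documented ranges and return the JSON body as a dict."""
    if not text or len(text) >= 10000:
        raise ValueError("text must be non-empty and shorter than 10000 characters")
    body = {"text": text, "voice_setting_voice_id": voice_id}
    ranges = {"speed": (speed, 0.5, 2), "volume": (volume, 0, 10), "pitch": (pitch, -12, 12)}
    for name, (value, low, high) in ranges.items():
        if value is None:
            continue  # omit the field so the server default applies
        if not low <= value <= high:
            raise ValueError(f"{name} must be within [{low}, {high}]")
        # Assumption: numeric settings are serialized as strings.
        body[f"voice_setting_{name}"] = str(value)
    if emotion is not None:
        if emotion not in EMOTIONS:
            raise ValueError(f"unknown emotion: {emotion}")
        body["voice_setting_emotion"] = emotion
    return body


text = join_with_pauses(["Hello", "I am vidu", "nice to meet you!"], 2)
body = build_body(text, "male-qn-daxuesheng", speed=1.5)
```

Here `text` reproduces the pause example from the table, and optional settings that are left as `None` are simply not sent.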
Request Example
```json
{
  "text": "Artificial intelligence is changing the way we live, from smart homes to autonomous driving. Advances in technology are making the world more convenient.",
  "voice_setting_voice_id": "male-qn-daxuesheng"
}
```
cURL Example
```bash
curl --location --request POST 'https://www.kkiai.com/ent/v2/audio-tts' \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "text": "Artificial intelligence is changing the way we live, from smart homes to autonomous driving. Advances in technology are making the world more convenient.",
    "voice_setting_voice_id": "male-qn-daxuesheng"
  }'
```
Response
🟢 200 Success
Response Body
| Parameter Name | Type | Required | Description |
|---|---|---|---|
| task_id | string | Required | |
| state | string | Required | |
| model | string | Required | |
| prompt | string | Required | |
| duration | integer | Required | |
| seed | integer | Required | |
| created_at | string | Required | |
| credits | integer | Required | |
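The fields above can be read into a small typed structure. This is only a sketch: the class name `TtsTask` is hypothetical, and because the table gives no field descriptions, the code checks only that each field is present and does not interpret its meaning.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class TtsTask:
    task_id: str
    state: str
    model: str
    prompt: str
    duration: int
    seed: int
    created_at: str
    credits: int

    @classmethod
    def from_json(cls, raw: str) -> "TtsTask":
        """Parse a response body, failing loudly if a required field is absent."""
        data = json.loads(raw)
        names = [f.name for f in fields(cls)]
        missing = [name for name in names if name not in data]
        if missing:
            raise KeyError(f"response is missing fields: {missing}")
        # Ignore any extra keys the server may add.
        return cls(**{name: data[name] for name in names})


# Sample values taken from this page's response example.
sample = (
    '{"task_id": "911094612548939776", "state": "created", "model": "audio1.0", '
    '"prompt": "The sound of raindrops falling on a window, accompanied by soft thunder.", '
    '"duration": 5, "seed": 0, "created_at": "2026-01-20T07:16:38.094635957Z", "credits": 10}'
)
task = TtsTask.from_json(sample)
```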
Response Example
```json
{
  "task_id": "911094612548939776",
  "state": "created",
  "model": "audio1.0",
  "prompt": "The sound of raindrops falling on a window, accompanied by soft thunder.",
  "duration": 5,
  "seed": 0,
  "created_at": "2026-01-20T07:16:38.094635957Z",
  "credits": 10
}
```
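For callers not using cURL, the same request can be sketched with the Python standard library. The function names `build_request` and `submit` are hypothetical, and the 30-second timeout is an arbitrary choice. The sketch assumes a 200 response carries a JSON body with the fields shown above. Error handling is left out.

```python
import json
import urllib.request

API_URL = "https://www.kkiai.com/ent/v2/audio-tts"


def build_request(token: str, text: str, voice_id: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = json.dumps({"text": text, "voice_setting_voice_id": voice_id}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(token: str, text: str, voice_id: str, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_request(token, text, voice_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a real network request, so it is left commented out):
# result = submit("<token>", "Hello<#2#>I am vidu", "male-qn-daxuesheng")
# print(result["task_id"], result["state"])
```

Keeping `build_request` separate from `submit` makes it possible to inspect the URL, headers, and body without touching the network.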