Create Chat Completion DeepSeek v3.1 Thinking Degree (Streaming)

POST https://www.kkiai.com/v1/chat/completions

Request Parameters

Authorization

Add an Authorization header to the request. Its value is the word Bearer, followed by a space and your token.

Example: Authorization: Bearer ********************

Header Parameters

| Parameter Name | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| Content-Type | string | Required | | application/json |
| Accept | string | Required | | application/json |
| Authorization | string | Optional | | Bearer {{YOUR_API_KEY}} |
| X-Forwarded-Host | string | Optional | | localhost:5173 |
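
The header values above can be assembled in one place. The following is a minimal sketch, not part of the API itself: `build_headers` is a hypothetical helper name, and the check on the token is an assumption added for illustration.

```python
from typing import Dict


def build_headers(token: str) -> Dict[str, str]:
    """Return the request headers listed in this section for a given token.

    Hypothetical helper: the function name and the validation are
    illustrative assumptions, not something the API requires.
    """
    if not token or token != token.strip():
        raise ValueError("token must be non-empty with no surrounding whitespace")
    return {
        "Content-Type": "application/json",
        "Accept": "application/json",
        # The value is "Bearer", one space, then the token.
        "Authorization": f"Bearer {token}",
    }
```

The optional X-Forwarded-Host header is left out here and can be added to the returned dictionary when needed.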

Body Parameters (application/json)

| Parameter Name | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Required | The ID of the model to use. |
| max_tokens | integer | Optional | The maximum number of tokens the model may generate in the completion for a single request. The combined length of input and output tokens is limited by the model's context length. |
| messages | array[object] | Required | |
| └ role | string | Required | |
| └ content | string | Required | |
| temperature | number | Optional | Sampling temperature, between 0 and 2. Higher values (such as 0.8) make the output more random; lower values (such as 0.2) make it more focused and deterministic. |
| stream | boolean | Optional | If set to true, the message is sent incrementally as SSE (server-sent events). The message stream ends with data: [DONE]. |
| stream_options | object | Optional | Options for streaming output. This parameter can only be set when stream is true. |
| └ include_usage | boolean | Optional | If set to true, one additional chunk is sent before data: [DONE] at the end of the stream. The usage field on this chunk holds token usage statistics for the entire request, and its choices field is always an empty array. All other chunks also contain a usage field, but its value is null. |
| thinking | object | Optional | For models that support deep thinking, controls whether deep thinking is used. |
| └ type | string | Optional | enabled (default): always uses deep thinking. disabled: never uses deep thinking. auto: the model decides whether to use deep thinking. |
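
The constraints in the table above (the temperature range, stream_options only with stream, the three thinking types) can be checked on the client before sending. This is a hedged sketch: `build_body` and its keyword arguments are hypothetical names, and whether the server rejects the same inputs is not stated in this document.

```python
from typing import Any, Dict, List, Optional

THINKING_TYPES = {"enabled", "disabled", "auto"}


def build_body(
    model: str,
    messages: List[Dict[str, str]],
    *,
    max_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    stream: bool = False,
    include_usage: bool = False,
    thinking: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a request body, applying the rules from the parameter table.

    Hypothetical helper: client-side validation is an assumption made
    for illustration.
    """
    for message in messages:
        if not isinstance(message.get("role"), str) or not isinstance(
            message.get("content"), str
        ):
            raise ValueError("each message needs string 'role' and 'content'")
    body: Dict[str, Any] = {"model": model, "messages": list(messages)}
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        body["temperature"] = temperature
    if include_usage and not stream:
        # stream_options can only be set when stream is true.
        raise ValueError("include_usage requires stream=True")
    if stream:
        body["stream"] = True
        if include_usage:
            body["stream_options"] = {"include_usage": True}
    if thinking is not None:
        if thinking not in THINKING_TYPES:
            raise ValueError("thinking must be enabled, disabled, or auto")
        body["thinking"] = {"type": thinking}
    return body
```

Optional parameters that are not passed are omitted from the body rather than sent as null, so the server-side defaults apply.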

Request Example

json
{
  "model": "deepseek-v3-1-250821",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello"
    }
  ],
  "temperature": 1.0,
  "stream": true,
  "stream_options": {
    "include_usage": true
  },
  "thinking": {
    "type": "enabled"
  }
}
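
The same request can be prepared from Python with only the standard library. This is a sketch under assumptions: `make_request` is a hypothetical helper name, and the token passed to it is a placeholder. The function only constructs the request object; nothing is sent until it is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request
from typing import Any, Dict

API_URL = "https://www.kkiai.com/v1/chat/completions"


def make_request(token: str, body: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) a POST request for the endpoint above.

    Hypothetical helper for illustration only.
    """
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),  # JSON body as UTF-8 bytes
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` returns a response object that can be iterated line by line, which suits a streamed reply.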

cURL Example

bash
curl --location --request POST 'https://www.kkiai.com/v1/chat/completions' \
--header 'Accept: application/json' \
--header 'Authorization: Bearer <token>' \
--header 'Content-Type: application/json' \
--data-raw '{
  "model": "deepseek-v3-1-250821",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello"
    }
  ],
  "temperature": 1.0,
  "stream": true,
  "stream_options": {
    "include_usage": true
  },
  "thinking": {
    "type": "enabled"
  }
}'
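
With stream set to true, the reply arrives as SSE lines that end with data: [DONE], and with include_usage set to true the last chunk before it carries the usage statistics. The sketch below shows one way to read such a stream. The names `iter_sse_chunks` and `final_usage` are hypothetical, and the code assumes each event fits on a single `data:` line holding one JSON object, which this document does not state explicitly.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Optional, Union


def iter_sse_chunks(lines: Iterable[Union[bytes, str]]) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON chunk per 'data:' line, stopping at [DONE].

    Assumption: one JSON object per 'data:' line.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separators and any non-data lines are skipped
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream marker
        yield json.loads(payload)


def final_usage(chunks: Iterable[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the usage object from the chunk where it is not null, if any."""
    usage = None
    for chunk in chunks:
        if chunk.get("usage") is not None:
            usage = chunk["usage"]
    return usage
```

Any iterable of lines works as input, so the parser can be exercised with a list of strings before it is connected to a live response.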

Response

🟢 200 OK

Response Body

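| Parameter Name | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | Required | |
| object | string | Required | |
| created | integer | Required | |
| choices | array[object] | Required | |
| └ index | integer | Optional | |
| └ message | object | Optional | |
| └ finish_reason | string | Optional | |
| usage | object | Required | |
| └ prompt_tokens | integer | Required | |
| └ completion_tokens | integer | Required | |
| └ total_tokens | integer | Required | |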

Response Example

json
{
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "\n\nHello there, how may I assist you today?"
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 12,
        "total_tokens": 21
    }
}
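
A response body shaped like the example above can be reduced to the fields most callers need. This is a minimal sketch: `summarize` is a hypothetical helper, and the check that total_tokens equals prompt_tokens plus completion_tokens is an assumption that holds for the example but is not guaranteed by this document.

```python
from typing import Any, Dict


def summarize(response: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the reply text, finish reason, and a usage sanity check.

    Hypothetical helper; 'message' and 'finish_reason' are optional in
    the response table, so they are read defensively.
    """
    choices = response.get("choices") or [{}]
    first = choices[0]
    message = first.get("message") or {}
    usage = response["usage"]
    return {
        "text": (message.get("content") or "").strip(),
        "finish_reason": first.get("finish_reason"),
        # Assumption: the total is the sum of the two parts.
        "usage_consistent": usage["prompt_tokens"] + usage["completion_tokens"]
        == usage["total_tokens"],
    }
```

Stripping the content removes the leading newlines seen in the example reply.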