Usage-Based Billing

SynStar AI uses usage-based billing. Credits are deducted only when API requests are successfully processed or when a billable task is submitted.

This means you do not pay a fixed monthly fee by default. Your account balance is consumed based on your actual API usage.

How Usage-Based Billing Works

When you make an API request, SynStar AI calculates the cost based on the selected model, actual usage, model pricing, and the billing rate applied through your Token Group.

For token-based models, the basic formula is:

Base Usage Cost =
(Input Tokens / 1,000,000 × Input Price)
+ (Output Tokens / 1,000,000 × Output Price)
+ (Cached Input Tokens / 1,000,000 × Cached Input Price, if applicable)

Then the billing rate is applied:

Final Cost = Base Usage Cost × Billing Rate

The final cost is deducted from your account balance.
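
The two formulas above can be sketched in Python as follows. This is a minimal illustration, not SynStar AI's implementation: the function names `base_usage_cost` and `final_cost` are hypothetical, `Decimal` is used only to avoid floating-point drift, and no rounding is applied because this page does not state how the platform rounds deductions.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def base_usage_cost(input_tokens, output_tokens, input_price, output_price,
                    cached_tokens=0, cached_price="0"):
    """Base cost in dollars. Prices are listed prices per 1M tokens."""
    return (
        Decimal(input_tokens) / TOKENS_PER_MILLION * Decimal(input_price)
        + Decimal(output_tokens) / TOKENS_PER_MILLION * Decimal(output_price)
        + Decimal(cached_tokens) / TOKENS_PER_MILLION * Decimal(cached_price)
    )


def final_cost(base_cost, billing_rate):
    """Apply the Token Group's billing rate to the base cost."""
    return Decimal(base_cost) * Decimal(billing_rate)


# Illustrative request: 100,000 input and 20,000 output tokens at listed
# prices of $2.50 and $10.00 per 1M tokens, settled at a 0.3 billing rate.
base = base_usage_cost(100_000, 20_000, "2.50", "10.00")
deduction = final_cost(base, "0.3")
print(base, deduction)
```

Passing prices as strings keeps them exact; passing floats such as `2.5` to `Decimal` would carry binary rounding error into the result.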

What Counts as Usage?

Usage may include different types of billable activity depending on the model or endpoint.

Common usage types include:

  • Input tokens
  • Output tokens
  • Cached input tokens
  • Audio tokens
  • Image tokens
  • Embedding tokens
  • Rerank requests
  • Image generation tasks
  • Video generation tasks
  • Audio transcription tasks
  • Text-to-speech tasks
  • Other task-based model usage

For chat and text generation models, billing is usually based on input and output tokens.

For image, video, voice, music, and other task-based APIs, billing may be based on requests, tasks, generated files, duration, or other model-specific units.

Token-Based Billing

For token-based models, your cost is calculated from the number of tokens used in a request.

A request may include:

| Usage Type | Description |
| --- | --- |
| Input Tokens | Tokens sent to the model, including prompts, messages, system instructions, and context |
| Output Tokens | Tokens generated by the model in the response |
| Cached Input Tokens | Reused input tokens billed at a different price if the model supports cache pricing |
| Other Tokens | Audio, image, or multimodal tokens if supported by the selected model |

The model pricing page shows the listed price for each model. The listed price is used as the base price before the billing rate is applied.

Billing Rates and Token Groups

Each API token belongs to a Token Group.

The Token Group is linked to a Billing Tier, and the Billing Tier determines the billing rate applied to usage.

| Billing Tier | Billing Rate | Equivalent Saving |
| --- | --- | --- |
| Deep Savings Models | 0.3 × listed price | Up to 70% off |
| Claude Models | 0.5 × listed price | 50% off |
| Standard Savings Models | 0.7 × listed price | 30% off |

The relationship can be understood as:

API Token → Token Group → Billing Tier → Billing Rate → Final Deduction

For example, if the base usage cost of a request is $1.00:

| Billing Rate | Final Deduction |
| --- | --- |
| 0.3 | $0.30 credits |
| 0.5 | $0.50 credits |
| 0.7 | $0.70 credits |

SynStar AI applies the correct billing rate automatically. You do not need to calculate the discount manually.
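
The lookup chain above can be modeled as three small tables. The token names and group names in this sketch are hypothetical placeholders; only the three tiers and their rates come from this page. Because the platform performs the lookup automatically, a sketch like this is mainly useful for estimating costs in advance.

```python
from decimal import Decimal

# Hypothetical token and group names; real ones come from your Console.
TOKEN_TO_GROUP = {"demo-chat-key": "group-a", "demo-claude-key": "group-b"}
GROUP_TO_TIER = {"group-a": "Deep Savings Models", "group-b": "Claude Models"}

# Billing rates as listed in the tier table on this page.
TIER_TO_RATE = {
    "Deep Savings Models": Decimal("0.3"),
    "Claude Models": Decimal("0.5"),
    "Standard Savings Models": Decimal("0.7"),
}


def billing_rate_for(token_name):
    """Follow API Token -> Token Group -> Billing Tier -> Billing Rate."""
    group = TOKEN_TO_GROUP[token_name]
    tier = GROUP_TO_TIER[group]
    return TIER_TO_RATE[tier]


def final_deduction(token_name, base_cost):
    """Base usage cost multiplied by the rate that applies to this token."""
    return Decimal(base_cost) * billing_rate_for(token_name)


print(final_deduction("demo-chat-key", "1.00"))
print(final_deduction("demo-claude-key", "1.00"))
```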

Example: Token-Based Request

Assume a model has the following listed prices:

| Usage Type | Listed Price |
| --- | --- |
| Input tokens | $2.50 / 1M tokens |
| Output tokens | $10.00 / 1M tokens |
| Cached input tokens | $1.25 / 1M tokens |

A request uses:

| Usage Type | Usage |
| --- | --- |
| Input tokens | 100,000 |
| Output tokens | 20,000 |
| Cached input tokens | 0 |

First, calculate the base usage cost:

Input cost = 100,000 / 1,000,000 × $2.50 = $0.25
Output cost = 20,000 / 1,000,000 × $10.00 = $0.20
Cached input cost = 0

Base Usage Cost = $0.25 + $0.20 = $0.45

If this request is settled under a 0.3 billing rate:

Final Cost = $0.45 × 0.3 = $0.135

So the system deducts $0.135 credits from your account balance.

Task-Based Billing

Some models are not billed by tokens alone.

For task-based APIs, the cost may be calculated by task, request, generated output, duration, or another model-specific billing unit.

Examples include:

| API Type | Possible Billing Unit |
| --- | --- |
| Image generation | Per image or per task |
| Video generation | Per video, per task, or by duration |
| Text-to-speech | Per character, per token, or per task |
| Speech-to-text | Per audio duration or per task |
| Music generation | Per task or generated clip |
| Rerank | Per request or per document unit |
| Embeddings | Per input token |

For task-based models, the system calculates the base cost from the displayed model price and then applies the billing rate, if one applies to that model.

The final deducted amount will be shown in your usage records.
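
As a rough sketch of task-based pricing, the cost can be thought of as a number of billing units multiplied by a unit price and then by the billing rate. The function name `task_cost` and the prices below are hypothetical; actual units and prices depend on the model, and the deduction shown in your usage records is authoritative.

```python
from decimal import Decimal


def task_cost(units, unit_price, billing_rate="1"):
    """Units x displayed unit price x billing rate (rate 1 means no adjustment)."""
    return Decimal(str(units)) * Decimal(unit_price) * Decimal(billing_rate)


# Hypothetical prices, for illustration only.
image_cost = task_cost(4, "0.04", "0.7")      # 4 images at $0.04 per image
video_cost = task_cost(12.5, "0.10", "0.7")   # 12.5 seconds at $0.10 per second
print(image_cost, video_cost)
```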

Usage Logs

You can view detailed billing records on the Usage Logs page in the Console.

Usage Logs help you understand:

  • Which API token was used
  • Which model was called
  • How many input and output tokens were consumed
  • How much credit was deducted
  • Which Token Group was applied
  • Whether the record was a usage charge or a system record
  • When the request happened
  • Request latency and streaming information

Usage Summary

At the top of the Usage Logs page, you may see summary metrics such as:

| Metric | Description |
| --- | --- |
| Used Quota | Total credits consumed within the selected time range |
| RPM | Requests per minute |
| TPM | Tokens per minute |

These metrics help you monitor your usage volume and spending speed.
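
If you want comparable figures from your own records, one plausible way to compute them is shown below. This page does not say how the Console derives RPM and TPM, so averaging over the whole selected window is an assumption, as are the function name `summary_metrics` and the tuple layout.

```python
from datetime import datetime, timedelta


def summary_metrics(records, start, end):
    """records: iterable of (time, input_tokens, output_tokens, spend) tuples."""
    if end <= start:
        raise ValueError("end must be after start")
    selected = [r for r in records if start <= r[0] < end]
    minutes = (end - start).total_seconds() / 60
    return {
        "used_quota": sum(r[3] for r in selected),
        "rpm": len(selected) / minutes,
        "tpm": sum(r[1] + r[2] for r in selected) / minutes,
    }


start = datetime(2025, 1, 1, 12, 0)
records = [
    (start + timedelta(minutes=1), 1000, 200, 0.010),
    (start + timedelta(minutes=4), 500, 100, 0.005),
    (start + timedelta(minutes=9), 2000, 400, 0.020),
    (start + timedelta(minutes=30), 800, 100, 0.008),  # outside the window
]
metrics = summary_metrics(records, start, start + timedelta(minutes=10))
print(metrics)
```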

Filters

Usage Logs can be filtered by different conditions.

Common filters include:

| Filter | Description |
| --- | --- |
| Time Range | View usage within a selected date and time range |
| Group | Filter records by Token Group |
| Token Name | Search usage by API token name |
| Model Name | Search usage by model name |

You can use these filters to review spending for a specific model, API token, group, or time period.
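
The same filters can be applied to records you have collected yourself. The sketch below is an assumption-laden illustration: the `UsageRecord` fields are hypothetical, and treating token and model names as case-insensitive substring searches is a guess at how the Console's search behaves.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class UsageRecord:
    time: datetime
    token_name: str
    group: str
    model: str
    spend: float


def filter_records(records, start=None, end=None, group=None,
                   token_name=None, model=None):
    """Keep records matching every filter that is not None."""
    def keep(r):
        return (
            (start is None or r.time >= start)
            and (end is None or r.time <= end)
            and (group is None or r.group == group)
            and (token_name is None or token_name.lower() in r.token_name.lower())
            and (model is None or model.lower() in r.model.lower())
        )
    return [r for r in records if keep(r)]


records = [
    UsageRecord(datetime(2025, 1, 1, 9, 0), "prod-key", "group-a", "model-x", 0.12),
    UsageRecord(datetime(2025, 1, 1, 10, 0), "test-key", "group-a", "model-y", 0.03),
    UsageRecord(datetime(2025, 1, 2, 9, 0), "prod-key", "group-b", "model-x", 0.40),
]
prod_model_x = filter_records(records, token_name="prod", model="model-x")
print(len(prod_model_x))
```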

Usage Log Columns

The usage table may include the following columns:

| Column | Description |
| --- | --- |
| Time | The time when the usage record was created |
| Tokens | The API token name used for the request |
| Group | The Token Group used for settlement |
| Type | The record type, such as usage consumption or system adjustment |
| Model | The model used for the request |
| Time / First Word | Request latency and first-token response time |
| Input | Number of input tokens |
| Output | Number of output tokens |
| Spend | Final credit amount deducted |
| IP | Request IP address, if available |
| Details | Pricing details, billing rate, or additional billing information |

Important note: In Usage Logs, the “Tokens” column refers to the API token name, not the number of tokens used.

The actual token usage is shown in the Input and Output columns.

Record Type

The Type column helps identify what kind of record it is.

Common record types may include:

| Type | Meaning |
| --- | --- |
| Consume | A billable API usage record |
| System | A system-generated record, such as credit adjustment, registration reward, or other platform operation |

A Consume record usually means credits were deducted due to model usage.

A System record may represent a balance change that was not caused by a normal model request.

Spend

The Spend column shows the final amount deducted from your account balance.

This amount already includes the applied billing rate.

For example, if a request has a base usage cost of $1.00 and the applicable billing rate is 0.3, the Spend column will show the final deducted amount of $0.30.

You do not need to apply the multiplier again.
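
When reconciling records, a quick check like the one below can confirm that a logged Spend already reflects the billing rate. The function name `reconcile` and the tolerance are hypothetical choices for this sketch; the platform's own rounding is not documented on this page.

```python
from decimal import Decimal


def reconcile(spend, base_cost, billing_rate, tolerance="0.000001"):
    """True when the logged Spend matches base cost x billing rate."""
    expected = Decimal(base_cost) * Decimal(billing_rate)
    return abs(Decimal(spend) - expected) <= Decimal(tolerance)


logged_spend = Decimal("0.30")
matches = reconcile(logged_spend, "1.00", "0.3")

# Applying the rate a second time understates the real deduction.
double_discounted = logged_spend * Decimal("0.3")
still_matches = reconcile(double_discounted, "1.00", "0.3")
print(matches, still_matches)
```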

Details

The Details column may show additional billing information, such as:

  • Input price
  • Output price
  • Cached input price
  • Billing rate
  • Group ratio
  • Model-specific pricing information
  • System adjustment details

For example:

Input $2.50 / 1M tokens, Group ratio 1

The exact details may vary depending on the model, endpoint, and billing method.

If a model has multiple pricing components, the final deduction may combine charges for input, output, cached input, or task-based pricing.

Streaming Requests

For streaming requests, Usage Logs may show streaming-related information.

The Time / First Word column may include:

| Field | Description |
| --- | --- |
| Total time | Total request duration |
| First word / First token time | Time until the first streamed token or first response chunk |
| Stream | Indicates that the request used streaming mode |

This is useful for monitoring response speed and debugging latency issues.
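
To compare the logged figures with what your application observes, you can time a stream on the client side. The sketch below is an illustration only: `time_stream` is a hypothetical helper that works with any iterable of chunks, and `fake_stream` stands in for a real streaming response. Client-side timings include network latency, so they may not match the Console exactly.

```python
import time


def time_stream(chunks):
    """Consume an iterable of streamed chunks and report latency figures."""
    start = time.perf_counter()
    first_chunk_at = None
    count = 0
    for _ in chunks:
        if first_chunk_at is None:
            first_chunk_at = time.perf_counter()
        count += 1
    end = time.perf_counter()
    return {
        "first_token_s": None if first_chunk_at is None else first_chunk_at - start,
        "total_s": end - start,
        "chunks": count,
    }


def fake_stream():
    # Stand-in for a real streaming response object.
    time.sleep(0.01)
    yield "Hello"
    time.sleep(0.01)
    yield " world"


timing = time_stream(fake_stream())
print(timing)
```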

Exporting Usage Records

The Usage Logs page may support exporting records.

You can use exported records for:

  • Internal billing review
  • Team cost analysis
  • Customer usage reporting
  • Finance reconciliation
  • Debugging abnormal usage
  • Comparing model spending

Before exporting, you can filter by time range, Token Group, token name, or model name to narrow down the records.
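
Once exported, records can be summarized with a short script. This sketch assumes a CSV export whose headers match the column names on the Usage Logs page; the actual export format and headers are not specified here and may differ. The function name `spend_by` and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def spend_by(csv_text, key_column, spend_column="Spend", type_column="Type"):
    """Sum Spend per value of key_column, counting only Consume records."""
    totals = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row[type_column] != "Consume":
            continue  # skip System records such as credit adjustments
        totals[row[key_column]] += Decimal(row[spend_column].lstrip("$"))
    return dict(totals)


# Hypothetical export; real files may use different headers or formats.
SAMPLE_EXPORT = """Time,Tokens,Group,Type,Model,Input,Output,Spend
2025-01-01 12:00:00,demo-key,group-a,Consume,model-x,1000,200,0.0030
2025-01-01 12:01:00,demo-key,group-a,Consume,model-y,4000,500,0.0100
2025-01-01 12:02:00,test-key,group-a,Consume,model-x,500,100,0.0015
2025-01-01 12:03:00,,,System,,0,0,0
"""

by_model = spend_by(SAMPLE_EXPORT, "Model")
by_token = spend_by(SAMPLE_EXPORT, "Tokens")
print(by_model, by_token)
```

Grouping by `Tokens` works here because that column holds the API token name, not a token count.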

Column Settings

The Usage Logs page may provide column settings.

You can adjust visible columns based on what you want to review, such as:

  • Spend
  • Model
  • Input tokens
  • Output tokens
  • Token Group
  • IP address
  • Details

This helps you keep the usage table compact and focused.

Compact List

The Compact List option may display usage records in a more condensed format.

This is useful when you need to review many records quickly or compare multiple requests in a short time range.

Failed Requests

Not every failed request is billable.

In general:

  • If a request fails before reaching the model, it is usually not billed.
  • If a request is processed by the model and returns a billable result, usage may be deducted.
  • If a task-based request is submitted successfully, the task may be billed depending on the endpoint rules.
  • If a billing error occurs, it may be reviewed and adjusted by support.

The exact result depends on the model, endpoint, request status, and whether the model provider processed the request.

Insufficient Balance

If your account balance is insufficient, the request may fail or be rejected before processing.

Common reasons include:

  • Your account balance is too low
  • The selected model requires more credits than your available balance
  • A task-based API requires enough balance before task submission
  • Your token quota has been reached
  • Your account is under payment, risk, or compliance review

To continue using the API, you may need to top up your account or adjust the quota of your API token.

Token Quota and Spending Control

Token quota helps limit how much a specific API token can consume.

This is useful when you want to:

  • Control spending by project
  • Limit usage for testing keys
  • Prevent accidental overspending
  • Assign different budgets to different teams or applications

If a token has a custom quota, it cannot spend more than that quota.

If a token is set to unlimited quota, the token itself has no separate usage cap, but usage is still limited by your account balance.

Note that unlimited quota does not mean unlimited free usage.
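
The interaction between token quota and account balance can be sketched as taking the smaller of the two limits. This is an assumption about how the limits combine for a custom-quota token; the function names are hypothetical, and `None` is used here to model an unlimited-quota token.

```python
from decimal import Decimal


def spendable(account_balance, token_quota_remaining=None):
    """Amount a token can still spend; None models an unlimited-quota token."""
    balance = Decimal(account_balance)
    if token_quota_remaining is None:
        return balance  # still capped by the account balance
    return min(balance, Decimal(token_quota_remaining))


def can_afford(estimated_cost, account_balance, token_quota_remaining=None):
    """Pre-flight check before submitting a request or task."""
    return Decimal(estimated_cost) <= spendable(account_balance, token_quota_remaining)


print(spendable("10.00"))
print(spendable("10.00", "2.50"))
print(can_afford("0.135", "0.10"))
```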

Monitoring Best Practices

To manage usage-based billing effectively:

  • Check Usage Logs regularly.
  • Filter by model to identify high-cost models.
  • Filter by token name to track project-level spending.
  • Assign each API token to the Token Group that matches the intended Billing Tier.
  • Set token quotas for testing or team-based usage.
  • Review the Spend column to understand actual deductions.
  • Review the Details column when investigating pricing differences.
  • Export usage records for finance or internal reporting.
  • Top up before running large workloads.

Billing Review

If you believe a usage record is incorrect, please contact support with relevant details.

Please include:

  • Your SynStar AI account email
  • Usage record time
  • API token name
  • Model name
  • Endpoint
  • Spend amount
  • Screenshot of the usage record
  • Request ID or error message, if available

Support contact:

Email: contact@kkiai.com

If a billing error is confirmed, SynStar AI may issue a credit adjustment or other correction according to platform policy.

Summary

Usage-based billing means your credits are deducted according to actual API usage.

In short:

  • Model usage creates a usage record.
  • Input and output tokens determine the base usage cost for token-based models.
  • Task-based APIs may use request, task, duration, or output-based pricing.
  • Token Groups determine the billing rate.
  • Spend shows the final deducted amount after billing rules are applied.
  • Usage Logs help you review, filter, export, and monitor all billing records.

The key formula is:

Final Cost = Base Usage Cost × Billing Rate

You can track all deductions in the Usage Logs page of the SynStar AI Console.