# LLM Costs
Track and control the cost of every LLM call your agents make
Fruxon tracks the cost of every LLM call your agents make, giving you full visibility and control over your AI spending.
## How Costs Are Calculated
Every time an agent calls an LLM, Fruxon records the token usage and calculates the cost based on the model's pricing. The cost breakdown includes:
| Component | Description |
|---|---|
| Input | Tokens sent to the model (your prompt, context, tools, etc.) |
| Output | Tokens generated by the model |
| Cached input | Input tokens served from the provider's prompt cache, charged at a significantly reduced rate |
| Thinking | Reasoning tokens used by models with extended thinking (e.g., Claude with extended thinking, OpenAI o-series) |
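Putting the components above together, a per-call cost works out as a sum of token counts times per-token rates. The sketch below illustrates the arithmetic only; the rates, the `ModelPricing` name, and the assumption that cached tokens are a discounted subset of input tokens are all hypothetical, not Fruxon's actual pricing or API.

```python
from dataclasses import dataclass

@dataclass
class ModelPricing:
    """Per-million-token rates in USD (illustrative values, not real pricing)."""
    input: float = 3.00
    output: float = 15.00
    cached_input: float = 0.30   # cache reads billed at a steep discount
    thinking: float = 15.00      # reasoning tokens often billed at the output rate

def call_cost(pricing, input_tokens, output_tokens, cached_tokens=0, thinking_tokens=0):
    """Cost of one LLM call: each component priced per million tokens.

    Assumes cached tokens are counted within input_tokens and billed
    at the cached rate instead of the standard input rate.
    """
    per_m = 1_000_000
    return (
        (input_tokens - cached_tokens) * pricing.input / per_m
        + cached_tokens * pricing.cached_input / per_m
        + output_tokens * pricing.output / per_m
        + thinking_tokens * pricing.thinking / per_m
    )

# 12K-token prompt of which 10K came from the prompt cache, 800 tokens out
cost = call_cost(ModelPricing(), input_tokens=12_000, output_tokens=800, cached_tokens=10_000)
```

With these example rates, the cache serves most of the prompt, so the input side costs less than the 800 output tokens.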
### Pricing Tiers
Some models have special pricing conditions that are automatically applied:
- Long-context pricing — When a request exceeds a model's context threshold (e.g., 200K tokens), higher per-token rates may apply, depending on the provider.
- Prompt caching discounts — Tokens served from the provider's cache are charged at 50–90% less than standard input tokens.
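The two conditions above can be thought of as adjustments to the base input rate. This is a minimal sketch of that selection logic; the 200K threshold, the rates, and the 90% cache discount are assumed example values, and real providers each define their own tiers.

```python
LONG_CONTEXT_THRESHOLD = 200_000  # example threshold; varies by model/provider

def effective_input_rate(base_rate, long_context_rate, request_tokens,
                         from_cache, cache_discount=0.9):
    """Pick the per-token input rate for one request.

    Long-context pricing applies when the request exceeds the threshold;
    cached tokens are then discounted on top of whichever tier applies.
    """
    rate = long_context_rate if request_tokens > LONG_CONTEXT_THRESHOLD else base_rate
    if from_cache:
        rate *= (1 - cache_discount)  # cached tokens cost 50-90% less
    return rate
```

For example, a 250K-token request priced at a $6/M long-context rate still drops to $0.60/M for the portion served from cache.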
## Viewing Costs
You can view costs at multiple levels:
- Per execution — See the cost breakdown for each individual agent run in the execution trace.
- Per agent revision — View aggregated costs across all executions of a specific agent version.
- Per model — See which models are driving your costs within any of the above views.
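The per-model view is essentially a group-by over execution records. The sketch below shows that aggregation on hypothetical records; the `model` and `cost_usd` field names are illustrative, not Fruxon's actual data model.

```python
from collections import defaultdict

# Hypothetical execution records (field names are assumptions).
executions = [
    {"agent": "support-bot", "model": "gpt-4o", "cost_usd": 0.021},
    {"agent": "support-bot", "model": "claude-sonnet", "cost_usd": 0.034},
    {"agent": "support-bot", "model": "gpt-4o", "cost_usd": 0.015},
]

def cost_by_model(records):
    """Total spend per model across a set of executions."""
    totals = defaultdict(float)
    for r in records:
        totals[r["model"]] += r["cost_usd"]
    return dict(totals)
```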
## Custom Pricing
If you have negotiated rates with an LLM provider, you can override the default pricing in Fruxon to ensure cost reports reflect your actual spend.
Pricing overrides can be set at two levels:
- Organization-level — Applies to all agents in your organization.
- Agent-level — Applies to a specific agent, overriding the organization-level pricing.
If no override is set, Fruxon uses built-in default pricing that reflects the providers' published rates.
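The resolution order described above (agent-level, then organization-level, then built-in defaults) amounts to a first-match lookup. A minimal sketch, with made-up rate tables:

```python
def resolve_pricing(model, agent_overrides, org_overrides, defaults):
    """Return the pricing for a model: agent override wins, then org, then defaults."""
    for table in (agent_overrides, org_overrides, defaults):
        if model in table:
            return table[model]
    raise KeyError(f"no pricing known for {model}")

# Illustrative rate tables (USD per million tokens; values are made up).
defaults = {"gpt-4o": {"input": 2.50, "output": 10.00}}
org_overrides = {"gpt-4o": {"input": 2.00, "output": 8.00}}  # negotiated rate
agent_overrides = {}

pricing = resolve_pricing("gpt-4o", agent_overrides, org_overrides, defaults)
```

Here no agent-level override exists, so the organization's negotiated rate is used instead of the default.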
### Retroactive Pricing
Costs are always calculated with the pricing currently in effect. If you update a pricing override, historical cost reports will reflect the new rate as well.
## Budgets
You can set a monthly budget on any agent to control spending.
### Budget Configuration
- Monthly amount (USD) — The maximum amount the agent should spend per calendar month.
- Threshold alerts — Get email notifications when spending reaches certain percentages of the budget (e.g., 50%, 80%, 100%).
- Enforce limit — When enabled, the agent will stop accepting new executions once the budget is reached. When disabled, you'll still receive alerts but executions will continue.
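A budget with these three settings, plus the threshold-alert check, might look like the following sketch. The `Budget` shape and field names are assumptions for illustration, not Fruxon's configuration schema.

```python
from dataclasses import dataclass

@dataclass
class Budget:
    """Hypothetical budget shape; field names are assumptions."""
    monthly_usd: float
    alert_thresholds: tuple = (0.5, 0.8, 1.0)  # notify at 50%, 80%, 100%
    enforce_limit: bool = False

def crossed_thresholds(budget, spend_before, spend_after):
    """Alert thresholds newly crossed by one execution's added spend."""
    return [t for t in budget.alert_thresholds
            if spend_before < t * budget.monthly_usd <= spend_after]

# An execution that takes monthly spend from $45 to $85 crosses 50% and 80%.
alerts = crossed_thresholds(Budget(monthly_usd=100.0), spend_before=45.0, spend_after=85.0)
```

Comparing spend before and after the execution means each threshold fires once, rather than on every subsequent run.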
### How It Works
- Before each execution, Fruxon checks the agent's current monthly spend against its budget.
- If enforce limit is on and the budget has been reached, the execution is blocked.
- After each execution, Fruxon checks if any notification thresholds have been crossed and sends email alerts to the agent's admins.
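The pre-execution check in the steps above reduces to a small gate: enforcement off means the run always proceeds (alerts only), enforcement on blocks once spend reaches the budget. A minimal sketch, with assumed parameter names:

```python
def may_execute(monthly_budget_usd, spent_this_month_usd, enforce_limit):
    """Pre-execution gate: block only when enforcement is on and the budget is reached."""
    if not enforce_limit:
        return True  # alerts only; executions continue past the budget
    return spent_this_month_usd < monthly_budget_usd
```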
## Supported Providers
Fruxon includes built-in pricing for models from:
- OpenAI — GPT-4o, GPT-4.1, GPT-5.x, o-series reasoning models
- Anthropic — Claude Opus, Sonnet, and Haiku families
- Google — Gemini Pro & Flash
- xAI — Grok series
- AWS Bedrock — Amazon Nova series and other models available through Bedrock
## Next Steps
- Monitoring — Track agent performance and usage
- Testing — Evaluate agents before production
- Settings — Manage license and usage limits