# LLM Costs
Track and control the cost of every LLM call your agents make
Fruxon tracks the cost of every LLM call your agents make, giving you full visibility and control over your AI spending.
## How Costs Are Calculated
Every time an agent calls an LLM, Fruxon records the token usage and calculates the cost based on the model's pricing. The cost breakdown includes:
| Component | Description |
|---|---|
| Input | Tokens sent to the model (your prompt, context, tools, etc.) |
| Output | Tokens generated by the model |
| Cached input | Input tokens served from the provider's prompt cache, charged at a significantly reduced rate |
| Thinking | Reasoning tokens used by models with extended thinking (e.g., Claude with extended thinking, OpenAI o-series) |
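Putting the components above together, a per-call cost works out as a sum of token counts times per-token rates. The sketch below illustrates the arithmetic only; the rates, the `ModelPricing` name, and the assumption that cached tokens are a discounted subset of input tokens are all hypothetical, not Fruxon's actual pricing or API.

```python
from dataclasses import dataclass

@dataclass
class ModelPricing:
    """Per-million-token rates in USD (illustrative values, not real pricing)."""
    input: float = 3.00
    output: float = 15.00
    cached_input: float = 0.30   # cache reads billed at a steep discount
    thinking: float = 15.00      # reasoning tokens often billed at the output rate

def call_cost(pricing, input_tokens, output_tokens, cached_tokens=0, thinking_tokens=0):
    """Cost of one LLM call: each component priced per million tokens.

    Assumes cached tokens are counted within input_tokens and billed
    at the cached rate instead of the standard input rate.
    """
    per_m = 1_000_000
    return (
        (input_tokens - cached_tokens) * pricing.input / per_m
        + cached_tokens * pricing.cached_input / per_m
        + output_tokens * pricing.output / per_m
        + thinking_tokens * pricing.thinking / per_m
    )

# 12K-token prompt of which 10K came from the prompt cache, 800 tokens out
cost = call_cost(ModelPricing(), input_tokens=12_000, output_tokens=800, cached_tokens=10_000)
```

With these example rates, the cache serves most of the prompt, so the input side costs less than the 800 output tokens.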
### Pricing Tiers
Some models have special pricing conditions that are automatically applied:
- Long-context pricing — When a request exceeds a model's context threshold (e.g., 200K tokens), higher per-token rates may apply, depending on the provider.
- Prompt caching discounts — Tokens served from the provider's cache are charged at 50–90% less than standard input tokens.
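The two conditions above can be thought of as adjustments to the base input rate. This is a minimal sketch of that selection logic; the 200K threshold, the rates, and the 90% cache discount are assumed example values, and real providers each define their own tiers.

```python
LONG_CONTEXT_THRESHOLD = 200_000  # example threshold; varies by model/provider

def effective_input_rate(base_rate, long_context_rate, request_tokens,
                         from_cache, cache_discount=0.9):
    """Pick the per-token input rate for one request.

    Long-context pricing applies when the request exceeds the threshold;
    cached tokens are then discounted on top of whichever tier applies.
    """
    rate = long_context_rate if request_tokens > LONG_CONTEXT_THRESHOLD else base_rate
    if from_cache:
        rate *= (1 - cache_discount)  # cached tokens cost 50-90% less
    return rate
```

For example, a 250K-token request priced at a $6/M long-context rate still drops to $0.60/M for the portion served from cache.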
## Viewing Costs
You can view costs at multiple levels:
- Per execution — See the cost breakdown for each individual agent run in the execution trace.
- Per agent revision — View aggregated costs across all executions of a specific agent version.
- Per model — See which models are driving your costs within any of the above views.
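The per-model view is essentially a group-by over execution records. The sketch below shows that aggregation on hypothetical records; the `model` and `cost_usd` field names are illustrative, not Fruxon's actual data model.

```python
from collections import defaultdict

# Hypothetical execution records (field names are assumptions).
executions = [
    {"agent": "support-bot", "model": "gpt-4o", "cost_usd": 0.021},
    {"agent": "support-bot", "model": "claude-sonnet", "cost_usd": 0.034},
    {"agent": "support-bot", "model": "gpt-4o", "cost_usd": 0.015},
]

def cost_by_model(records):
    """Total spend per model across a set of executions."""
    totals = defaultdict(float)
    for r in records:
        totals[r["model"]] += r["cost_usd"]
    return dict(totals)
```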
## Custom Pricing
If you have negotiated rates with an LLM provider, you can override the default pricing in Fruxon to ensure cost reports reflect your actual spend.
Pricing overrides can be set at two levels:
- Organization-level — Applies to all agents in your organization.
- Agent-level — Applies to a specific agent, overriding the organization-level pricing.
If no override is set, Fruxon uses built-in default pricing that reflects the providers' published rates.
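The resolution order described above (agent-level, then organization-level, then built-in defaults) amounts to a first-match lookup. A minimal sketch, with made-up rate tables:

```python
def resolve_pricing(model, agent_overrides, org_overrides, defaults):
    """Return the pricing for a model: agent override wins, then org, then defaults."""
    for table in (agent_overrides, org_overrides, defaults):
        if model in table:
            return table[model]
    raise KeyError(f"no pricing known for {model}")

# Illustrative rate tables (USD per million tokens; values are made up).
defaults = {"gpt-4o": {"input": 2.50, "output": 10.00}}
org_overrides = {"gpt-4o": {"input": 2.00, "output": 8.00}}  # negotiated rate
agent_overrides = {}

pricing = resolve_pricing("gpt-4o", agent_overrides, org_overrides, defaults)
```

Here no agent-level override exists, so the organization's negotiated rate is used instead of the default.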
### Retroactive Pricing
Costs are always calculated with the pricing currently in effect. If you update a pricing override, historical cost reports will reflect the new rate as well.
## Budgets
You can set a monthly budget on any agent to control spending.
### Budget Configuration
- Monthly amount (USD) — The maximum amount the agent should spend per calendar month.
- Threshold alerts — Get email notifications when spending reaches certain percentages of the budget (e.g., 50%, 80%, 100%).
- Enforce limit — When enabled, the agent will stop accepting new executions once the budget is reached. When disabled, you'll still receive alerts but executions will continue.
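A budget with these three settings, plus the threshold-alert check, might look like the following sketch. The `Budget` shape and field names are assumptions for illustration, not Fruxon's configuration schema.

```python
from dataclasses import dataclass

@dataclass
class Budget:
    """Hypothetical budget shape; field names are assumptions."""
    monthly_usd: float
    alert_thresholds: tuple = (0.5, 0.8, 1.0)  # notify at 50%, 80%, 100%
    enforce_limit: bool = False

def crossed_thresholds(budget, spend_before, spend_after):
    """Alert thresholds newly crossed by one execution's added spend."""
    return [t for t in budget.alert_thresholds
            if spend_before < t * budget.monthly_usd <= spend_after]

# An execution that takes monthly spend from $45 to $85 crosses 50% and 80%.
alerts = crossed_thresholds(Budget(monthly_usd=100.0), spend_before=45.0, spend_after=85.0)
```

Comparing spend before and after the execution means each threshold fires once, rather than on every subsequent run.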
### How It Works
- Before each execution, Fruxon checks the agent's current monthly spend against its budget.
- If enforce limit is on and the budget has been reached, the execution is blocked.
- After each execution, Fruxon checks if any notification thresholds have been crossed and sends email alerts to the agent's admins.
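The pre-execution check in the steps above reduces to a small gate: enforcement off means the run always proceeds (alerts only), enforcement on blocks once spend reaches the budget. A minimal sketch, with assumed parameter names:

```python
def may_execute(monthly_budget_usd, spent_this_month_usd, enforce_limit):
    """Pre-execution gate: block only when enforcement is on and the budget is reached."""
    if not enforce_limit:
        return True  # alerts only; executions continue past the budget
    return spent_this_month_usd < monthly_budget_usd
```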
## Supported Providers
Fruxon includes built-in pricing for models from:
- OpenAI — GPT-4o, GPT-4.1, GPT-5.x, o-series reasoning models
- Anthropic — Claude Opus, Sonnet, and Haiku families
- Google — Gemini Pro & Flash
- xAI — Grok series
- AWS Bedrock — Amazon Nova series and other models available through Bedrock
## Next Steps
- Monitoring — Track agent performance and usage
- Testing — Evaluate agents before production
- Settings — Manage license and usage limits