FruxonDocs

Monitoring

Traces, errors, costs, and metrics for every agent run

Every agent run produces a complete record of what happened — inputs, prompts, tool calls, outputs, tokens, latency, cost, and errors. This page covers how to use that data to understand and improve your agents in production.

Execution History

The Execution History tab on each agent lists every run, with:

Column | Meaning
Execution ID | Unique identifier; shareable link to the full trace
Revision | Which agent revision served the run
Started | When the run began
Duration | End-to-end latency
Cost | Total spend
Status | Run state (in-progress, completed, failed, cancelled, waiting for human)
Channel | Where the run came from: API, a connector (Slack, Telegram, etc.), test, or sub-agent parent
Environment | Which environment the run executed in
Sample | Whether the run is a sampled trace

Filters

Narrow the list to find what you care about:

  • Time range — recent runs vs longer windows, or a custom range.
  • Status — successes vs failures; common starting point for triage.
  • Revision — compare behavior across versions.
  • Channel — isolate API traffic from connector traffic.
  • Environment — separate dev / staging / production runs.
  • Search — by execution ID and other identifiers.
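The filters above combine as a logical AND. As a minimal sketch of that logic applied to run records you have already collected, the snippet below assumes a hypothetical `Run` record whose fields loosely mirror the Execution History columns; the field names and status strings are illustrative, not a documented export schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class Run:
    # Hypothetical record shape, loosely mirroring the Execution History columns.
    execution_id: str
    revision: str
    started: datetime
    status: str
    channel: str
    environment: str


def filter_runs(
    runs: Iterable[Run],
    *,
    since: Optional[datetime] = None,
    status: Optional[str] = None,
    revision: Optional[str] = None,
    channel: Optional[str] = None,
    environment: Optional[str] = None,
) -> list[Run]:
    """Keep runs that match every filter that is set; None means "any"."""
    wanted = {
        "status": status,
        "revision": revision,
        "channel": channel,
        "environment": environment,
    }
    selected = []
    for run in runs:
        if since is not None and run.started < since:
            continue
        if all(v is None or getattr(run, f) == v for f, v in wanted.items()):
            selected.append(run)
    return selected


NOW = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
RUNS = [
    Run("ex-1", "r3", NOW - timedelta(hours=2), "failed", "api", "production"),
    Run("ex-2", "r3", NOW - timedelta(hours=30), "failed", "api", "production"),
    Run("ex-3", "r4", NOW - timedelta(hours=1), "completed", "slack", "production"),
    Run("ex-4", "r4", NOW - timedelta(hours=3), "failed", "api", "staging"),
]

# Typical triage query: production failures in the last 24 hours.
recent_prod_failures = filter_runs(
    RUNS,
    since=NOW - timedelta(hours=24),
    status="failed",
    environment="production",
)
print([run.execution_id for run in recent_prod_failures])
```

Only `ex-1` survives here: `ex-2` falls outside the time range and `ex-4` ran in staging.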

The trace

Click any execution to open its trace — the full record of the run. You get:

Timeline

A waterfall view of every step and tool call, ordered chronologically with duration bars. Spot the slow step, the retried tool, the silent timeout.
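What the waterfall shows visually can also be computed from timing data. The sketch below assumes a hypothetical `Span` record with a name and start/end offsets in milliseconds; it ranks spans by duration and flags names that occur more than once, which may indicate a retry (or simply a step that legitimately runs twice).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    # Hypothetical timeline entry: offsets are milliseconds from run start.
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def slowest_spans(spans: list[Span], top: int = 1) -> list[Span]:
    """Return the `top` longest spans, longest first."""
    return sorted(spans, key=lambda s: s.duration_ms, reverse=True)[:top]


def repeated_calls(spans: list[Span]) -> dict[str, int]:
    """Return names that appear more than once, with their counts."""
    counts = Counter(span.name for span in spans)
    return {name: n for name, n in counts.items() if n > 1}


TIMELINE = [
    Span("classify", 0, 400),
    Span("search_tool", 400, 900),
    Span("search_tool", 900, 5900),  # second attempt, much slower
    Span("answer", 5900, 6700),
]

worst = slowest_spans(TIMELINE)[0]
print(worst.name, worst.duration_ms)
print(repeated_calls(TIMELINE))
```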

Step-by-step breakdown

For every node in the workflow:

  • Inputs — what arrived at the step (placeholders resolved)
  • Prompt — the rendered system + user prompts sent to the model
  • Output — the model's full response
  • Tool calls — each call's request, response, and latency
  • Tokens — prompt, completion, total
  • Cost — per step, summed at the run level
  • Provider / model — exactly which LLM served this step
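The token and cost fields relate by simple arithmetic: total tokens are prompt plus completion, and the run-level cost is the sum of the per-step costs. A minimal sketch of that rollup, assuming a hypothetical `Step` record (the field and model names are illustrative) and using `Decimal` so that small currency amounts add up exactly:

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Step:
    # Hypothetical per-step record; names are assumptions for illustration.
    name: str
    model: str
    prompt_tokens: int
    completion_tokens: int
    cost_usd: Decimal

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


def run_totals(steps: list[Step]) -> tuple[int, Decimal]:
    """Sum tokens and cost across all steps of one run."""
    tokens = sum(step.total_tokens for step in steps)
    cost = sum((step.cost_usd for step in steps), Decimal("0"))
    return tokens, cost


def cost_by_model(steps: list[Step]) -> dict[str, Decimal]:
    """Group step costs by the model that served each step."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for step in steps:
        totals[step.model] += step.cost_usd
    return dict(totals)


STEPS = [
    Step("classify", "model-a", 120, 15, Decimal("0.0004")),
    Step("draft", "model-b", 900, 350, Decimal("0.0125")),
    Step("review", "model-a", 400, 60, Decimal("0.0011")),
]

tokens, cost = run_totals(STEPS)
print(tokens, cost)
print(cost_by_model(STEPS))
```

Grouping by model is one way to see which provider dominates a run's spend when different steps use different LLMs.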

Errors

Failed runs surface the failing step, the error class, the message, and the upstream state — what was in {{input.X}} and previous step outputs at the moment of failure.

Sub-agent traces

When a step calls a sub-agent, the sub-agent's trace is linked inline. Drill in without losing your place.

Tagging

Add tags to runs to group them for follow-up — for example, runs you want to revisit, candidates for your evaluation dataset, or anything that helps you slice the data later. Tags can be applied via the UI or the API.

Metrics

The Overview tab on each agent rolls up trace data into summary metrics across volume, latency, cost, errors, and token consumption.
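To make the rollup concrete, here is a rough sketch of how per-run records could be reduced to those five dimensions. The dictionary keys are assumptions for illustration, and the exact definitions the Overview tab uses (for example, which statuses count as errors) may differ.

```python
import math


def summarize(runs: list[dict]) -> dict:
    """Reduce per-run records to volume, latency, cost, error, and token totals."""
    volume = len(runs)
    if volume == 0:
        return {"volume": 0}
    failed = sum(1 for run in runs if run["status"] == "failed")
    return {
        "volume": volume,
        "error_rate": failed / volume,
        "mean_latency_s": math.fsum(run["duration_s"] for run in runs) / volume,
        "total_cost_usd": math.fsum(run["cost_usd"] for run in runs),
        "total_tokens": sum(run["tokens"] for run in runs),
    }


# Hypothetical run records; keys are illustrative only.
SAMPLE_RUNS = [
    {"status": "completed", "duration_s": 2.0, "cost_usd": 0.01, "tokens": 1000},
    {"status": "failed", "duration_s": 4.0, "cost_usd": 0.02, "tokens": 1500},
    {"status": "completed", "duration_s": 3.0, "cost_usd": 0.01, "tokens": 500},
    {"status": "completed", "duration_s": 3.0, "cost_usd": 0.02, "tokens": 1000},
]

summary = summarize(SAMPLE_RUNS)
print(summary)
```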

Organization-wide observability

Organization-level dashboards aggregate across all agents:

  • Settings → Usage — token consumption and spend, by agent, provider, day
  • Settings → Billing — current period total, plan utilization, invoices

Alerts

Three kinds of alerts are available today, all delivered by email:

  • Execution errors — fire when an agent run fails.
  • Budget thresholds — fire when an agent's monthly spend crosses configured percentages of its budget cap. See Cost & Budgets for details.
  • Access requests — fire when a new chat user requests access through a connector with onboarding enabled.
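The budget-threshold rule can be read as a crossing check: a threshold fires when spend moves from below it to at or above it. The sketch below illustrates that reading; the function name, the example percentages, and the assumption that each threshold fires once per crossing are illustrative, not a description of the platform's internals.

```python
def crossed_thresholds(
    previous_spend: float,
    current_spend: float,
    budget_cap: float,
    percentages: list[int],
) -> list[int]:
    """Return the percentage thresholds crossed between two spend readings."""
    if budget_cap <= 0:
        raise ValueError("budget_cap must be positive")
    crossed = []
    for pct in sorted(percentages):
        limit = budget_cap * pct / 100
        # Fires only on the transition from below the limit to at/above it.
        if previous_spend < limit <= current_spend:
            crossed.append(pct)
    return crossed


# Hypothetical agent with a 200 USD monthly cap and alerts at 50%, 80%, 100%.
first = crossed_thresholds(90.0, 165.0, 200.0, [50, 80, 100])
second = crossed_thresholds(165.0, 170.0, 200.0, [50, 80, 100])
print(first, second)
```

In this example the first reading crosses both the 50% and 80% limits (100 and 160 USD), while the second crosses nothing, so no repeat alert would be sent under this reading.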

Retention

Trace data is retained according to your plan. For long-running incident investigations, pull the traces you need before they age out: once retention expires, the trace is gone.

Patterns

  • Triage every morning. Filter to failures in the last 24h, scan error classes, fix the top one.
  • Tag interesting runs daily. Future-you will thank you when building the next golden dataset.
  • Watch p95 latency, not p50. Users feel the tail.
  • Investigate cost outliers. A 10× run usually means a tool retry storm or an unintended loop.
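Three of these patterns reduce to small calculations. The sketch below, on made-up numbers, counts error classes for triage, compares p50 with p95 using the nearest-rank percentile method, and flags runs costing at least 10× the median. The error class names, the execution IDs, and the choice of the median as the baseline are assumptions for illustration.

```python
from collections import Counter
from statistics import median


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct*n/100
    return ordered[rank - 1]


def cost_outliers(costs: dict[str, float], factor: float = 10.0) -> list[str]:
    """Return execution IDs whose cost is at least `factor` times the median."""
    baseline = median(costs.values())
    return [eid for eid, cost in costs.items() if cost >= factor * baseline]


# Hypothetical data for one day of runs.
ERROR_CLASSES = ["ToolTimeout", "RateLimit", "ToolTimeout", "ValidationError"]
LATENCIES_S = [1.0, 1.1, 1.2, 1.0, 0.9, 1.3, 1.1, 1.0, 1.2, 9.5]
COSTS_USD = {"ex-1": 0.02, "ex-2": 0.03, "ex-3": 0.02, "ex-4": 0.45, "ex-5": 0.025}

top_error = Counter(ERROR_CLASSES).most_common(1)[0]
p50 = percentile(LATENCIES_S, 50)
p95 = percentile(LATENCIES_S, 95)
outliers = cost_outliers(COSTS_USD)

print(top_error)   # most frequent error class and its count
print(p50, p95)    # the median hides the 9.5 s run; p95 exposes it
print(outliers)
```

With these numbers the median latency looks healthy at 1.1 s while p95 is 9.5 s, which is the gap the third pattern warns about.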
