Your agents.
Your metrics. Your boards.

Build custom dashboards with drag-and-drop widgets, 9 chart types, and a visual query builder. Pull from traces, datasets, and simulations. Slice by model, user, project, or any custom attribute. Powered by ClickHouse for sub-second queries on millions of rows.

Dashboard preview: Agent Performance (4 widgets, last 7 days)

Latency P95: 847ms (↓ 12% vs last period)
Total Cost: $2,847 (↑ 8% vs last period)
Eval Scores: Faithfulness and Relevance, Mon to Sun
Tokens by Model: gpt-4o, claude, gemini

Top Traces by Cost

Trace | Model | Tokens | Cost | Latency | Eval
QA-Chatbot | gpt-4o | 12,847 | $0.38 | 2.1s | 0.92
DocumentSearch | claude-sonnet | 8,234 | $0.24 | 1.8s | 0.88
SQLQueryEngine | gpt-4o-mini | 3,412 | $0.05 | 0.9s | 0.71

Core Features

Dashboards built for
AI agent teams

Add Widget
Chart Type: Line, Column, Pie, Stacked, Table, Metric
Width: 1/4, 1/3, 1/2, Full

Build dashboards with line, stacked line, column, stacked column, bar, stacked bar, pie, table, and single-metric KPI cards. Each widget sits on a 12-column responsive grid: resize from quarter-width to full-width, drag to reorder, and duplicate with one click.
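
To make the grid arithmetic concrete: on a 12-column grid, the quarter, third, half, and full widths correspond to spans of 3, 4, 6, and 12 columns. The sketch below is illustrative only. The names `WIDTH_SPANS` and `pack_rows` are hypothetical, and the left-to-right wrapping rule is an assumption, not a description of the product's layout engine.

```python
# Hypothetical sketch: widget widths on a 12-column grid.
GRID_COLUMNS = 12

# Column span for each width option (a fraction of 12 columns).
WIDTH_SPANS = {"1/4": 3, "1/3": 4, "1/2": 6, "full": 12}


def pack_rows(widths):
    """Place widgets left to right, wrapping when a row would overflow.

    Assumed layout rule, for illustration. Returns a list of rows, each a
    list of (width_label, start_column) tuples.
    """
    rows, current, used = [], [], 0
    for label in widths:
        span = WIDTH_SPANS[label]
        if used + span > GRID_COLUMNS:
            rows.append(current)
            current, used = [], 0
        current.append((label, used))
        used += span
    if current:
        rows.append(current)
    return rows


layout = pack_rows(["1/4", "1/4", "1/2", "full", "1/3", "1/3", "1/3"])
for row in layout:
    print(row)
```

Two quarter-width widgets and one half-width widget fill a row exactly (3 + 3 + 6 = 12), so the full-width widget starts a new row.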

See chart types

Every widget can pull data from traces (spans, latency, tokens, cost), datasets (row counts, cell errors, eval scores), or simulations (call metrics, persona breakdowns, success rates). Cross-source eval metrics let you join evaluation scores back to any source. All of it runs through one unified query endpoint powered by ClickHouse.

Explore data sources

Break down any metric by project, model, status, provider, service name, span kind, session, user, prompt name, version, label, tags, or custom attributes. Dataset metrics slice by dataset name, column, or annotation template. Simulation metrics slice by scenario, persona gender, age group, location, accent, language, and communication style.

View all dimensions

Aggregate with sum, average, median, min, max, count, count distinct, and percentiles (P25, P50, P75, P90, P95, P99). Dataset columns support pass rate, fail rate, pass count, fail count, and true rate for boolean evaluations. Every aggregation runs server-side in ClickHouse for sub-second response on millions of rows.
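
For readers who want a feel for what two of these aggregations compute, here is a minimal local sketch. It uses the nearest-rank definition of a percentile, which is an assumption: the server-side implementation may use a different (possibly approximate) method. The function names and the sample latencies are made up for illustration.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the
    data at or below it. Other percentile definitions interpolate instead."""
    if not values:
        raise ValueError("percentile of empty data")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def pass_rate(results):
    """Share of boolean evaluation results that are True."""
    if not results:
        raise ValueError("pass rate of empty data")
    return sum(results) / len(results)


# Made-up sample data.
latencies_ms = [120, 340, 95, 870, 410, 230, 1500, 310, 280, 640]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
rate = pass_rate([True, True, False, True])
print(p50, p95, rate)
```

With only ten samples, the nearest-rank P95 is simply the largest value, which is a reminder that high percentiles are only meaningful on reasonably large samples.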

See aggregation options
How It Works

From zero to
dashboard in minutes

New Widget: Line
Source: Traces
Metric: latency
Aggregation: P95
Breakdown: model

Create a dashboard and add widgets

Name your dashboard, then add widgets from 9 chart types. Each widget has a visual query builder: pick a metric, set an aggregation, add filters, and choose breakdowns. A live preview updates as you configure.
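
One way to picture what the query builder assembles is as a small configuration object. The sketch below is hypothetical: the class `WidgetQuery`, its field names, and the lowercase identifiers are assumptions for illustration and are not the product's actual API or schema. Only the option lists (9 chart types, 3 sources, the aggregation names) come from this page.

```python
from dataclasses import dataclass, field

# Option lists taken from this page; the identifier spellings are assumed.
CHART_TYPES = {
    "line", "stacked_line", "column", "stacked_column",
    "bar", "stacked_bar", "pie", "table", "metric",
}
SOURCES = {"traces", "datasets", "simulations"}
AGGREGATIONS = {
    "sum", "avg", "median", "min", "max", "count", "count_distinct",
    "p25", "p50", "p75", "p90", "p95", "p99",
}


@dataclass
class WidgetQuery:
    """Hypothetical shape of one widget's configuration."""
    chart_type: str
    source: str
    metric: str
    aggregation: str
    breakdowns: list = field(default_factory=list)
    filters: list = field(default_factory=list)

    def validate(self):
        """Return a list of problems; an empty list means the config is valid."""
        errors = []
        if self.chart_type not in CHART_TYPES:
            errors.append(f"unknown chart type: {self.chart_type}")
        if self.source not in SOURCES:
            errors.append(f"unknown source: {self.source}")
        if self.aggregation not in AGGREGATIONS:
            errors.append(f"unknown aggregation: {self.aggregation}")
        return errors


# A line chart of P95 trace latency, broken down by model.
widget = WidgetQuery("line", "traces", "latency", "p95", breakdowns=["model"])
print(widget.validate())
```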

Filters (3 active):
model contains gpt-4o
status = OK
latency > 500ms

Query traces, datasets, and simulations

Every widget can pull from any data source: trace spans, dataset evaluations, or simulation runs. Apply filters (contains, greater than, equals) and break down by model, user, project, or any custom attribute.
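
The three filter operators and a breakdown can be sketched locally as follows. This is a minimal illustration under stated assumptions: the operator keys, the row field names (`latency_ms` and so on), and the sample spans are all made up, and real queries run server-side rather than in Python.

```python
from collections import defaultdict

# Hypothetical operator keys for the three filter types named on this page.
OPERATORS = {
    "contains": lambda value, target: target in str(value),
    "equals": lambda value, target: value == target,
    "greater_than": lambda value, target: value > target,
}


def apply_filters(rows, filters):
    """Keep rows that satisfy every (field, operator, target) filter."""
    return [
        row for row in rows
        if all(OPERATORS[op](row[name], target) for name, op, target in filters)
    ]


def count_by(rows, dimension):
    """Break a row count down by one dimension, such as model."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[dimension]] += 1
    return dict(counts)


# Made-up sample spans.
spans = [
    {"model": "gpt-4o", "status": "OK", "latency_ms": 2100},
    {"model": "gpt-4o-mini", "status": "OK", "latency_ms": 900},
    {"model": "claude-sonnet", "status": "OK", "latency_ms": 1800},
    {"model": "gpt-4o", "status": "ERROR", "latency_ms": 700},
    {"model": "gpt-4o", "status": "OK", "latency_ms": 300},
]
filters = [
    ("model", "contains", "gpt-4o"),
    ("status", "equals", "OK"),
    ("latency_ms", "greater_than", 500),
]
matched = apply_filters(spans, filters)
by_model = count_by(matched, "model")
print(by_model)
```

Note that a `contains` filter on `gpt-4o` also matches `gpt-4o-mini`; use `equals` when only the exact model name should match.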

Dashboard (live): Latency, Cost, and Eval Scores (full-width) widgets
Actions: CSV Export · Duplicate · Reorder

Arrange, share, and export

Drag widgets to reorder. Resize from quarter-width to full-width on the 12-column grid. Duplicate widgets to iterate fast. Export any chart as CSV for offline analysis.
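
As a rough picture of what a chart's CSV export contains, the sketch below serializes a small series with Python's standard `csv` module. The column names and values are made up, and the actual export format may differ.

```python
import csv
import io


def chart_to_csv(rows, columns):
    """Serialize chart rows (a list of dicts) to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample series: one row per day.
series = [
    {"day": "Mon", "latency_p95_ms": 910},
    {"day": "Tue", "latency_p95_ms": 847},
]
csv_text = chart_to_csv(series, ["day", "latency_p95_ms"])
print(csv_text)
```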

Powering teams from
prototype to production

From ambitious startups to global enterprises, teams trust Future AGI to ship AI agents confidently.