
Aug 18 – Aug 22, 2025 (2025 W34)

Summary Dashboards, Alerts Revamp, and Prompt SDK

Rebuilt summary dashboards with rich visualizations, a completely revamped alerts system, and powerful new Prompt SDK capabilities.

Monitor SDK Platform Evaluate

4 Role levels

3 Chart types

What's in this digest

Monitor: Summary screen revamp (New)
Monitor: Alerts revamp with Slack and email (New)
SDK: Prompt SDK upgrades (New)
Platform: Workspaces RBAC (New)
Platform: AWS Marketplace integration (Improved)
SDK: Error localizer via SDK (Improved)
Evaluate: Critical issue detection on datasets (Improved)
Monitor: Prompt metrics in Observe (Improved)
SDK: traceAI optional dependencies cleanup (Fixed)

Summary Screen Revamp

Data without visualization is just noise. The rebuilt summary screen turns raw evaluation and monitoring data into clear, actionable visuals. Three new chart types (spider, bar, and pie) give you multiple perspectives on your agent’s performance.

Spider charts map multi-dimensional evaluation scores onto a single view. See how your agent performs across faithfulness, relevance, completeness, and safety simultaneously. Bar charts break down performance by category, time period, or prompt version. Pie charts show distribution of quality tiers, error types, or evaluation outcomes.
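
To make the spider chart idea concrete, here is a minimal sketch of the standard radar-chart geometry: each metric gets an evenly spaced axis and the score sets the distance from the center. It is not the platform's rendering code. The function name, the 0 to 1 score scale, and the sample scores are all assumptions made for illustration.

```python
import math


def spider_vertices(scores: dict[str, float]) -> dict[str, tuple[float, float]]:
    """Map each metric score (assumed 0.0 to 1.0) to an (x, y) radar-chart point.

    Axes are spaced evenly around the circle, starting at 12 o'clock and
    moving clockwise. The score is the distance from the center.
    """
    n = len(scores)
    points = {}
    for i, (metric, score) in enumerate(scores.items()):
        angle = math.pi / 2 - 2 * math.pi * i / n
        x = round(score * math.cos(angle), 3)
        y = round(score * math.sin(angle), 3)
        points[metric] = (x, y)
    return points


# Hypothetical scores for the four dimensions named above.
scores = {"faithfulness": 0.9, "relevance": 0.8, "completeness": 0.6, "safety": 1.0}
vertices = spider_vertices(scores)
for metric, point in vertices.items():
    print(metric, point)
```

Connecting the points in order gives the polygon a spider chart draws. A dent toward the center (here, completeness) marks the weak dimension.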

The comparison view is where this gets powerful. Place any two evaluation runs, prompt versions, or time periods side-by-side and see exactly what changed. No more switching between tabs and trying to hold numbers in your head. The visual diff makes regressions obvious and improvements measurable.
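
The logic behind a side-by-side comparison can be sketched in a few lines. The example below is an illustration only, not the product's implementation. It assumes higher scores are better, and `diff_runs`, the tolerance value, and the sample runs are all hypothetical.

```python
def diff_runs(
    baseline: dict[str, float],
    candidate: dict[str, float],
    tolerance: float = 0.01,
) -> dict[str, dict]:
    """Compare two runs metric by metric, assuming higher scores are better."""
    report = {}
    # Only metrics present in both runs can be compared.
    for metric in sorted(baseline.keys() & candidate.keys()):
        delta = round(candidate[metric] - baseline[metric], 4)
        if delta < -tolerance:
            status = "regressed"
        elif delta > tolerance:
            status = "improved"
        else:
            status = "unchanged"
        report[metric] = {"delta": delta, "status": status}
    return report


# Hypothetical scores for two prompt versions.
run_a = {"faithfulness": 0.91, "relevance": 0.84, "safety": 0.99}
run_b = {"faithfulness": 0.86, "relevance": 0.88, "safety": 0.99}
report = diff_runs(run_a, run_b)
for metric, row in report.items():
    print(f"{metric}: {row['delta']:+.2f} ({row['status']})")
```

The tolerance keeps small run-to-run noise from being reported as a regression or an improvement.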

Alerts Revamp

The previous alerting system was functional but limited. The revamped alerts infrastructure is built for production teams that need to know the moment quality degrades.

Slack and email are now first-class notification channels. Set a threshold on any metric (for example, hallucination rate above 5%, latency above 2 seconds, or evaluation scores below baseline) and get notified in the channels your team already uses as soon as it is crossed. Intelligent alert grouping prevents notification fatigue by consolidating related alerts into a single, actionable message.
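
The sketch below illustrates the two ideas in this paragraph: threshold checks and alert grouping. The changelog does not say how alerts are grouped, so grouping by metric name is an assumption. The `Alert` class, both functions, and the sample data are hypothetical and are not part of any Future AGI API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    metric: str
    service: str
    value: float
    threshold: float


def check_thresholds(samples: list[dict], thresholds: dict[str, float]) -> list[Alert]:
    """Return an Alert for every sample whose value exceeds its metric's threshold."""
    return [
        Alert(s["metric"], s["service"], s["value"], thresholds[s["metric"]])
        for s in samples
        if s["metric"] in thresholds and s["value"] > thresholds[s["metric"]]
    ]


def group_alerts(alerts: list[Alert]) -> list[str]:
    """Consolidate alerts that share a metric into one message per metric.

    Grouping by metric name is an assumption made for this sketch.
    """
    grouped = defaultdict(list)
    for alert in alerts:
        grouped[alert.metric].append(alert)
    messages = []
    for metric, items in grouped.items():
        services = ", ".join(sorted({a.service for a in items}))
        worst = max(a.value for a in items)
        messages.append(
            f"{metric} above {items[0].threshold} on {services} (worst: {worst})"
        )
    return messages


# Thresholds taken from the examples above: 5% hallucination rate, 2 s latency.
thresholds = {"hallucination_rate": 0.05, "latency_seconds": 2.0}
samples = [
    {"metric": "hallucination_rate", "service": "support-bot", "value": 0.08},
    {"metric": "hallucination_rate", "service": "search-bot", "value": 0.06},
    {"metric": "latency_seconds", "service": "support-bot", "value": 1.4},
]
messages = group_alerts(check_thresholds(samples, thresholds))
print(messages)
```

Two hallucination-rate breaches collapse into one message, and the latency sample stays silent because it is under its threshold.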

Alert rules are composable. Combine multiple conditions with AND/OR logic. Set different severity levels. Route critical alerts to PagerDuty while sending informational alerts to a Slack channel. The alerting system adapts to your operational workflow, not the other way around.
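
One way to picture composable rules is as small predicate functions joined by AND/OR combinators, with severity deciding the destination. This is a sketch under assumptions, not the product's rule format. `above`, `below`, `all_of`, `any_of`, `Rule`, and the `ROUTES` table are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable

Metrics = dict[str, float]
Condition = Callable[[Metrics], bool]


def above(metric: str, limit: float) -> Condition:
    """Fire when the metric is present and greater than the limit."""
    return lambda m: m.get(metric, float("-inf")) > limit


def below(metric: str, limit: float) -> Condition:
    """Fire when the metric is present and less than the limit."""
    return lambda m: m.get(metric, float("inf")) < limit


def all_of(*conditions: Condition) -> Condition:
    """AND: every condition must hold."""
    return lambda m: all(c(m) for c in conditions)


def any_of(*conditions: Condition) -> Condition:
    """OR: at least one condition must hold."""
    return lambda m: any(c(m) for c in conditions)


@dataclass
class Rule:
    name: str
    condition: Condition
    severity: str  # "critical" or "info" in this sketch


# Hypothetical routing table: severity level to destination.
ROUTES = {"critical": "pagerduty", "info": "slack"}


def evaluate(rules: list[Rule], metrics: Metrics) -> list[tuple[str, str]]:
    """Return (rule name, destination) for every rule that fires."""
    return [(r.name, ROUTES[r.severity]) for r in rules if r.condition(metrics)]


rules = [
    Rule(
        "quality-and-latency",
        all_of(above("hallucination_rate", 0.05), above("latency_seconds", 2.0)),
        "critical",
    ),
    Rule(
        "any-degradation",
        any_of(above("hallucination_rate", 0.05), below("eval_score", 0.7)),
        "info",
    ),
]
fired = evaluate(
    rules,
    {"hallucination_rate": 0.07, "latency_seconds": 1.2, "eval_score": 0.82},
)
print(fired)
```

Because combinators return ordinary conditions, they nest freely: an `any_of` can sit inside an `all_of` without special handling.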

Prompt SDK — Production-Grade Prompt Management

Prompts in production need more than version control. The upgraded Prompt SDK introduces three capabilities that teams have been requesting since launch.

Caching eliminates redundant prompt fetches. The SDK caches prompt versions locally with configurable TTL, reducing API calls and improving response times for high-throughput applications.

A/B testing is now built into the SDK itself. Define traffic splits between prompt versions and the SDK handles routing automatically, collecting performance data for each variant.

Multi-environment deployment lets you promote prompts through development, staging, and production environments with explicit gates between each stage.
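
The following sketch shows the general techniques behind two of these features: a TTL cache in front of a fetch function, and a deterministic percentage split across prompt versions. It does not use or reproduce the Prompt SDK's actual API. `PromptCache`, `pick_variant`, `fake_fetch`, and the hash-based bucketing are assumptions chosen for illustration.

```python
import hashlib
import time


class PromptCache:
    """Cache fetched prompts in memory and refetch once the TTL has elapsed."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch      # callable: (name, version) -> prompt text
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be demonstrated
        self._entries = {}       # (name, version) -> (expires_at, text)

    def get(self, name, version):
        key = (name, version)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: no fetch needed
        text = self._fetch(name, version)
        self._entries[key] = (now + self._ttl, text)
        return text


def pick_variant(user_id, split):
    """Assign a user to a variant by percentage weights, the same way every time."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    cumulative = 0
    for variant, percent in split.items():
        cumulative += percent
        if bucket < cumulative:
            return variant
    raise ValueError("split percentages must sum to 100")


# Stand-in for a remote prompt fetch; records each call so hits are visible.
calls = []


def fake_fetch(name, version):
    calls.append((name, version))
    return f"{name}@{version}"


fake_now = [0.0]
cache = PromptCache(fake_fetch, ttl_seconds=30.0, clock=lambda: fake_now[0])
cache.get("greeting", "v1")   # miss: fetches
cache.get("greeting", "v1")   # hit: served from cache
fake_now[0] = 31.0
cache.get("greeting", "v1")   # expired: fetches again
print(len(calls), "fetches for 3 reads")
print(pick_variant("user-42", {"v1": 90, "v2": 10}))
```

Hashing the user ID rather than drawing a random number keeps each user on the same variant across requests, which makes per-variant performance data easier to interpret.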

Workspaces RBAC

As organizations scale their AI operations, access control becomes critical. The new RBAC framework introduces four distinct role levels: Owner has full administrative control, Admin manages team members and configurations, Member can create and run evaluations and simulations, and Viewer has read-only access to dashboards and results.

Permissions are granular. Control who can modify evaluation criteria, who can trigger simulations, and who can access production traces. For regulated industries, this provides the audit trail and access governance that compliance requires.
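
As a rough mental model, the four roles can be treated as an ordered hierarchy in which each action requires a minimum role. The sketch below is an assumption, not the platform's permission model: the strict ordering, the action names, and the mapping in `REQUIRED_ROLE` are all hypothetical, and the real permissions may be more granular than a single ladder.

```python
from enum import IntEnum


class Role(IntEnum):
    # The four role levels named above, ordered from least to most privileged.
    VIEWER = 1
    MEMBER = 2
    ADMIN = 3
    OWNER = 4


# Hypothetical actions and the minimum role assumed to be required for each.
REQUIRED_ROLE = {
    "view_dashboard": Role.VIEWER,
    "run_evaluation": Role.MEMBER,
    "trigger_simulation": Role.MEMBER,
    "manage_members": Role.ADMIN,
    "delete_workspace": Role.OWNER,
}


def is_allowed(role: Role, action: str) -> bool:
    """Allow an action only if it is known and the role meets its minimum level."""
    required = REQUIRED_ROLE.get(action)
    return required is not None and role >= required


for role in Role:
    allowed = [action for action in REQUIRED_ROLE if is_allowed(role, action)]
    print(role.name, allowed)
```

Unknown actions are denied by default, so a missing entry in the table fails closed instead of granting access.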

Additional Updates

Future AGI is now available on AWS Marketplace, enabling procurement through existing AWS agreements with consolidated billing.

The error localizer is available as a standalone SDK function for both sync and async workflows, helping teams pinpoint exactly where in an agent’s execution chain a failure occurred.

Datasets now surface critical issues automatically with specific mitigation advice.

traceAI has slimmed down its dependency tree by making framework-specific packages optional, so you only install what you actually use.
