Company News

Future AGI May Roundup


Last Updated

May 31, 2025


By

Rishav Hada

Time to read

13 mins


At Future AGI, we’re focused on solving the real challenges teams face when working with LLMs - from evaluation and observability to faster iteration and safer deployment. May was a steady step forward, with practical product upgrades, hackathons, webinars, podcasts, and a case study showing what better infra and workflows can unlock.

Here’s everything we built, supported, and shared this month.


✅ Product Updates

Introduced Future AGI MCP Server 

We’re excited to announce that Future AGI now runs its own MCP (Model Context Protocol) server, enabling seamless integration with tools like Claude, Cursor, Crew AI, and any other MCP-compatible clients.

With this integration, you can now connect your LLM workflows directly to Future AGI’s evaluation engine - no context switching, no manual uploads.

The MCP protocol standardizes how models, tools, and evaluation layers communicate. By running our own MCP server, we’ve made it easier for teams to:

  • Run inline evaluations during development and testing

  • Automate feedback loops for continuous improvement

  • Debug agent behavior with structured trace-level insights
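For a feel of what this looks like in practice, here’s a minimal client-side sketch using the official MCP Python SDK. The server command (`futureagi-mcp-server`) and the `evaluate_response` tool name below are placeholders for illustration, not Future AGI’s actual endpoints - the docs have the real configuration.

```python
# Minimal MCP client sketch (illustrative only).
# The server command and tool name are placeholders, not Future AGI's
# actual MCP endpoints -- check the official docs for the real setup.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main():
    # Hypothetical command that launches an MCP server over stdio.
    server = StdioServerParameters(command="futureagi-mcp-server", args=[])

    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Discover the evaluation tools the server exposes.
            tools = await session.list_tools()
            print([t.name for t in tools.tools])

            # Run a hypothetical inline evaluation on a model response.
            result = await session.call_tool(
                "evaluate_response",  # placeholder tool name
                arguments={
                    "input": "What is our refund policy?",
                    "output": "Refunds are issued within 30 days.",
                },
            )
            print(result)


asyncio.run(main())
```

Because MCP standardizes the tool-calling surface, the same server works unchanged from Claude, Cursor, Crew AI, or any other MCP-compatible client.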

👉 Ready to democratize AI workflows at your organization? Explore the docs and get started today.


30% Faster Synthetic Data Generation with Improved UI

We’ve rolled out a major update to our Synthetic Data Generation workflow - making it faster, easier to use, and better suited for teams working in regulated environments like finance, healthcare, and legal.

🔧 What’s New:

  • Revamped UI/UX: A cleaner interface now guides users step-by-step with sample examples, making the process more intuitive, even for non-technical users.

  • 30% Faster Generation: We’ve improved system performance to reduce the time it takes to generate synthetic datasets, helping teams move from raw data to training-ready assets faster than ever.

Whether you're fine-tuning models or building robust eval sets, this update helps you ship faster with safer, smarter data.

👉 Get step-by-step guidance on how to generate synthetic data using Future AGI.


Improved Trace View with Inline Annotations

Debugging and analyzing LLM behavior just got a whole lot smoother.

Our new trace view is now cleaner, more navigable, and optimized for real-time analysis. Whether you're prototyping or evaluating in production, this update helps you move faster with more clarity.

🔍 What’s New:

  • A streamlined interface for exploring traces and spans.

  • Quick filters to slice metadata like evaluation scores, token usage, and processing times.

  • Ability to add and view inline annotations directly in the trace tree.

With a cleaner trace view and inline annotations, understanding model behavior is faster and more precise - helping you go from prototype to production with confidence.
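To make the span-and-attribute model concrete, here’s a rough OpenTelemetry-style sketch (not Future AGI’s actual SDK calls) of recording an LLM call as a span carrying the kind of metadata - evaluation score, token usage, latency - that the quick filters slice on.

```python
# Illustrative only: records an LLM call as an OpenTelemetry span with
# the kind of metadata the trace view filters on. Attribute names are
# placeholders; Future AGI's SDK handles the actual instrumentation.
import time

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("demo-agent")


def call_llm(prompt: str) -> str:
    with tracer.start_as_current_span("llm.generate") as span:
        start = time.time()
        response = "stubbed model output"  # stand-in for a real model call
        span.set_attribute("llm.prompt", prompt)
        span.set_attribute("llm.completion", response)
        span.set_attribute("llm.token_count", 42)      # placeholder value
        span.set_attribute("eval.score", 0.87)         # placeholder value
        span.set_attribute("latency_ms", (time.time() - start) * 1000)
        return response


call_llm("Summarize our May product updates.")
```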

👉 Dive into our docs to see how prototyping can streamline your LLM development and de-risk production.


50% Faster Dataset Creation for AI Workflows

Creating datasets for LLM experimentation, prompt tuning, or evaluation used to be slow, manual, and error-prone.

Now, it’s automated and up to 50% faster.

With our latest update, you can extract datapoints from traces - including inputs, outputs, latency, evaluation scores, and more - and instantly convert them into structured datasets for analysis or training.
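Conceptually, the extraction boils down to flattening trace records into rows, something like the sketch below. The field names here are hypothetical and the platform does this for you from the UI - the sketch just shows the shape of the resulting dataset.

```python
# Conceptual sketch: turning trace records into a structured dataset.
# Field names are hypothetical; the platform extracts these for you.
import json

traces = [
    {
        "input": "What is our refund policy?",
        "output": "Refunds are issued within 30 days.",
        "latency_ms": 412,
        "eval_scores": {"groundedness": 0.91, "tone": 0.85},
    },
    # ... more trace records
]

# Flatten each trace into one dataset row suitable for analysis or training.
rows = []
for t in traces:
    row = {
        "input": t["input"],
        "output": t["output"],
        "latency_ms": t["latency_ms"],
    }
    row.update({f"score_{k}": v for k, v in t["eval_scores"].items()})
    rows.append(row)

# Write as JSONL, a common format for eval and fine-tuning datasets.
with open("dataset.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")
```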

👉 Read the full release notes here 


🌐 In the Field

Case Study: Future AGI Prompt Playground Cuts Problem Resolution Time by 30%

What do you do when a 500+ agent team still can’t keep up with thousands of daily support tickets? That was the reality for one global tech company - overwhelmed by repetitive Tier-1 queries, burning out agents, and watching CSAT scores drop fast.

Future AGI’s Prompt Playground flipped the script: by automating 85% of Tier-1 tickets, optimizing prompts in real time, and streamlining triage, the team cut average resolution time by 30% while freeing agents to tackle 25% more high-value cases.

Future AGI’s Improve Existing Prompt feature enabled the customer support team to refine prompts in real time based on agent feedback and evolving support needs.

📖 Read the full case study to see how they did it.


Webinar on "Modern AI Engineering: Strategies That Scale

Too often, AI systems are scaled without the right foundations - leading to high costs, latency issues, and misalignment with business outcomes. As models grow more complex, the need for scalable infrastructure, robust observability, and clear performance tracking becomes non-negotiable.

In this session, we shared a practical playbook for modern AI engineering, featuring Sandeep Kaipu, Engineering Leader at Broadcom, and Nikhil Pareek, Founder at Future AGI. From aligning AI initiatives with real KPIs to designing scalable systems and embedding compliance from day one, we covered the critical steps to take AI from prototype to production, at scale.

Perfect for teams looking to build faster, operate smarter, and deploy AI that performs in the real world.

👉 Watch the full webinar here!


Podcast on "Unlocking Product Management with Reliable AI

We aired another awesome episode of ‘Accelerate AI’, where Nikhil sat down with Jorge Alcantara, Founder & CEO of Zentrix, for a sharp and honest conversation on how AI is reshaping product management.

In this episode, Jorge breaks down how PMs can move beyond the noise - automating routine work, debugging agent pipelines, and applying scientific thinking to product strategy.

💡 Key Takeaways:

  • How to design measurable, explainable GenAI products that move beyond surface-level demos

  • Why context-driven product thinking is the key to scaling reliable AI tools

🎧 Tune in here to explore why product managers are becoming Chief Context Officers, and what it takes to build GenAI tools that actually work in the real world.


Co-organized the AWS MCP Agents Hackathon in SF

Last week, we co-organized the AWS MCP Agents Hackathon alongside an incredible lineup of partners - including Anthropic, DuploCloud, Clarifai, Auth0, Make, n8n, and many more.

Developers from across the world came together to build the next generation of AI agents - with over $50,000 in prizes up for grabs. Participants got exclusive access to Future AGI’s evaluation, optimization, and guardrail toolkit, designed to help teams move fast without compromising trust or reliability.


💡 Closing Thoughts

Everything we worked on this month came back to one idea: making it easier for teams to build AI with confidence and care. Whether through new features, community events, or shared learnings, we’re here to support the people behind the progress.

For more updates, join the conversation in our Slack Community or get in touch with us directly. 

Your partner in building Trustworthy AI!


Rishav Hada is an Applied Scientist at Future AGI, specializing in AI evaluation and observability. Previously at Microsoft Research, he built frameworks for generative AI evaluation and multilingual language technologies. His research, funded by Twitter and Meta, has been published in top AI conferences and earned the Best Paper Award at FAccT’24.


Ready to deploy Accurate AI?

Book a Demo