How to Find the Agent Failures Your Evals Miss with Scott Clark

EPISODE 767

MAY 7, 2026

About this Episode

In this episode, Scott Clark, co-founder and CEO of Distributional, joins us to explore how teams can reliably operate and improve complex LLM systems and agents in production. Scott introduces a Maslow-style hierarchy of observability: telemetry for logging, monitoring for known signals, and post-production or online analytics to surface unknown unknowns. We dig into examples of real-world failures Scott’s team has seen in production systems, such as “lazy” tool-use hallucinations that standard evals miss, and how mapping traces into vector fingerprints enables clustering and topic discovery to uncover emergent behaviors. Scott explains how analytics can feed the data flywheel by generating evals, guardrails, and training data, and why online, adaptive approaches are essential for non-stationary models. We also touch on practical how-tos such as instrumentation with OpenTelemetry, the GenAI semantic conventions, and the role of dedicated analytics tools.
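
To make the "traces into vector fingerprints" idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method Scott or Distributional describes: the span names in `VOCAB`, the count-based `fingerprint`, and the greedy cosine-similarity `cluster` function are all hypothetical stand-ins chosen to keep the example dependency-free. A real system would more likely use learned embeddings and a proper clustering library.

```python
import math
from collections import Counter

# Hypothetical span names; a real system would derive these from its telemetry.
VOCAB = ["llm_call", "search_tool", "db_tool", "final_answer"]

def fingerprint(trace):
    """Map a trace (ordered list of span names) to a fixed-length count vector."""
    counts = Counter(trace)
    return [float(counts.get(name, 0)) for name in VOCAB]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def cluster(traces, threshold=0.9):
    """Greedy single-pass clustering: a trace joins the first cluster whose
    seed fingerprint is at least `threshold` similar, else starts a new one."""
    clusters = []  # list of (seed_vector, [trace indices])
    for i, trace in enumerate(traces):
        vec = fingerprint(trace)
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((vec, [i]))
    return [members for _, members in clusters]

traces = [
    ["llm_call", "search_tool", "llm_call", "final_answer"],
    ["llm_call", "search_tool", "llm_call", "final_answer"],
    ["llm_call", "final_answer"],  # "lazy" trace: answers without calling any tool
    ["llm_call", "db_tool", "llm_call", "final_answer"],
]
groups = cluster(traces)
print(groups)
```

In this toy data the trace that skips tool use lands in its own cluster, which is the kind of small, unexpected group an analyst might then inspect and turn into an eval or guardrail.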

About the Guest

Scott Clark

Distributional

Thanks to our sponsor Distributional
