As LLMs evolve into multi-step agents, evaluation must move beyond input-output correctness. This session shows how to assess both individual node-level decisions and complete agentic flows, with a live demo of multi-node agents and practical evaluation techniques.
Hilik Paz is the Co-Founder and CTO of arato.ai, a platform that helps teams build, evaluate, and deploy reliable GenAI agents. With a strong foundation in software engineering and a focus on applied AI, Hilik has been instrumental in developing tools that bridge the gap between LLM capabilities and real-world applications.