IK Workshop: LLM Output Evaluation + Networking
About this Event
Building AI applications has never been easier.
Evaluating them has never been harder.
As teams race to ship AI-powered products, a critical challenge continues to emerge: How do you know if your LLM is actually performing well?
Traditional software testing was built for deterministic systems. LLMs introduce ambiguity, subjectivity, and constantly evolving behavior—making evaluation one of the most important (and misunderstood) challenges in modern AI development.
This practitioner-led workshop explores the emerging discipline of LLM evaluation: how leading teams measure quality, where automated evaluation succeeds and fails, and the common pitfalls that can undermine even well-designed AI systems.
PLUS: Network with fellow Seattle-area engineers, builders, and AI enthusiasts over food and refreshments.
What You'll Learn & Do
- Evaluation Fundamentals: Why traditional testing approaches break down when applied to generative AI systems.
- LLM-as-a-Judge: How teams use LLMs to evaluate other LLMs, and where this approach can produce misleading results.
- Failure Modes in Production: Common evaluation mistakes and blind spots that organizations often discover only after deployment.
- Hands-On Evaluation Exercise: Step into the role of the evaluator. Review outputs, make judgment calls, and compare your decisions with those generated by AI evaluators.
- Designing Better Evaluation Systems: Practical approaches for creating evaluation frameworks that balance speed, quality, and reliability.
- Open Discussion & Q&A: Bring your questions, challenges, and experiences to an interactive discussion with fellow practitioners.
Workshop Format
30% Industry Insights
Evaluation strategies, emerging trends, and lessons from real-world AI deployments.
40% Interactive Learning
Guided exercises designed to help participants think like evaluators.
30% Discussion & Q&A
Collaborative conversations around challenges, trade-offs, and practical implementation.
Who Should Attend?
- Software Engineers building AI-powered products
- Product Managers working with LLM-based features
- AI Engineers and ML Practitioners
- Technical Program Managers and Engineering Leaders
- Anyone responsible for evaluating AI system quality and reliability
No ML background required. Familiarity with software systems and an interest in AI applications are helpful.
USD 16.58 to USD 55.28


















