Gunnar Grosch
gunnargrosch.com
did:plc:4pu2uexznbblpnl7yyjb2bvd
New post: Lightweight evals for AI agent output quality. Structural checks, LLM-as-judge, calibration loop. RISEN prompts double as your eval spec. Runnable demo with two domains.
https://dev.to/gunnargrosch/evaluating-agent-output-quality-lightweight-evals-without-a-framework-38gk
https://builder.aws.com/content/3ARUgrl7zf386B8vucsOyASQRty/evaluating-agent-output-quality-lightweight-evals-without-a-framework
2026-03-03T16:49:05.914Z