This is a heavily interactive web application, and JavaScript is required. Simple HTML interfaces are possible, but that is not what this is.
Post
Michael Kirchhof (ICML)
mkirchhof.bsky.social
did:plc:wziokjxwnyunrpmoptqxveas
We put SelfReflect to the test (plus a whole landscape of other metrics we’ve explored). Our final metric is able to distinguish even almost-good from good summaries and pick up implicit distributional statements. We find it aligns with human judgements. So let’s use it: 🧵6/9
2025-07-03T09:08:56.647Z