Can we fix AI’s evaluation crisis?

As a tech reporter I often get asked questions like “Is DeepSeek actually better than ChatGPT?” or “Is the Anthropic model any good?” If I don’t feel like turning it into an hour-long seminar, I’ll usually give the diplomatic answer:…

Continue Reading