GLM 5.2 Performance Benchmarks

by theanonymousone | 53 points | 12 comments | 2026-06-17 02:30:51 Central

Comments

wongarsu
It does really well on "AA-Omniscience Non-Hallucination Rate", far higher than DeepSeek, GPT 5.5, or Fable. I really like that benchmark because it's one of the few that lets LLMs decline to answer when they are unsure and penalizes them for trying to bullshit their way through.

> andai
This implies that other benchmarks (for which every AI provider is optimizing?) are actively encouraging bullshitting?
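
To make the incentive in this exchange concrete, here is a minimal sketch of why accuracy-only scoring rewards guessing while a scheme that penalizes wrong answers does not. The scoring values (+1 for a correct answer, 0 for abstaining, a configurable penalty for a wrong answer) and the function names are assumptions for illustration; the thread does not state how AA-Omniscience is actually scored.

```python
# Hypothetical scoring model, not the benchmark's real formula:
# +1 for a correct answer, -wrong_penalty for a wrong one, 0 for abstaining.
ABSTAIN_SCORE = 0.0


def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score of answering when the model is right with probability p_correct."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def should_answer(p_correct: float, wrong_penalty: float) -> bool:
    """Answer only if the expected score beats the score for abstaining."""
    return expected_score(p_correct, wrong_penalty) > ABSTAIN_SCORE


if __name__ == "__main__":
    for p in (0.1, 0.4, 0.6, 0.9):
        accuracy_only = should_answer(p, wrong_penalty=0.0)
        penalized = should_answer(p, wrong_penalty=1.0)
        print(f"p={p}: accuracy-only answers={accuracy_only}, penalized answers={penalized}")
```

Under these assumptions, with no penalty any nonzero chance of being right makes guessing at least as good as abstaining, so a model tuned for that metric has no reason to say "I don't know". With a penalty equal to the reward, answering only pays off when the model is more than 50% likely to be correct.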

hemkeshr
Local models are already useful today. The next milestone is getting this level of performance onto truly affordable hardware.

XCSme
I also tested it [0]: quite similar to GLM 5, a few percent better, 30% faster and 50% more expensive.
[0]: https://aibenchy.com/?q=glm

> XCSme
PS: Just added a cool feature, so you can filter the leaderboard for multiple models at once, by using a comma, like: https://aibenchy.com/?q=glm,claude

> lousken
still 1/4 of the price of anthropic and openai models though

lanycrost
It's always nice to see open source models growing. I hope we will get good performance on lower-tier hardware some day.

theturtletalks
I want to trust their benchmarks but when they have Muse Spark over GPT-5.5, it gives me pause.

sourcecodeplz
Still quite verbose at 140M output tokens, but this is on max thinking. High should do better.

ChrisArchitect
Some more discussion: https://news.ycombinator.com/item?id=48567759

DeathArrow
One or two more releases and they will reach Fable level.

> vitalyan123
by then there will be Fable 5.21, again 5% ahead of every other SotA while still only 500% the size.