
Coral Protocol, a decentralized infrastructure protocol for AI development, has published independent benchmarking results for its flagship AI mini-model. The product scored 34% on the GAIA Benchmark, a record for its class that underscores the power of small AI models.
Coral Protocol's AI mini-model outperforms all competitors on GAIA test
According to an official statement from Coral Protocol, its newest mini-model achieved previously unseen results on the GAIA benchmark, hitting 34% and outperforming major rivals, including Microsoft's Magentic-UI.
Coral achieved the highest score on the GAIA Benchmark for verified systems using mini agents, validating NVIDIA's thesis that smaller models, when orchestrated intelligently, represent the industry's future. However, the team says the result had less to do with building a powerful system than with changing the way we think about scaling AI systems altogether.
An open protocol, Coral is designed to push AI beyond its typical capacity. Rather than scaling up general models, it scales intelligence by layering in focused, specialized agents from around the world. Through secure, parallel multi-agent coordination, Coral enables any language model, large or small, to operate more effectively, delivering stronger reasoning, planning and problem-solving.
Caelum Forder, CTO of Coral Protocol, explains what this accomplishment means for the AI sector as a whole, and for the mini-model scene in particular:
This breakthrough marks a turning point in AI infrastructure. It’s proof that horizontal scaling isn’t just possible – it’s practical, and Coral is the most effective way to do it. The Internet of Agents is now a working reality. If you are an agent developer, just Coralise it. If you are an application developer, build it better for less using our infrastructure.
A multi-layered evaluation suite for advanced AI capabilities, the GAIA Benchmark measures the ability of AI systems to solve real-world tasks that demand significant time and effort from skilled humans. It consists of 466 nontrivial questions requiring intensive research, data analysis and reasoning. Developed to evaluate LLM agents on their ability to act as general-purpose AI assistants, GAIA has become an industry standard for measuring agentic performance.
Milestone highlights opportunities for small models in AI
Competition to build the most advanced agentic systems has intensified, and the prevailing trend has been toward ever larger models to handle ever more complex tasks. Coral's results, however, fly in the face of convention and bear out the findings of a recent NVIDIA paper arguing that smaller systems are sufficiently powerful without sacrificing speed, security or cost.
Coral's GAIA Agent System used in the test is an application built on the eponymous protocol and heavily inspired by CAMEL's OWL. It deploys specialized agents for a multitude of tasks, such as answer finding, assistance, critique, image analysis, planning, problem solving, search, video processing and web browsing. Agents interface with one another through the Coral server's MCP communication tools, as sketched below.
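For readers curious about what such coordination looks like in code, here is a minimal, hypothetical sketch of the specialized-agent pattern described above. All names (Coordinator, Message, planner, search) are illustrative assumptions, not Coral's actual API; the real system routes messages between agents via the Coral server's MCP tools rather than an in-process dispatcher like this one.

```python
# Hypothetical sketch of a specialized-agent system. This is NOT Coral's
# API: in the real system, agents talk through the Coral server's MCP
# communication tools. Here a simple in-process Coordinator stands in
# for that routing layer.
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class Message:
    sender: str
    recipient: str
    content: str

@dataclass
class Coordinator:
    """Stand-in for the message-routing role of the Coral server."""
    agents: dict[str, Callable[[Message], Optional[Message]]] = field(default_factory=dict)

    def register(self, name: str, handler: Callable[[Message], Optional[Message]]) -> None:
        # Each specialized agent (planner, search, critique, ...) registers a handler.
        self.agents[name] = handler

    def send(self, msg: Message) -> None:
        # Deliver the message to the named agent; forward any reply it produces.
        reply = self.agents[msg.recipient](msg)
        if reply is not None:
            self.send(reply)

def planner(msg: Message) -> Optional[Message]:
    # A planning agent decomposes the user's task and delegates to a searcher.
    return Message("planner", "search", f"find sources for: {msg.content}")

def search(msg: Message) -> Optional[Message]:
    # A search agent would call a real retrieval tool here; we just log it.
    print(f"[search] {msg.content}")
    return None

bus = Coordinator()
bus.register("planner", planner)
bus.register("search", search)
bus.send(Message("user", "planner", "What year was the GAIA benchmark released?"))
```

The recursive send call stands in for the multi-hop, agent-to-agent exchanges that, per the article, Coral coordinates securely and in parallel across many specialized agents.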
Topping the GAIA Benchmark leaderboard for small models illustrates Coral's ability to improve the capabilities of AI systems through its graph-based architecture. It also gives developers confidence that they can build powerful yet lightweight agents backed by small models. Such systems can work with more information, integrate more easily into other ecosystems and benefit from better interconnectivity.