Community Evals: Beyond Black-Box Leaderboards
Details in article.
Google has released its February 2026 Discover core update, a broad enhancement to its systems that surface articles in Discover. Testing indicates th...
OpenAI has launched 'Trusted Access for Cyber,' a new framework designed to expand access to its advanced cyber capabilities while reinforcing safegua...
OpenAI Frontier is a new enterprise platform tailored for building, deploying, and managing AI agents within organizations. It offers essential featur...
Parallel has been integrated into the Vercel Agent Marketplace, providing native support for its suite of web tools and agents specifically designed f...
Vercel has enhanced its build logs with interactive links, enabling users to directly navigate to internal and external resources. This feature aims t...
Parallel's LLM-optimized web search and additional tools are now available on Vercel's AI Gateway, offering universal compatibility across various lar...
GitHub's Agent HQ now provides public preview access to Anthropic's Claude and OpenAI Codex for Copilot Pro+ and Enterprise subscribers. This integrat...
Mistral has released Voxtral Transcribe 2, an updated family of audio-to-text transcription models, including an open-weights version. This marks a si...
This piece explores effective strategies for distributing self-contained Go binary applications via PyPI, utilizing tools like 'go-to-wheel'. The meth...
LangChain published a guide on constructing sophisticated multi-agent AI systems using Deep Agents, emphasizing the effectiveness of breaking down com...