We evaluate DeepCode on the PaperBench benchmark (released by OpenAI), a rigorous testbed requiring AI agents to independently reproduce 20 ICML 2024 papers from scratch. The benchmark comprises 8,316 ...
On Tuesday, French AI startup Mistral AI released Devstral 2, a 123-billion-parameter open-weights coding model designed to work as part of an autonomous software engineering agent. The model achieves ...
A maximum-severity security flaw has been disclosed in React Server Components (RSC) that, if successfully exploited, could result in remote code execution. The vulnerability, tracked as ...
OpenAI has introduced GPT‑5.1-Codex-Max, a new frontier agentic coding model now available in its Codex developer environment. The release marks a significant step forward in AI-assisted software ...
Cursor’s new Composer model, built for low-latency agentic coding, completes most iterations in under 30 seconds, according to Anysphere. Anysphere has introduced Cursor 2.0, an update to the AI ...
The vibe coding tool Cursor, from startup Anysphere, has introduced Composer, its first in-house, proprietary coding large language model (LLM) as part of its Cursor 2.0 platform update. Composer is ...
Diligent Robotics, which deploys mobile manipulation robots in hospitals, today unveiled plans for Moxi 2.0, the latest generation of its platform. The company said the launch builds on three years of ...
What if your code could think for itself, anticipating your next move, debugging with precision, and even automating entire workflows? With the release of Claude Code 2.0, this isn’t just a futuristic ...
In this paper, we propose NeRD (Neural Robot Dynamics), learned robot-specific dynamics models for predicting future states for articulated rigid bodies under contact constraints. NeRD uniquely ...
Tesla has unveiled a gold-colored iteration of its Optimus humanoid robot and clarified that the machine is an ...