🦞 These innovations come together to create a model that is well suited for long-running autonomous agents.
On PinchBench—a benchmark for evaluating LLMs as @OpenClaw coding agents—Nemotron 3 Super scores 85.6% across the full test suite, making it the best open model in its… https://twitter.com/NVIDIAAIDev/status/2031774925522940048/photo/1