[release] 6 min · Apr 5, 2026

Cursor 3 — The Agent Operating System with an IDE Attached

Cursor 3 ↗ Apr 2, 2026

#ai-agents #cursor #developer-tools #agentic-infrastructure

Cursor shipped version 3 on April 2, 2026 — not a feature drop, not a UI refresh, but a complete ground-up rebuild of the interface around a single premise: you should be running fleets of agents, not typing at one. The editor pane you knew is still there, but it is no longer the default. The Agents Window is.

TL;DR

What: Cursor 3 rebuilds the entire interface from scratch around parallel agent orchestration
Key shift: Agents Window is now the default view — the IDE is the fallback, not the main event
Model: Composer 2 (built on Kimi K2.5) powers the Agents Window at $0.50/$2.50 per million input/output tokens — cheaper on token pricing than Claude Opus 4.6
Action: If you are on Pro or Ultra, this is a free update worth exploring today — particularly the cloud-local handoff and /worktree workflow

Cursor 3 — What Happened

The primary interface is the Agents Window — a standalone workspace built from scratch (not extended from the VS Code fork) for running many agents simultaneously across local, worktree, cloud, and remote SSH environments. You can switch back to the IDE at any time, or run both windows side by side. But the direction of travel is clear: the IDE is now a supporting actor.

Composer 2 ships as the default model for the Agents Window. Released on March 19, 2026, two weeks before Cursor 3 landed, it is built on Kimi K2.5, an open-source model from Moonshot AI, with custom reinforcement learning on top. Notably, Cursor did not disclose the Kimi K2.5 base at launch; it became public on March 20, 2026, after a user found it in API request headers. Cursor has not made a habit of hiding its model stack, but the omission is worth flagging: knowing which model underpins your default coding assistant is not a trivia question.

Composer 2 scores 61.3 on CursorBench (a 37% improvement over Composer 1.5), 61.7 on Terminal-Bench 2.0, and 73.7 on SWE-bench Multilingual. Priced at $0.50 per million input tokens and $2.50 per million output tokens, it undercuts Claude Opus 4.6 meaningfully on token pricing. Cursor’s own pricing is unchanged: $20/month Pro, $200/month Ultra (20× model usage, guaranteed compute), free Hobby tier for evaluation. Cursor 3 is a free update for all existing subscribers.
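To make the token pricing concrete, here is a minimal sketch of the arithmetic at the quoted Composer 2 rates. The function name and the example token counts are illustrative assumptions, not figures from Cursor, and the result covers raw token cost only, not anything bundled into a Pro or Ultra subscription.

```python
# Rates quoted in this article for Composer 2, in USD per million tokens.
INPUT_USD_PER_MTOK = 0.50
OUTPUT_USD_PER_MTOK = 2.50


def composer2_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate raw token cost in USD at the quoted Composer 2 rates."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical agent run: 400k tokens read, 60k tokens written.
example_cost = composer2_cost_usd(400_000, 60_000)
print(f"${example_cost:.2f}")  # prints $0.35
```

Output tokens cost five times as much as input tokens at these rates, so agents that write long diffs will move the bill more than agents that mostly read.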

Why This Matters

I’ve watched Cursor iterate fast since the VS Code fork, but Cursor 3 is the first release where I believe the roadmap framing: they are not building a better editor, they are building an agent operating system that happens to have an editor attached.

The Agents Window is not a cosmetic change. It encodes a specific belief about how software will get built — not a developer directing a single agent through a conversation, but a developer orchestrating a fleet of agents running in parallel, reviewing outputs, merging results. Every architectural decision in Cursor 3 follows from that belief.

Cloud-local handoff is the infrastructure claim. Start a task locally, hand it to a cloud sandbox when you go offline or need to context-switch, pull results back when you return. Cloud agents produce demo videos and screenshots for verification — previously web-only, now in the desktop app. This is the feature that separates Cursor 3 from every AI editor that runs agents only as long as your laptop lid is open.

Design Mode is the UX claim. Toggle it with ⌘+Shift+D, use Shift+drag to select a browser-rendered UI element, and the agent targets that specific component without you writing a text description of it. For frontend work, this changes the interaction model from “describe what you want changed” to “point at it.” The difference in iteration speed is not marginal.

The /worktree and /best-of-n commands are the workflow claim. /worktree creates an isolated git worktree so agent changes happen in a branch, not in your working tree. /best-of-n runs the same prompt across multiple models simultaneously and surfaces results side by side. Neither of these features exists natively in GitHub Copilot or Claude Code. Copilot runs agents sequentially inside the editor. Claude Code operates as a CLI tool with no native parallel execution. Cursor 3 is shipping UX that neither competitor has on their roadmap in any announced form — which puts it 6–12 months ahead on agent orchestration specifically.
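For readers who have not used worktrees, the isolation described above rests on a standard git feature. The sketch below builds the plain `git worktree add -b` command that gives a new branch its own directory. It is not Cursor's implementation of /worktree, and the branch and directory names are hypothetical.

```python
import subprocess
from pathlib import Path


def worktree_add_command(repo: Path, branch: str, worktree_dir: Path) -> list[str]:
    """Build the git command that checks out a new branch into its own directory."""
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(worktree_dir)]


def create_worktree(repo: Path, branch: str, worktree_dir: Path) -> None:
    """Run the command; raises CalledProcessError if git rejects it."""
    subprocess.run(worktree_add_command(repo, branch, worktree_dir), check=True)


if __name__ == "__main__":
    # Hypothetical names; only prints the command, nothing is executed.
    cmd = worktree_add_command(Path("."), "agent/refactor-auth", Path("../wt-refactor-auth"))
    print(" ".join(cmd))
```

Because each worktree has its own working directory and checked-out branch, several agents can edit the same repository at once without touching the files you have open.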

The /best-of-n command is referenced in secondary sources and the Cursor changelog, but was not explicitly detailed in the official Cursor 3 blog post at launch. Treat it as a shipped feature, but verify behavior in your own environment before building workflows around it.

The Composer 2 pricing is also a deliberate positioning move. At $0.50/$2.50 per million tokens, it is substantially cheaper than the frontier Anthropic models Cursor previously leaned on. Cursor is not just building an agent interface — it is building a vertically integrated stack where it controls the model, the orchestration layer, and the pricing. That is a different competitive moat than “we have the best IDE.”

If you are evaluating Cursor 3’s agent quality independent of the Agents Window UX, run the same prompt through Composer 2 and Claude Sonnet in parallel using /best-of-n. The benchmark numbers favor Composer 2 on multilingual codebases; Sonnet still edges it on reasoning-heavy tasks. The comparison is worth doing on your own codebase before committing to Composer 2 as default.
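If you want to run the same kind of side-by-side comparison outside the editor, the fan-out pattern is simple to sketch. The code below sends one prompt to several model callables concurrently and collects the outputs by name. The `fake_*` functions are stand-ins for real API clients, and nothing here reflects how Cursor implements /best-of-n internally.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

ModelFn = Callable[[str], str]


def best_of_n(prompt: str, models: dict[str, ModelFn]) -> dict[str, str]:
    """Send one prompt to every model concurrently; return outputs keyed by model name."""
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        futures = {name: pool.submit(fn, prompt) for name, fn in models.items()}
        # result() blocks until each call finishes and re-raises any exception it hit.
        return {name: future.result() for name, future in futures.items()}


# Stand-in callables (assumptions); replace them with real client calls.
def fake_composer(prompt: str) -> str:
    return f"[composer-2] {prompt}"


def fake_sonnet(prompt: str) -> str:
    return f"[sonnet] {prompt}"


results = best_of_n(
    "rename the User model",
    {"composer-2": fake_composer, "sonnet": fake_sonnet},
)
for name, output in results.items():
    print(f"{name}: {output}")
```

Threads fit here because a model call spends almost all of its time waiting on the network. Scoring the outputs is deliberately left out, since picking the "best" result is the part that still needs your judgment.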

The comparison that matters most is not Cursor vs. Copilot on autocomplete — that race ended 18 months ago. It is Cursor vs. Claude Code on the question of what the primary interface for AI-assisted development looks like. Claude Code positions the terminal as the interface; Cursor 3 positions the Agents Window. These are not compatible philosophies. Claude Code users tend to be engineers who want control at the task level and distrust visual abstractions. Cursor 3 is betting on engineers who want throughput and are willing to trade granular control for it.

That bet is not obviously right. But it is a coherent one.

The Take

The real question Cursor 3 forces is not whether the features work — they do. It is whether senior engineers want to orchestrate agents or still prefer to direct them one at a time.

Orchestration means you define the task, set the constraints, and review outputs. Direction means you are in the loop on every decision. Most developers I know still want direction — they want to know what the agent is doing before it does it, not after it produces a diff. Cursor 3 is built for orchestration. The Agents Window, the cloud handoff, the parallel model comparison — all of it optimizes for throughput, not for granular oversight.

That workflow split is real, and Cursor is picking a side. For indie builders shipping fast with small teams, the orchestration model is genuinely compelling — run five agents in parallel, review outputs, merge the best. For tech leads responsible for code quality on a 15-person team, the calculus is different: more throughput also means more review surface, and agentic diffs are not always easy to audit.

Cursor 3 does not remove the single-agent workflow. The IDE pane and standard Chat/Composer remain accessible. The shift is in default framing, not in what is technically possible. You can run Cursor 3 exactly like Cursor 2 if you want to.

My read: Cursor 3 is the right product for the direction AI-assisted development is heading, and it is shipping features that GitHub Copilot and Claude Code have not matched. The cloud-local handoff alone is worth updating for. But the orchestration model works best when agents are reliable enough that you do not need to watch every step — and we are not there yet. Use the Agents Window for parallelizable, bounded tasks. Keep the IDE open for anything that requires judgment at each decision point.

Cursor just pulled ahead. The question is whether the rest of the ecosystem catches up before the orchestration model becomes table stakes — or whether Cursor has enough of a moat by then that it does not matter.

By dennis · Apr 5, 2026