Cursor agent mode goes GA — what it does, what it doesn't

Cursor moved agent mode out of preview this week. The feature has been in beta for about three months. After two days running it on three production codebases — one TypeScript monorepo, one Python ML stack, one mixed-language web app — here's the practical read.

What it actually does

Cursor's agent mode lets you give the editor a multi-step task and walk away. It plans, makes edits across files, runs the test suite, reads the failures, and iterates until tests pass or it gives up. Think of it as "Claude Code, but inside Cursor."
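The edit, test, retry cycle described above can be sketched in a few lines. This is a minimal illustration of the control flow only, not Cursor's implementation: `run_agent_loop`, `SuiteResult`, the callback signatures, and the `max_attempts` cutoff are all names and assumptions of ours.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SuiteResult:
    passed: bool
    failures: List[str]


def run_agent_loop(
    apply_edits: Callable[[List[str]], None],
    run_tests: Callable[[], SuiteResult],
    max_attempts: int = 5,
) -> bool:
    """Edit, test, and retry until the suite passes or attempts run out.

    apply_edits receives the failures from the previous run (empty on the
    first pass) and is expected to change the code in response.
    """
    failures: List[str] = []
    for _ in range(max_attempts):
        apply_edits(failures)
        result = run_tests()
        if result.passed:
            return True
        failures = result.failures
    return False  # attempts exhausted: the agent gives up


# Toy stand-ins: the "suite" passes once two rounds of edits have landed.
state = {"edits": 0}


def fake_edits(failures: List[str]) -> None:
    state["edits"] += 1


def fake_tests() -> SuiteResult:
    ok = state["edits"] >= 2
    return SuiteResult(passed=ok, failures=[] if ok else ["test_endpoint"])


succeeded = run_agent_loop(fake_edits, fake_tests)
```

The point of the sketch is the exit conditions: the loop ends on a green suite or on a hard attempt cap, which is the "iterates until tests pass or it gives up" behavior in practice.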

The new version added three things worth knowing about:

Persistent memory across sessions. The agent now remembers what it changed last time within the same project, so a long task interrupted on Friday picks up cleanly on Monday. This was the biggest gap in the preview.

Approval gates per file. You can require human approval before any edit lands in a particular file or directory. Useful for keeping it out of CI configs, secrets handling, and migration files.

Better tool-use logging. Every command it runs is now visible in a side panel with the output, so you can debug why it took a wrong turn without trawling through conversation.
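To make the approval-gate idea concrete, here is a rough sketch of how a path-based gate decision could work. It does not reproduce Cursor's configuration syntax, which we are not quoting here; the pattern list and the `needs_approval` helper are hypothetical and only illustrate the matching logic.

```python
from fnmatch import fnmatch
from typing import Iterable

# Hypothetical gate patterns; not Cursor's configuration format.
GATED_PATTERNS = [
    ".github/workflows/*",  # CI configs
    "migrations/*",         # migration files
    "*.env",                # secrets handling
    "package.json",         # root build config
    "*/package.json",       # per-package build configs in a monorepo
]


def needs_approval(path: str, patterns: Iterable[str] = GATED_PATTERNS) -> bool:
    """Return True if an edit to `path` should wait for a human."""
    normalized = path.replace("\\", "/")
    if normalized.startswith("./"):
        normalized = normalized[2:]
    # fnmatch's "*" also matches "/", so "migrations/*" covers nested files.
    return any(fnmatch(normalized, pattern) for pattern in patterns)
```

With these patterns, an edit to `src/routes/users.ts` goes straight through, while `.github/workflows/ci.yml` or `apps/web/package.json` would be held for review.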

What works

Three workflows we tested where it produced clean results without hand-holding:

"Add a new endpoint with full tests." From a one-line prompt, it scaffolded the route, the handler, the validation, the test file, ran the suite, and fixed two unrelated test failures it noticed along the way. ~14 minutes start to finish on a Node/TypeScript stack.
"Refactor this directory to use the new auth helper." 22 files, 80+ edits, all tests passing, no manual fixups. Saved roughly 90 minutes of busywork.
"Find every place where we still use the old logger and migrate." Did the search, did the migration, didn't break the test suite, opened a clean PR equivalent.

What still fails

Three patterns where it's noticeably worse than a competent human:

Cross-cutting refactors that touch the build system. It will quite happily modify your package.json, tsconfig.json, or vite.config.ts in ways that break things downstream. Use the per-file approval gate for these.
Tasks that require reading external API docs in detail. It approximates well enough for popular APIs but hallucinates field names on smaller third-party services.
Anything that involves database migrations. Just don't.
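For the hallucinated-field-name problem, one cheap guard is to compare the keys the agent wrote against the field names in the service's own documentation. The sketch below assumes you have copied that documented field list by hand; `unknown_fields` and the example field names are ours, not any real service's schema.

```python
import difflib
from typing import Dict, Iterable, List


def unknown_fields(
    payload: Dict[str, object], documented: Iterable[str]
) -> Dict[str, List[str]]:
    """Map each undocumented key in `payload` to similar documented names."""
    known = sorted(set(documented))
    report: Dict[str, List[str]] = {}
    for key in payload:
        if key not in known:
            # Suggest near-misses such as camelCase vs snake_case spellings.
            report[key] = difflib.get_close_matches(key, known, n=3, cutoff=0.6)
    return report


# Hypothetical documented fields for a small third-party payments API.
documented_fields = {"customer_id", "amount_cents", "currency"}

# A payload as the agent might write it, with one invented field name.
agent_payload = {"customerId": "c_1", "amount_cents": 500, "currency": "USD"}

report = unknown_fields(agent_payload, documented_fields)
```

An empty report does not prove the call is correct, but a non-empty one is a quick signal that the agent guessed at a name instead of reading the docs.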

How it compares to Claude Code

Practical differences after running both daily for two weeks:

Dimension	Cursor agent mode	Claude Code
Where it lives	Inside the editor	CLI, separate window
Speed on small tasks	Faster (model is local-context aware)	Slower (loads context per session)
Speed on long multi-file tasks	About equal	About equal
Quality on coding	Slightly behind	Slightly ahead
Quality on planning	Behind	Ahead
Cost	$20–$40/month per seat	Pay-per-token, unpredictable

The honest answer: if you live in your editor, Cursor's integration wins on friction. If you do heavy planning before you write code, Claude Code wins on planning quality. Many of the engineers we know run both.
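The cost row is easier to reason about as a break-even calculation: how many tokens per month would pay-per-token usage need to reach before it costs as much as a flat seat? The sketch below uses a placeholder rate of $10 per million tokens purely for illustration; that figure is an assumption of ours, not a published price, so substitute your own blended rate.

```python
def break_even_tokens(seat_price_usd: float, usd_per_million_tokens: float) -> float:
    """Monthly token volume at which pay-per-token cost equals the seat price."""
    if usd_per_million_tokens <= 0:
        raise ValueError("token price must be positive")
    return seat_price_usd / usd_per_million_tokens * 1_000_000


# Placeholder blended rate (assumption, not real pricing).
ASSUMED_USD_PER_MILLION = 10.0

pro_break_even = break_even_tokens(20, ASSUMED_USD_PER_MILLION)
business_break_even = break_even_tokens(40, ASSUMED_USD_PER_MILLION)
```

Under that assumed rate, the $20 plan breaks even at 2 million tokens a month and the $40 plan at 4 million. Below the break-even volume, pay-per-token is cheaper; above it, the flat seat is.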

What we'd do

Turn it on for a week. Use it for the boring 70% of your work: adding endpoints, writing tests, doing mechanical code migrations like the logger swap above (database migrations stay off the list). Approve edits one file at a time until you trust it on a given codebase, then loosen the gates.

Don't replace git diff review. Whatever it ships, you sign for.
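One way to keep diff review honest is to summarize what the agent touched before reading the full diff. The sketch below parses the output of `git diff --numstat` (tab-separated added, deleted, path, with `-` for binary files) and flags the build configs named earlier. The helper names and the `main` base branch are our assumptions, and renamed files are not handled specially.

```python
import subprocess
from typing import List, NamedTuple, Optional

# Build configs called out earlier in this piece.
PROTECTED = {"package.json", "tsconfig.json", "vite.config.ts"}


class FileChange(NamedTuple):
    path: str
    added: Optional[int]    # None for binary files
    deleted: Optional[int]  # None for binary files


def parse_numstat(output: str) -> List[FileChange]:
    """Parse `git diff --numstat` output into FileChange records."""
    changes: List[FileChange] = []
    for line in output.splitlines():
        if not line.strip():
            continue
        added, deleted, path = line.split("\t", 2)
        changes.append(
            FileChange(
                path,
                None if added == "-" else int(added),
                None if deleted == "-" else int(deleted),
            )
        )
    return changes


def protected_changes(changes: List[FileChange]) -> List[FileChange]:
    """Return the changes whose file name is in PROTECTED."""
    return [c for c in changes if c.path.rsplit("/", 1)[-1] in PROTECTED]


def numstat_against(base: str = "main") -> str:
    """Run `git diff --numstat <base>...HEAD` in the current repository."""
    completed = subprocess.run(
        ["git", "diff", "--numstat", f"{base}...HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout
```

In a repository you would call `protected_changes(parse_numstat(numstat_against()))` and read those files first. It is a triage aid, not a substitute for reading the diff.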

Frequently asked questions

Is Cursor agent mode free?

No. It's bundled with the Cursor Pro plan ($20/month) and Business plan ($40/seat/month). The free tier doesn't include agent mode at GA.

How does Cursor agent mode compare to Claude Code?

They're close on coding output, with Claude Code slightly ahead. Cursor wins on in-editor friction (it lives where you already are). Claude Code wins on planning quality before any code is written. Many engineers run both.

Can it touch package.json or build configs safely?

Not without per-file approval gates. It will modify build configs in ways that break things downstream. Lock those files and require human approval before edits land.

Does it run my test suite?

Yes. The agent runs tests, reads failures, and iterates until they pass or it gives up. The new tool-use logging panel shows each command and output for debugging.

Sources

1. Cursor: Cursor agent mode is now generally available · Apr 23, 2026
2. Cursor changelog: April 2026 release notes · Apr 23, 2026