[release] 5 min · May 7, 2026

Claude Dreaming — Agents That Rewrite Their Own Prompts

Anthropic shipped dreaming for Claude Managed Agents — a scheduled process where agents review past work and rewrite their own memory files. Here is why that matters.

#ai-agents #anthropic #managed-agents #agent-memory #platform-lock-in

Anthropic announced “dreaming” for Claude Managed Agents at the Code with Claude event on May 6. It is a scheduled, between-session process where agents review their own past runs, identify recurring patterns and mistakes, and rewrite the memory files they operate on — either autonomously or with human approval before commit. This is the first production feature from a major AI lab where agents can modify their own instructions without a developer touching a prompt.

TL;DR

  • What: Claude Managed Agents can now run scheduled “dreaming” processes that review past sessions and rewrite their own memory files
  • Lock-in: Dreaming is exclusive to Claude Managed Agents (research preview, access by request) — not available via the Messages API, Claude Code, or any self-hosted setup
  • Evidence: Harvey’s legal pilot saw completion rates rise approximately 6x after enabling dreaming for file-format and tool-pattern recall
  • Risk: Research preview status means breaking changes with as little as one week’s notice — not appropriate for critical production workflows yet

What Happened

Until now, agent memory in Claude Managed Agents worked within tasks and sessions — the agent could recall context during a conversation, persist information across turns, and reference stored facts. Dreaming adds the missing layer: processing and restructuring that memory between sessions. On a schedule you configure, the agent reviews all recent runs across multiple agents and sessions, surfaces recurring errors, identifies converging workflows and cross-agent preferences, and decides which stored information has become irrelevant or contradictory.

The critical detail is the control model. Anthropic built two modes: fully automated (the agent rewrites memory files and they take effect on the next run) and human-in-the-loop (the agent proposes changes; a developer reviews and approves them before they commit). The architecture works across agents, not just within a single thread — if you have five agents handling different parts of a workflow, dreaming can identify patterns that span all five.
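
A sketch of what that configuration might look like (Anthropic has not published a schema, so the field names are illustrative):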

# Conceptual dreaming configuration
dreaming:
  schedule: "0 3 * * *"        # Run nightly at 3 AM
  scope: "all-agents"          # Cross-agent memory analysis
  approval: "manual"           # Options: automatic | manual
  retention: "90d"             # How far back to analyze

The companion “outcomes” feature, now in public beta alongside multi-agent orchestration and webhooks, provides the evaluation layer. In Anthropic’s internal benchmarks, outcomes improved task success by up to 10 percentage points over standard prompting loops, with file generation quality increasing 8.4% for docx and 10.1% for pptx files. Outcomes gives agents a success criterion; dreaming gives them the ability to learn from failure against that criterion.
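
Anthropic has not published an outcomes schema, but the concept reduces to a success predicate evaluated against each run. A minimal sketch, with the run-record fields assumed:

# Hypothetical sketch, not Anthropic's outcomes API: an outcome modeled as a
# success predicate over a run record whose fields are assumed here.
def docx_delivered(run: dict) -> bool:
    """Success: the run produced a .docx artifact and the user kept it."""
    produced = any(a.endswith(".docx") for a in run.get("artifacts", []))
    return produced and not run.get("user_retried", False)

# Dreaming then has a clean signal to learn from: mine the runs where the
# predicate returned False for the patterns behind the failures.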

Dreaming is in research preview. Anthropic explicitly warns it may ship breaking changes during this window, with at least one week’s notice to migrate. Do not deploy this for sensitive or critical production workflows in its current state.

Why This Matters

This is not a quality-of-life feature. It is a fundamentally new category of agent behavior, and the implications run in two directions — one exciting, one concerning.

The capability argument is real. Long-running agent deployments have always hit the same wall: agents make the same mistakes across sessions because each session starts from a static prompt. Human operators notice patterns and manually update system prompts — “always use PDF format for this client,” “the API returns dates in UTC, not local time,” “when generating pptx files, use the template from drive, not the default.” This manual prompt maintenance is one of the biggest hidden costs in production agent systems. Dreaming automates exactly that loop. Harvey’s legal pilot is the clearest proof point: agents handling file-format workarounds and tool-specific patterns saw completion rates rise approximately 6x after dreaming was enabled. That is not a marginal improvement. That is the difference between a system that requires constant human babysitting and one that converges on correct behavior over time.
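
To make that loop concrete, here is a hypothetical sketch of the transformation: a recurring failure signature on one side, the durable instruction a dreaming pass might write on the other. Both shapes are assumptions, not Anthropic's formats.

# Illustrative only: the pattern shape and memory format are assumptions,
# not Anthropic's schema. Dreaming turns a recurring failure into an instruction.
pattern = {
    "signature": "generate_pptx::default template rejected",
    "occurrences": 7,   # the same mistake, observed across seven sessions
}
memory_entry = (
    "When generating pptx files for this workspace, load the shared team "
    "template instead of the default theme, which has repeatedly failed review."
)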

The lock-in argument is equally real. Dreaming is available only inside Claude Managed Agents — not via the Messages API, not in Claude Code, not anywhere you can self-host or wrap in your own infrastructure. Outcomes, multi-agent orchestration, and webhooks share the same constraint: public beta, but only within Anthropic’s managed runtime. If your agents need to remember and improve, Anthropic now owns that loop. You cannot take your dreaming-optimized memory files and run them on a different model. You cannot replicate the between-session analysis process outside their platform. The value accumulates inside their system, and switching costs grow with every dreaming cycle.

This is the same pattern we flagged when Managed Agents launched — but dreaming makes the gravity well deeper. Memory that persists within sessions is portable (you can export it, restructure it, replay it). Memory that an agent has autonomously restructured based on cross-session pattern analysis is not portable in any practical sense, because the restructuring logic lives inside Anthropic’s infrastructure and you have no visibility into the intermediate reasoning.

If you enable dreaming, start with manual approval mode. Review every proposed memory change for the first 2-3 weeks to build intuition about what the agent considers a “pattern.” Automated mode is a privilege you grant after you understand the agent’s judgment.

The comparison to existing approaches matters here. Frameworks like LangGraph or custom orchestration layers give you agent memory with full control — you own the persistence layer, you write the retrieval logic, you decide what gets updated and when. The tradeoff is obvious: you build and maintain all of it yourself. Dreaming is Anthropic saying “we will handle that for you” — and the price is that the improvement loop becomes opaque and non-portable. For teams that want rapid deployment without infrastructure investment, that is a reasonable trade. For teams building differentiated agent systems where the memory is the moat, handing that loop to a platform vendor is a strategic mistake.

The Take

I’d treat dreaming like a database migration that runs automatically at 3 AM: genuinely useful in the right hands, catastrophic if you have not thought through who owns the resulting state. The Harvey numbers are impressive enough that dismissing this feature would be intellectually dishonest — 6x completion rate improvement means the agents are meaningfully better at their jobs after dreaming cycles. But the architecture of this feature tells you exactly where Anthropic is headed: they want to own the full agent lifecycle, from deployment to evaluation to self-improvement, inside a managed runtime you cannot leave without losing accumulated intelligence.

My recommendation splits cleanly by team type. If you are building internal tools where the agent’s job is well-defined and switching models is not a strategic concern, enable dreaming in manual approval mode and measure the impact over 30 days. The productivity gain is likely real. If you are building a product where agent intelligence is your competitive advantage, do not let a vendor own your improvement loop. Build your own between-session analysis — it is not that hard once you have structured logs — and keep the accumulated knowledge in infrastructure you control.
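
A minimal sketch of that DIY loop, assuming your orchestration layer already writes one JSON record per run; the field names (tool, status, error) are placeholders for whatever you actually log:

# A minimal sketch of a self-owned dreaming pass, not Anthropic's implementation.
# Assumes structured run logs as JSONL; the log location and record schema
# used here are assumptions, not a standard.
import json
from collections import Counter
from pathlib import Path

LOG_DIR = Path("logs/agent-runs")   # hypothetical location of JSONL run logs
MIN_RECURRENCES = 3                 # ignore one-off failures

def load_runs(log_dir: Path) -> list[dict]:
    runs = []
    for path in sorted(log_dir.glob("*.jsonl")):
        for line in path.read_text().splitlines():
            if line.strip():
                runs.append(json.loads(line))
    return runs

def recurring_errors(runs: list[dict]) -> list[tuple[str, int]]:
    # Coarse signature (tool + error text) so the same mistake made in
    # different sessions collapses into one pattern.
    signatures = Counter(
        f"{r.get('tool', 'unknown')}::{r.get('error', '')}"
        for r in runs
        if r.get("status") == "failed"
    )
    return [(s, n) for s, n in signatures.most_common() if n >= MIN_RECURRENCES]

def propose_memory_updates(patterns: list[tuple[str, int]]) -> list[str]:
    # Emit proposals for human review instead of rewriting memory files
    # directly: the equivalent of manual-approval mode.
    proposals = []
    for sig, count in patterns:
        tool, error = sig.split("::", 1)
        proposals.append(
            f"PROPOSED ({count} occurrences): add guidance for '{tool}': {error}"
        )
    return proposals

if __name__ == "__main__":
    for proposal in propose_memory_updates(recurring_errors(load_runs(LOG_DIR))):
        print(proposal)

The point is not the thirty lines; it is that the proposals and the memory files they feed live in your own repo, reviewable in a normal pull request, portable to any model.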

The real tell is what Anthropic did not ship: they did not expose the dreaming process as an API you can call on your own memory stores. They shipped it as a managed service feature. That is not an oversight. That is a business model.