
How Claude's Memory Actually Works (And Why CLAUDE.md Matters)

Claude and ChatGPT handle memory differently. Understanding those tradeoffs explains why CLAUDE.md files work — and why some context management strategies fail.

Working with Claude Code over the past months, I’ve developed the practice of using CLAUDE.md files to maintain project context across sessions. The practice works, but I didn’t fully understand why until I started digging into how Claude’s memory architecture actually functions. Two recent analyses — Manthan Gupta’s breakdown of Claude’s memory system and Decode Claude’s look at session memory — filled in the picture.

Two philosophies of AI memory

ChatGPT and Claude take fundamentally different approaches to remembering context.

ChatGPT pre-computes summaries of what it knows about you and injects those summaries into every conversation. The memory is always present, automatically. You don’t have to ask for it or remember that it exists. The tradeoff is depth — summaries are compressed, and nuance gets lost.

Claude takes the opposite approach. It has memory tools available, but uses them on demand rather than automatically. You can ask Claude to search past conversations or retrieve recent chats, and it will — but it has to decide to invoke those tools. If it doesn’t think to look, relevant context stays buried.

The architectural difference matters: Claude sacrifices automatic continuity for on-demand depth. ChatGPT sacrifices depth for automatic continuity.

Neither approach is universally better. They optimize for different use cases. But the distinction explains why certain practices work with one system and not the other.

Claude’s four-layer context structure

When Claude processes your message, it sees information organized in layers:

Layer 0: System prompt. Static instructions, tool definitions, safety constraints. This is set by the application you’re using — Claude.ai, Claude Code, or whatever interface you’re working in. You don’t control this directly.

Layer 1: User memories. Distilled facts about you, stored in XML format. Things like your name, preferences, past work mentioned in conversations. These are derived from your history but abstracted into compact facts.

Layer 2: Conversation history. The rolling window of recent messages. Claude Code maintains around 190,000 tokens of context here. As conversations grow, older content gets compacted or dropped to make room for new exchanges.

Layer 3: Current message. What you just said.

The layers interact. Claude sees all of them when processing your request, but the reliability of each layer differs. Layer 0 is always present and accurate — it’s literally the instructions. Layer 3 is what you just typed — also accurate. Layers 1 and 2 are where things get interesting.

The retrieval problem

Claude’s memory tools aren’t automatic. It has two primary ways to access past context:

Conversation search — searches past conversations by topic or keyword, returning up to five results.

Recent chats — retrieves conversations by time, letting Claude see what you discussed recently.

The catch: Claude has to decide to use these tools. If your current question doesn’t seem to require historical context, Claude might not look. And if it doesn’t look, it won’t know what it doesn’t know.

This explains a frustrating pattern I’ve observed. You have a long-running project with Claude. You’ve discussed architecture decisions, established patterns, made commitments about how things should work. Then in a new session, Claude seems to have forgotten everything. It wasn’t being forgetful — it just didn’t retrieve the relevant context because it didn’t know it needed to.

Why CLAUDE.md files solve this

CLAUDE.md files sidestep the retrieval problem entirely. They’re not memories that Claude might or might not look up. They’re files in your project directory that Claude reads at the start of every session.

When Claude Code starts up, it reads CLAUDE.md files automatically. That content goes directly into the context window, every time. No selective retrieval. No hoping Claude remembers to search. The information is just there.

This is a fundamentally different model. Instead of trusting Claude’s judgment about what context to retrieve, you explicitly declare what context matters. The file is always present because you’ve put it in the project, not because Claude decided to fetch it.

I maintain CLAUDE.md files with several categories of information:

Project-specific context — what this codebase does, its architecture patterns, key decisions already made.

Working agreements — how I want Claude to approach problems in this project, what patterns to follow, what to avoid.

Current state — what I’m working on, what’s in progress, what’s blocked.

All of this would theoretically be available in conversation history. But conversation history gets compacted, requires retrieval decisions, and ages out over time. CLAUDE.md files persist.
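Putting those three categories together, a minimal CLAUDE.md might look like the sketch below. The project details are invented for illustration; the structure is what matters:

```markdown
# CLAUDE.md

## Project context
- Invoice-processing API in TypeScript (Express + Postgres)
- Layered architecture: handlers → services → repositories
- Decision: raw SQL via the pg client, no ORM (agreed in the 2024-03 review)

## Working agreements
- Prefer small, focused diffs; explain tradeoffs before large refactors
- Follow the existing error-handling pattern in src/lib/errors.ts
- Do not add new dependencies without asking

## Current state
- In progress: migrating auth middleware to session tokens
- Blocked: waiting on schema-change approval for the invoices table
```

Even a file this short changes session startup: the decisions and agreements load automatically instead of depending on retrieval.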

Session memory — where it’s heading

Claude Code is developing explicit session memory capabilities. The pattern emerging looks like this: after significant work in a session, Claude extracts key information into structured notes stored in markdown files. These notes follow a template with sections for current state, task specifications, relevant files, workflow patterns, errors encountered, and key learnings.

The storage location: ~/.claude/session-memory/[session-id].md

The timing isn’t arbitrary. Extraction triggers after around 10,000 tokens, updates every 5,000 tokens after that, and also fires every three tool calls. The system tries to capture meaningful work without constantly interrupting.

What makes this interesting is the format choice. Plain markdown files, stored on your local disk, editable by you. Not a black box database. Not a proprietary format. Just text files you can read, modify, and delete.

The quote that captures this philosophy: “Read markdown, write markdown. The hack that enables infinite context.”
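Based on the template sections described above, a session note might look like the following. The contents are invented for illustration, and the exact format Claude Code writes may differ:

```markdown
# Session memory: [session-id]

## Current state
Refactoring payment webhook handlers; two of five endpoints migrated.

## Task specification
Move webhook validation into shared middleware without changing response codes.

## Relevant files
- src/webhooks/stripe.ts
- src/middleware/validate.ts

## Workflow patterns
Run the webhook test suite after each endpoint migration.

## Errors encountered
Signature check failed when body parsing ran before validation; fixed by
reordering middleware.

## Key learnings
The raw request body must be preserved for signature verification.
```

Because it is plain markdown on disk, you can correct a wrong "key learning" with any text editor before the next session reads it.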

Implications for synthesis coding

Understanding Claude’s memory architecture changes how I approach projects.

Front-load context in files, not conversation. Information in CLAUDE.md is more reliable than information discussed three conversations ago. If something matters across sessions, write it down.

Treat CONTEXT.md as system state. I maintain project state in CONTEXT.md files that get updated as work progresses. Current phase, recent decisions, active blockers, next steps. Claude reads this at session start and updates it during work. The file becomes the source of truth.

Don’t trust retrieval for critical context. If Claude must know something to work correctly, make sure it’s in a file that gets loaded automatically. Don’t assume it will search conversation history.

Design for session boundaries. Sessions end. Context compacts. Work spans multiple sessions. Build workflows that survive session transitions rather than assuming continuous memory.

This matches what the OpenAI Sora team discovered with their AGENT.md files. Persistent context in files beats hoping the AI will remember what you discussed. Different tools, same pattern.

The mental model shift

The instinct when working with AI is to treat it as a single continuous entity that knows what you’ve told it. That model breaks across sessions, across context windows, across retrieval decisions. Claude today doesn’t automatically know what you told Claude yesterday.

A better mental model: each session is a new colleague who has access to some notes about past work. The quality of those notes determines how much context survives. If you want context to persist, write it down somewhere that gets read automatically.

CLAUDE.md files are those notes. Session memory is automation for generating those notes. Both patterns solve the same underlying problem — bridging the gap between conversations that feel continuous and an architecture that treats each context window as potentially independent.

Practical takeaways

If you’re building workflows with Claude:

  1. Create a CLAUDE.md in every project directory. Even a brief file with project overview and key patterns beats nothing.

  2. Maintain a CONTEXT.md for active work. Update it when state changes. Read it at session start. Let it survive across sessions.

  3. Be explicit about what matters. Don’t assume Claude remembers. State important context, even if you’ve mentioned it before.

  4. Design for reliability, not intelligence. Yes, Claude has memory tools. But tools that must be invoked are less reliable than files that are always loaded.

  5. Check session memory if you’re on Claude Code. The ~/.claude/session-memory/ directory shows what Claude has extracted from past sessions. You can edit those files.
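For takeaway 1, a starter CLAUDE.md can be bootstrapped from the shell in a few seconds. The skeleton below uses placeholder section contents to fill in for your project:

```shell
# Create a minimal CLAUDE.md skeleton in the current project directory
cat > CLAUDE.md <<'EOF'
# CLAUDE.md

## Project context
(What this codebase does, key architecture decisions already made)

## Working agreements
(How Claude should approach problems in this project)

## Current state
(What is in progress, what is blocked)
EOF
```

A skeleton like this is deliberately low-effort: it exists so there is always somewhere obvious to write things down when a decision is made.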

The architecture isn’t magic. It’s engineering with tradeoffs. Understanding those tradeoffs helps you work with the system rather than against it.


This article is part of the synthesis coding series. For related content on managing AI context, see Synthesis Coding with Claude Code: Technical Implementation.


Rajiv Pant is President of Flatiron Software and Snapshot AI, where he leads organizational growth and AI innovation. He is former Chief Product & Technology Officer at The Wall Street Journal, The New York Times, and Hearst Magazines. Earlier in his career, he headed technology for Condé Nast’s brands including Reddit. Rajiv coined the terms “synthesis engineering” and “synthesis coding” to describe the systematic integration of human expertise with AI capabilities in professional software development. Connect with him on LinkedIn or read more at rajiv.com.

Originally published on rajiv.com