Why Does Your AI Coding Assistant Forget Your Codebase Every Session?
95% of developers use AI coding tools weekly. Copilot advertises 400K tokens but caps usable context at 128K. Cursor starts strong, then forgets what it wrote five minutes ago. Every session starts from scratch.
AI coding assistants are session-based. They understand your code within the current window but carry nothing forward. Re-explaining architecture decisions, coding patterns, and project context every session is the norm. The models aren't bad. Nothing underneath them persists structured project state across sessions. A bigger context window won't fix that. A persistence layer will.
You open your editor. You start a new session with Copilot, Cursor, or Claude Code. You type: "Continue working on the auth refactor from yesterday."
The assistant has no idea what you're talking about. It doesn't know about the refactor. It doesn't know your auth architecture. It doesn't know that you moved from JWT to session tokens last week, or that the tests in /payments broke because of the change, or that you prefer TypeScript strict mode.
So you re-explain. Again. Every single session.
95% of developers use AI tools at least weekly, and 51% use them daily. Every one of them re-explains their codebase from scratch every time they start a new conversation.
What's Actually Happening With Copilot's Context Window?
GitHub Copilot's context window is the most documented example of the gap between advertised and usable capacity.
The API reports context_window values up to 400K tokens, but max_prompt_tokens is capped at 128K. That means 68% of the advertised window is inaccessible for input. The rest is reserved for output generation, internal reasoning, and safety scaffolding.
On top of that, up to 40% of the usable window is labeled "Reserved Output," even with minimal prompts. Developers report that with models like Opus 4.6, the reserved space is consumed by hidden reasoning tokens before visible output is produced.
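The arithmetic above can be sketched as a small token-budget calculation. The field names mirror the API values mentioned (`context_window`, `max_prompt_tokens`); the 40% "Reserved Output" share is the reported figure, and the `effectiveInputTokens` helper is hypothetical, not a real Copilot API.

```typescript
// Hypothetical token-budget accounting for an advertised-vs-usable window.
interface ContextBudget {
  contextWindow: number;       // advertised capacity (e.g. 400K)
  maxPromptTokens: number;     // actual input cap (e.g. 128K)
  reservedOutputShare: number; // fraction of the usable window held back
}

function effectiveInputTokens(b: ContextBudget): number {
  const usable = Math.min(b.maxPromptTokens, b.contextWindow);
  return Math.floor(usable * (1 - b.reservedOutputShare));
}

const copilot: ContextBudget = {
  contextWindow: 400_000,
  maxPromptTokens: 128_000,
  reservedOutputShare: 0.4,
};

// Share of the advertised window that prompt input can never touch:
const inaccessible = 1 - copilot.maxPromptTokens / copilot.contextWindow; // ≈ 0.68

// Tokens actually available for your code and conversation:
const effective = effectiveInputTokens(copilot); // 76,800
```

Under these reported numbers, less than a fifth of the advertised 400K window is left for your actual prompt.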
The result: developers hit compaction frequently. The assistant compresses earlier context to make room for new input. The architecture decisions you explained at the start of the session get summarized into a lossy paragraph. By message 20, the assistant is working from a degraded version of what you told it.
Copilot's acceptance rate sits at 35-40%. The biggest frustration, cited by 66% of developers: dealing with "AI solutions that are almost right, but not quite."
Why Does Cursor Forget What It Just Wrote?
Cursor advertises a 200K token context window. In practice, users report degraded understanding at 70-90% utilization.
The specific complaints from the developer community:
- Cursor makes mistakes around orchestration. "The AI will just straight forget what it's doing." It starts strong, then suggests changes that conflict with code it wrote minutes earlier. Output is inconsistent across sessions, partly because Cursor switches models behind the scenes.
- As codebases grow, logic breaks and functions stop working because earlier architectural context has fallen out of the window.
These aren't bugs. They're symptoms of the same architectural problem: the assistant's understanding of your project exists only within the current context window. When that window fills up, earlier understanding gets compressed or dropped.
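The compression-or-drop behavior can be sketched as a toy compaction loop. All names here are hypothetical; a real assistant would ask the model to summarize the evicted messages, which is exactly where the loss happens.

```typescript
// Toy sketch of context-window compaction: when the token budget
// overflows, the oldest messages are collapsed into a short lossy summary.
interface Message {
  role: string;
  text: string;
  tokens: number;
}

function compact(history: Message[], budget: number): Message[] {
  let total = history.reduce((n, m) => n + m.tokens, 0);
  const evicted: Message[] = [];
  const kept = [...history];
  while (total > budget && kept.length > 1) {
    const oldest = kept.shift()!; // evict from the front of the window
    evicted.push(oldest);
    total -= oldest.tokens;
  }
  if (evicted.length === 0) return kept;
  // The early messages survive only as a compressed placeholder.
  const summary: Message = {
    role: "system",
    text: `[summary of ${evicted.length} earlier messages]`,
    tokens: 20,
  };
  return [summary, ...kept];
}
```

Run this over a long session and the architecture discussion from message 1 becomes a one-line summary by message 20, which matches the degradation developers describe.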
Why Can't a Bigger Context Window Fix This?
The instinct is always "make it bigger." Context windows have grown from 4K to 32K to 128K to 200K to 1M tokens. The problem persists for three reasons:
- Performance degrades with length. Models show a U-shaped attention curve: they attend to the beginning and end of context but lose track of information in the middle. A 200K window doesn't mean 200K of equally useful context.

- Cost scales linearly. Every token in the context window costs inference compute. A developer working for 8 hours generates far more than 200K tokens of meaningful project context. You can't economically keep everything in the window.

- The window still resets. Close the tab, start a new session, switch branches, and everything in the window is gone. A bigger window makes individual sessions longer. It doesn't solve the cross-session problem.
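The linear-cost point is easy to make concrete with back-of-envelope arithmetic. The price below is an assumption for illustration, not any vendor's actual rate.

```typescript
// Back-of-envelope cost of keeping a full context window in every request.
const PRICE_PER_INPUT_TOKEN = 3 / 1_000_000; // assumed $3 per million input tokens

function requestCost(contextTokens: number): number {
  return contextTokens * PRICE_PER_INPUT_TOKEN;
}

// A full 200K-token window, resent on every request:
const perRequest = requestCost(200_000); // ≈ $0.60

// A working session of 50 requests at that size:
const perSession = perRequest * 50; // ≈ $30.00
```

Doubling the window doubles this bill, because the whole context is reprocessed on each call. That is why "just keep everything in the window" doesn't survive contact with economics.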
No structured representation of your project state exists outside the window. That's the real issue.
What Would an AI Coding Assistant With Continuity Look Like?
You open your editor. Before you type anything, the assistant already knows:
- You're working on a Next.js app with TypeScript strict mode
- Yesterday you refactored the auth module from JWT to session tokens
- Tests in `/payments/checkout.test.ts` are failing because they still reference the old JWT validation
- You prefer named exports, and your team uses Tailwind with a custom design system
- The PR you're working on is `feature/session-auth`, branched from `main` at commit `a3f2b1c`
The assistant didn't search your files. A layer underneath has been maintaining structured traces of your project's evolving state: decisions made, patterns established, what changed and when, what's broken and why.
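One possible shape for such traces is sketched below. None of these type or field names come from a real tool; they just restate the examples above as queryable records.

```typescript
// Hypothetical structured traces of a project's evolving state.
type TraceKind = "decision" | "refactor" | "breakage" | "preference";

interface ProjectTrace {
  kind: TraceKind;
  summary: string;
  branch?: string;   // omitted for project-wide facts like preferences
  timestamp: string; // ISO 8601
}

const traces: ProjectTrace[] = [
  {
    kind: "refactor",
    summary: "auth module moved from JWT to session tokens",
    branch: "feature/session-auth",
    timestamp: "2025-01-14T17:02:00Z",
  },
  {
    kind: "breakage",
    summary: "/payments/checkout.test.ts still references old JWT validation",
    branch: "feature/session-auth",
    timestamp: "2025-01-14T17:30:00Z",
  },
  {
    kind: "preference",
    summary: "named exports; TypeScript strict mode",
    timestamp: "2025-01-10T09:00:00Z",
  },
];

// Reconstructing session context becomes a deterministic query, not a model call:
function contextFor(branch: string): ProjectTrace[] {
  return traces.filter((t) => !t.branch || t.branch === branch);
}
```

Because branch-scoped traces carry a `branch` field, switching branches selects a different slice of the same store instead of wiping everything.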
| | Current coding assistants | Coding assistants with continuity |
|---|---|---|
| New session | Re-explain everything | Picks up where yesterday left off |
| Architecture decisions | Forgotten after compaction | Persisted as structured traces |
| After a refactor | Suggests old patterns that conflict | Knows what changed and adapts |
| Cross-file context | Limited to what fits in window | Maintains project-wide state |
| When you switch branches | Loses all context | Reconstructs branch-specific state |
Why Aren't Coding Tool Companies Building This?
They're building in the other direction: bigger windows, better retrieval, smarter indexing. These are all read-path improvements, better ways to pull relevant code into the context window at query time.
The missing piece is the write path. When you make an architecture decision, refactor a module, or establish a pattern, that understanding should be decomposed and stored in structured form. It should persist across sessions, across branches, across tools.
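A minimal write-path hook might look like the sketch below. The API is entirely hypothetical; the point is the direction of the arrow: events are decomposed into structured records at the moment they happen, rather than reconstructed from raw code at query time.

```typescript
// Hypothetical write-path: each significant event becomes a structured,
// append-only record, so the "why" survives session resets.
interface WriteEvent {
  what: string;   // e.g. "renamed validateJwt -> validateSession"
  why?: string;   // the rationale, captured while it's still known
  files: string[];
}

interface StoredRecord extends WriteEvent {
  seq: number;
  at: string; // ISO 8601 write time
}

class TraceLog {
  private records: StoredRecord[] = [];

  append(e: WriteEvent): StoredRecord {
    const rec: StoredRecord = {
      ...e,
      seq: this.records.length,
      at: new Date().toISOString(),
    };
    this.records.push(rec); // append-only: history is never compacted away
    return rec;
  }

  // The read path replays structured state instead of re-deriving it:
  replay(): StoredRecord[] {
    return [...this.records];
  }
}
```

Retrieval-side indexing answers "what code exists"; an append-only log like this is the kind of structure that could answer "what changed, when, and why."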
Current tools index your codebase for retrieval. They don't maintain a structured model of your project's evolving state. The code is the source of truth for what exists. But the why, the when, the what changed, the what's currently broken: none of that is captured anywhere.
This is the same missing layer that affects every AI vertical. Companions forget your story. Chatbots make you repeat yourself. Agents can't maintain state. RAG retrieves the wrong chunks. Same architecture gap, different surface.
What I Built
At Kenotic Labs, I built a write-path-first deterministic architecture called DTCM (Decomposed Trace Convergence Memory). Every interaction is decomposed into structured traces at write time. At read time, the system reconstructs situational context deterministically, not probabilistically.
I tested it against ATANT, the first open evaluation framework for AI continuity. 250 narrative stories. 1,835 verification questions. 100% accuracy in isolated mode. 96% at 250-story cumulative scale.
Your coding assistant shouldn't need you to re-explain your codebase every morning. This is an infrastructure problem, not a model problem.
Follow the research at kenoticlabs.com
Samuel Tanguturi is the founder of Kenotic Labs, building the continuity layer for AI systems. ATANT v1.0, the first open evaluation framework for AI continuity, is available on GitHub.