If you’re using AI for anything more than quick questions, you’ve hit this. You’re thirty minutes into a working session. You’ve explained your situation, shared your constraints, built up real momentum. The AI gets it.
Then it doesn’t. It repeats itself. It misses something you told it clearly ten minutes ago. The responses go generic – careful, hedged, less useful. You start re-explaining things and wondering what went wrong.
I run long AI sessions almost daily – building protocols, developing strategy, working through problems that take real back-and-forth. I’ve watched this happen more times than I can count. And for a long time, I didn’t know why. I just assumed I was doing something wrong, or the AI was having a bad day.
Turns out it’s neither. It’s a structural problem, and once you understand it, you can work around it.
Why This Happens
Every AI conversation has a fixed amount of working memory – a “context window.” Everything goes in there: your messages, the AI’s responses, documents you’ve shared, background instructions you can’t see.
The window has a limit. But that’s not really the problem.
The problem is that well before you reach the limit, the AI starts losing track. Research from Chroma in mid-2025 found that once roughly 40% of the window is in use, performance drops significantly. The AI attends best to what’s at the beginning and end. Stuff in the middle gets overlooked.
So your conversation technically fits. But the AI quietly stopped using all of it a while ago. That’s why you notice it getting worse – not because it ran out of space, but because its attention spread too thin.
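If you want a feel for where that threshold sits, here’s a rough back-of-envelope check – a minimal sketch in Python, assuming the common heuristic of about four characters per token and a 200,000-token window. Both numbers are assumptions; swap in your model’s real limit and a proper tokenizer if you need precision.

```python
# Rough estimate of how much of the context window a conversation is using.
# Assumes ~4 chars/token and a 200,000-token window - both are assumptions;
# substitute your model's actual limit and a real tokenizer (e.g. tiktoken)
# for accuracy.

CHARS_PER_TOKEN = 4         # common English-text heuristic, not exact
WINDOW_TOKENS = 200_000     # assumed window size; varies by model
ATTENTION_CLIFF = 0.40      # the ~40% degradation point reported by Chroma

def context_usage(messages: list[str]) -> float:
    """Estimated fraction of the window the conversation fills."""
    total_chars = sum(len(m) for m in messages)
    return (total_chars / CHARS_PER_TOKEN) / WINDOW_TOKENS

conversation = ["...your messages, the AI's replies, shared documents..."]
usage = context_usage(conversation)
if usage > ATTENTION_CLIFF:
    print(f"~{usage:.0%} of the window used - time for a clean handoff")
```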
If that’s the explanation you needed, skip ahead to What Actually Works. The next section covers what you might have already tried.
What You’ve Probably Tried
Memory features
Claude and ChatGPT both remember things about you across conversations now – your name, your role, preferences. That’s useful, but it doesn’t solve this.
Memory stores facts. It doesn’t preserve the conversation you built – the reasoning, the decisions, the nuance you worked through together. When you start a new chat, you get the facts. The working session is gone.
Projects
If you use projects (or custom GPTs, or similar), you’ve got persistent background – uploaded files, standing instructions. That gives new conversations a head start.
But a long chat inside a project degrades exactly the same way. The window still fills. The middle still gets lost. Projects solve background context. They don’t solve session state.
What Actually Works
Three things changed my workflow once I understood what was happening.
Start fresh sooner
Don’t wait for the conversation to visibly degrade. By the time you notice repetition or generic responses, the AI’s been struggling for a while. Shorter conversations with clean handoffs beat long conversations that slowly fall apart.
But that creates the obvious problem: the new chat doesn’t know what the old one knew.
Know what compaction looks like
If you’ve used Claude for long sessions, you may have seen a message: “this conversation has been compacted to free up space.” That’s the AI automatically summarising earlier parts of the conversation to make room.
It’s lossy. Details get dropped. Nuance gets flattened. You might not notice right away, but the conversation has quietly forgotten things you discussed.
When you see that message, the conversation is past its useful life. The better approach is to manage the transition yourself, before the AI does it for you.
Use context capsules
This is the technique that changed things for me.
A context capsule is a compressed summary of what matters from a conversation – key decisions, constraints, current state of the work – packaged so you can paste it into a new chat and pick up where you left off.
Here’s how it works. You’re deep in a working session – say you’re developing a pricing strategy. The chat’s getting long and you can feel the quality shifting. Before it degrades further, you ask the AI to create a context capsule: a plain-text or markdown summary of everything that should carry forward.
The AI knows what this means. It produces a summary covering key discussions, decisions made, open questions, and where things stand. You review it. Ask for changes if something’s missing or irrelevant. Then copy the output.
Open a new chat. Paste the capsule. The AI reads it and you’re back to working – with the context that matters, without all the accumulated conversation that was degrading performance.
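If you work through an API rather than a chat interface, the same handoff can be scripted. This is a minimal sketch, not a canonical implementation – it uses Anthropic’s Python SDK, and the capsule prompt wording and model name are illustrative assumptions:

```python
import anthropic

client = anthropic.Anthropic()          # assumes ANTHROPIC_API_KEY is set
MODEL = "claude-3-5-sonnet-latest"      # illustrative; use whichever model you run

# Assumed prompt wording - adapt to taste.
CAPSULE_PROMPT = (
    "Create a context capsule for this conversation: a markdown summary of "
    "key discussions, decisions made, open questions, and the current state "
    "of the work, so a new session can pick up where we left off."
)

def make_capsule(history: list[dict]) -> str:
    """Ask the model to compress the long conversation into a capsule."""
    # Note: the messages list must alternate user/assistant turns and the
    # appended request assumes history ends on an assistant turn.
    response = client.messages.create(
        model=MODEL,
        max_tokens=2000,
        messages=history + [{"role": "user", "content": CAPSULE_PROMPT}],
    )
    return response.content[0].text

def fresh_session(capsule: str) -> list[dict]:
    """Seed a brand-new conversation with the capsule, not the full history."""
    return [{"role": "user",
             "content": f"Context capsule from a previous session:\n\n{capsule}"}]

# old_history = [...]  # the long, degrading conversation
# new_history = fresh_session(make_capsule(old_history))
```

In a chat interface the mechanics are simpler still: the capsule request is a prompt and the handoff is a copy-paste. The structure of the capsule is what matters, not the tooling.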
I do this routinely now. It takes two or three minutes. That replaces the twenty minutes I used to spend re-explaining things to a fresh chat, or worse, the slow frustration of pushing through a conversation that had forgotten half of what we’d discussed.
One thing worth knowing: if you’re working in a project, the AI can sometimes search older conversations for context. It may insist it can’t at first – push harder. But even when that works, a context capsule is faster and more reliable. It sets the new chat up ready to go, instead of making the AI reconstruct context from fragments of old conversations.
The Trade-Off
This isn’t automatic. It takes deliberate effort at conversation boundaries – a few minutes to create the capsule, a moment to review it.
If you only use AI for quick questions, none of this matters. There’s no context worth preserving.
But if you’re building real context over sessions – strategy, complex projects, ongoing collaboration – those few minutes are the difference between keeping what you’ve built and starting over every time.
Memory features are improving. They may eventually handle this. They don’t yet – not for the kind of work where context actually matters.