red yellow and purple abstract painting

Field Notes · Techniques

You give the agent a clear rule — say, keep every reply under 200 words. At first it holds. Then ten turns later it’s handing you essays, like you never said a word. It feels like the model forgot. It didn’t. It was never remembering in the first place.

The model is reading, not remembering

A language model doesn’t carry your conversation around in its head between turns. Every time it replies, it reads the entire transcript back from the top and predicts what comes next. There’s no place where “the rule you set” gets stored and held — there’s just the text in front of it, read fresh each time.

So when people say the AI forgot, what actually happened is more mundane. The instruction was still technically sitting in the transcript. By the time the model reached your latest message, it had simply stopped carrying any weight.

Why it gets worse the longer you go

Two things stack up against you as a session runs long.

First, the context window. Every model can only read so much text at once. As your conversation grows, older messages get pushed toward the edge of what it can see — and eventually fall out of it entirely. The rule you set on turn one might literally not be visible anymore by turn forty.

Second, even when it is still visible, it gets crowded out. Models lean hardest on what’s most recent and most repeated. One instruction, stated once, at the very top, is competing with everything that came after it — your follow-ups, the model’s own long answers, every tangent in between. The signal is still in there. It’s just buried.

That’s the real shape of the problem. It isn’t a memory failure, it’s a salience failure. The instruction didn’t disappear. It stopped standing out.

The fix that costs nothing

Re-anchor. Restate the rule near the point where it actually matters, instead of trusting that one mention at the top will carry the whole way.

In practice it’s almost embarrassingly simple. Right before the part of the task where the rule applies, say it again: “Reminder: keep this under 200 words.” “Before you write any code — we do not touch the production database.” You’re not apologizing for repeating yourself. You’re putting the signal back on top, where the model is actually looking.

A stronger version: keep a short running state block and paste it at the start of each turn —

GOAL: plan a 3-day trip to Lisbon
RULES: budget under $150/day, no early mornings
WHERE WE ARE: picking restaurants for day two

It takes ten seconds and it resets the model’s attention every single turn, instead of hoping turn one still echoes. Either way the principle is the same: don’t rely on the model to hold context. Hold it yourself, and feed it back in at the moment it’s needed.

Where this stops being enough

This works great for a single chat you’re babysitting. It starts to break the moment you’re not.

Once you’ve got an agent running multiple steps on its own, or several agents handing work to each other, or a task that runs long enough that you can’t sit there re-pasting state by hand — manual re-anchoring doesn’t scale. The context problem doesn’t go away; it just gets too big to patch one message at a time. At that point you’re not managing a conversation anymore, you’re managing a system, and that needs structure built for it, not a habit.

But that’s a different note. For now: most broken AI sessions aren’t broken because the model is dumb or the prompt was wrong. They’re broken because the important thing slid out of view. Put it back. Say it again, right where it counts.

Leave a Reply

Discover more from BINDLCORP

Subscribe now to keep reading and get access to the full archive.

Continue reading