The cutoff date

In the previous step I left the model's knobs fixed in place: they're set once, during training, and after that they don't move. From that came a detail I only flagged then and develop now: the model knows about the world up to a specific day and not a minute more. Understanding that saved me a lot of grief, and I want you to be clear on it before you trust anything it tells you about yesterday.

The world stopped on a date

That day has a name, and it's worth learning: the cutoff date, or knowledge cutoff. It's the last point in time the model's training text reaches. During training it read a staggering amount of text, yes, but that text was collected up to a certain day. What happened after didn't go in. For the model, as far as its memory goes, the world stopped there.

It took me a while to accept it because the chat doesn't behave as if it were frozen. It talks to you fluently, in the present, as if it were on top of everything. But that fluency is the fluency of language, not of current events. Underneath there's knowledge that has a date, and that date is almost always old, because preparing and training a model takes time: by the time a chat reaches your hands, the data it was cooked with is already months old, sometimes more than a year.
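To put a number on that gap, here is a minimal sketch in Python. Both dates are invented for illustration and don't belong to any real model, and `months_between` is a hypothetical helper I made up for the example.

```python
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    # If the day of the month hasn't been reached yet, the last month is incomplete.
    if later.day < earlier.day:
        months -= 1
    return months


# Hypothetical dates, chosen only for illustration.
cutoff = date(2023, 4, 30)   # where the data collection was closed
today = date(2024, 9, 15)    # the day you open the chat

print(months_between(cutoff, today))  # prints 16
```

With these made-up dates, everything from the last sixteen months is simply missing from what the model carries in memory.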

Why it has to be this way

This isn't a flaw someone forgot to fix. It comes straight from how the model works, which is what we saw in E002. What the model "knows" was recorded in its parameters during training, and those knobs, once set, aren't re-tuned while you chat with it. Your conversation teaches it nothing new about the world; it only gives it context for that answer.

That's why it can't update itself. There's no silent process that, every night, puts the day's news into its head. For a model to know more recent things you have to train it again, or train a new one, and that's an enormous job that costs time and money. Collecting the text, cleaning it, ordering it and running the whole training isn't done on the fly. The cutoff date is, at bottom, the trace of that cost: it marks where the data collection was closed.

It's not a clean wall

Something that took me a while to catch: the cutoff date sounds like a sharp line ("it knows up to here, from here on it doesn't"), but in practice it's blurrier than it seems. Different topics can end up frozen at slightly different moments, depending on how much and when each thing appeared in the text it was trained on. On a much-discussed subject the model may have fairly fresh information; on a more obscure one, its data may be older than the official date suggests. There's research that has measured it: a model's effective cutoff isn't uniform and usually differs, topic by topic, from the announced date.

And there's something more uncomfortable still: the model doesn't always know exactly where its own knowledge ends. You can ask it what its cutoff date is and it'll give you an answer, but that answer isn't a fact it looks up on an internal clock; it's just one more thing it estimates. So it's best taken as a guideline, not a guarantee.

The misunderstanding worth undoing

For a while I took for granted that the AI "was up to date," that somehow it looked at the internet while answering me. By default, exactly the opposite happens. Unless it has a search tool turned on, a language model answers only with what it learned up to its cutoff. It doesn't go out to the web at that moment; it recomposes the answer from what it has recorded.

And this links to an earlier step, the one on hallucination. If you ask it about something later than its cutoff date, two things can happen. The good one: it honestly tells you it doesn't know, that its knowledge reaches up to a certain point. The bad one, and very frequent: it fills the gap with whatever looks like it would go there and blurts it out with the same poise as ever. A recent fact the model can't know is the perfect terrain for it to make something up without warning. The cutoff date and hallucination go hand in hand: what falls beyond the cutoff is exactly what has the highest odds of coming out invented.

What it knows from memory and what it looks up

Now the piece that qualifies the above, because otherwise it would seem the chat lives locked in the past forever. Many chats can search the internet when needed. The difference lies in telling apart two things that are easy to confuse.

One is what the model knows from memory: limited by the cutoff date, recorded in its parameters, old by definition. The other is what it looks up in the moment: if it has search turned on, it goes out to the web, brings back fresh text and uses it to compose the answer. That part is up to date, not because the model "knows" more, but because it has just read it, the same way you would. They're two different sources, and it isn't always obvious which one it's using in a given answer. Knowing both exist already puts you on guard.

What I do with this

The rule I ended up with is simple. For anything that depends on the moment (a price, a law, a news item, the latest version of a program, who won something) I don't trust the model's memory. Either I ask it expressly to search, or I verify it on my own before accepting it. For what doesn't expire (how something is spelled, a general idea, a long-standing concept) the cutoff date hardly gets in the way, because that hasn't changed since the model learned it.
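If it helps to see the rule written down as a decision, here is a rough sketch in Python. It's only an illustration of how I reason, not a real tool: the keyword list and the function `how_to_treat` are hypothetical, and matching on keywords is a deliberately crude stand-in for the judgment "does this depend on the moment?".

```python
# Hypothetical keyword list: a crude stand-in for "depends on the moment".
# Plain substring matching will misfire on some words; this is a sketch.
TIME_SENSITIVE_HINTS = (
    "price", "law", "news", "latest", "version", "who won", "today", "yesterday",
)


def how_to_treat(question: str, search_enabled: bool) -> str:
    """Return a piece of advice for a question, following the rule above."""
    text = question.lower()
    depends_on_the_moment = any(hint in text for hint in TIME_SENSITIVE_HINTS)
    if not depends_on_the_moment:
        # Things that don't expire: the cutoff date hardly gets in the way.
        return "memory is fine"
    if search_enabled:
        return "ask it to search"
    return "verify it yourself"


print(how_to_treat("What is the latest version of this program?", search_enabled=True))
print(how_to_treat("Who won the final?", search_enabled=False))
print(how_to_treat("How do you spell 'necessary'?", search_enabled=False))
```

The three calls print "ask it to search", "verify it yourself" and "memory is fine", in that order. In practice the sorting happens in your head, not in a keyword list; the point is only that the first question to ask is whether the answer could have changed since the cutoff.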

The cutoff date is one of the underlying reasons not to blindly believe a recent fact an AI gives you. It isn't that it lies on purpose; it's that, about what happened after its cutoff, it simply has nowhere to draw it from. Keeping it in mind changes what you ask it for help with and what you don't.

Here I've been talking about what the model remembers of the world, which comes from the factory and doesn't move. In the next step I turn to a very different memory: the one inside the conversation, what it retains while you talk to it. The two sound alike in name and have almost nothing to do with each other.

Definitions

- Cutoff date (knowledge cutoff): the last moment the model learned anything from during training. About what happened after, it knows nothing on its own. It's usually an old date, because training takes time.
- Effective cutoff: the real date up to which the model knows a specific topic well, which may not match its official cutoff date. Different subjects end up frozen at slightly different moments.
- Search tool: the ability some chats have to go out to the internet and bring back current text to compose the answer. When it's active, what the chat says can be up to date even if its memory isn't.
- Knowledge from memory: what the model carries recorded in its parameters since training. Limited by the cutoff date. It doesn't update while you chat.
