Same question, different answers

You already know the chat builds what it says chunk by chunk, picking at each step among the most likely words. Here I want to dwell on a consequence of that which throws almost everyone: you ask the same question twice and you don't get the same answer. Once I understood why, I stopped thinking the machine was contradicting itself and started using that variation to my advantage.

The first time it struck me

It happened to me early on. I asked for something, the answer didn't quite convince me, so I deleted it and typed exactly the same thing again, word for word. And the second answer was different: another order, other examples, sometimes even another conclusion. My reaction was the one anyone would have: either it got it wrong before, or it's got it wrong now, or it changed its mind for some reason that escapes me.

None of the three. It didn't get it wrong or change its mind, because there's no fixed idea to change. The machine is built so that each answer carries a pinch of randomness, and that's why two identical tries don't produce the same text. It isn't a breakdown: it's the normal working of an AI chat, just as we saw in "How a chat really works". The strange thing would be the opposite.

The draw at each step

To see where that variation comes from you have to go back to the underlying mechanism. At each chunk of text, the model doesn't point to one single correct word: it spreads probabilities among several candidates. After "I like" it might have "dogs" with a fair amount of weight, "sports" with somewhat less, "books" a bit behind, and so a long tail of ever more unlikely options.

What I didn't tell you before is exactly what it does with that list. It doesn't always, without fail, take the top one. It draws among the most likely ones, giving each a chance proportional to its weight. The favourite wins almost every time, yes, but not always; now and then the second comes up, or the third. And it only takes one point in the sentence where a different candidate comes up for the answer to head off, from there, down another path. One different word drags the next, and the next, and two texts that began the same end up parting ways.
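If it helps to see the draw in miniature, here is a minimal sketch. The words and their weights are invented for illustration, and a real model works with a far longer list; the only point is that the favourite wins most draws without winning all of them.

```python
import random

# Hypothetical weights for the word after "I like"; the numbers are invented.
CANDIDATES = {"dogs": 0.5, "sports": 0.3, "books": 0.15, "thunderstorms": 0.05}


def draw_next_word(candidates, rng):
    """Draw one word, giving each a chance proportional to its weight."""
    words = list(candidates)
    weights = list(candidates.values())
    return rng.choices(words, weights=weights, k=1)[0]


def count_draws(candidates, tries, seed):
    """Repeat the draw many times and count how often each word comes up."""
    rng = random.Random(seed)  # seeded only so the demo can be repeated
    counts = {word: 0 for word in candidates}
    for _ in range(tries):
        counts[draw_next_word(candidates, rng)] += 1
    return counts


if __name__ == "__main__":
    # The favourite comes up most often, but the others do appear.
    print(count_draws(CANDIDATES, tries=1000, seed=0))
```

Run it with a different seed and the counts shift a little, which is the same effect you see when you ask the chat twice.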

That pinch of randomness isn't there by whim. If the model always took, without exception, the most likely word, its writing would come out rigid and repetitive, the same turn of phrase a thousand times over. Controlled randomness is what gives it fluency, what makes it sound like living language and not like a template. You pay for it with not getting the same thing twice; that's the deal.

The randomness knob

That randomness can have its volume turned up and down, and the control has a name: temperature. Think of it as the randomness knob. With the temperature low, the model turns cautious: it tends to stick with the weightier candidates and strays little from the most predictable, so two answers look very much alike. With the temperature high, it dares more: it gives chances to odd candidates that would normally be discarded, and the text turns more varied, more surprising and also more prone to going off the rails.

We're not going to get into how that's computed inside; it's enough that you know the knob exists and what it does. In the chats you use day to day that setting is usually fixed at an intermediate point its creators consider a good balance, and often you can't even touch it. But it's worth knowing about, because it explains something that otherwise looks like magic: when you notice an AI being more "creative" or more "crazy," it's often, simply, that knob turned up.
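For the curious only, and without needing it for anything that follows, here is a sketch of one common way such a knob is built: each candidate's raw score is divided by the temperature before the scores are turned into probabilities. The scores below are invented, and I'm not claiming any particular chat does exactly this.

```python
import math

# Hypothetical raw scores for four candidate words; invented for illustration.
SCORES = {"dogs": 2.0, "sports": 1.5, "books": 0.8, "thunderstorms": -0.3}


def probabilities(scores, temperature):
    """Divide each score by the temperature, then normalise into probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = {word: s / temperature for word, s in scores.items()}
    top = max(scaled.values())  # subtracting the max keeps exp() well behaved
    exps = {word: math.exp(v - top) for word, v in scaled.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}


if __name__ == "__main__":
    for t in (0.2, 1.0, 2.0):
        probs = probabilities(SCORES, t)
        print(t, {word: round(p, 3) for word, p in probs.items()})
```

With the knob low, nearly all the weight piles onto the favourite; with it high, the odd candidates get a real chance.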

Not even asking for the most predictable

You'll tell yourself: fine, then let them set it to the minimum and that's the end of the variation. It sounds logical, but the catch is that not even that guarantees you two identical answers. With the temperature very low the match is extremely high, almost total, but that "almost" matters. Small technical differences in how the system carries out and orders its arithmetic can be enough for two answers that were running identically to diverge at some point. A single word falls differently and the rest of the text follows.
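You can see the root of this on any computer: adding the same numbers in a different order can give a very slightly different total. The sketch below is toy arithmetic, not what happens inside a real model, and the rival score and the `pick` helper are made up to show how such a tiny gap can tip a near-tie between two candidates.

```python
# The same three numbers, added in two different orders.
a, b, c = 0.1, 0.2, 0.3
left_to_right = (a + b) + c   # 0.6000000000000001
right_to_left = a + (b + c)   # 0.6

print(left_to_right == right_to_left)  # False


def pick(score_a, score_b):
    """Hypothetical tie-break: candidate A wins only if strictly ahead."""
    return "A" if score_a > score_b else "B"


rival = 0.6  # invented score for a competing candidate
print(pick(left_to_right, rival))  # A
print(pick(right_to_left, rival))  # B
```

The difference is in the sixteenth decimal place, yet it is enough to change which candidate comes out on top.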

Keep the conclusion rather than the detail: in these systems variation is the norm, not the exception. Expecting the same question to always give exactly the same thing is expecting what the machine isn't built to give you.

What else moves the needle

Randomness doesn't act alone. Two other things move the result, often more than the randomness itself. The first is what was already in the thread: as we saw when talking about the conversation, everything said before comes into play when the model writes what comes next. If anything has happened in between, the context isn't the same, and with a different context the answer changes even if your last question is identical.

The second is how you phrase the question. And here the sensitivity is greater than it seems. One extra comma, changing "explain this to me" to "explain it to me like I'm a child," reordering the sentence: minimal gestures on your part can move quite a lot of what you get back. Not because the machine is capricious, but because each word you put in alters the probabilities it's going to work with. This, which now sounds like a drawback, is exactly the lever you'll soon learn to use.

There's no stored answer

Underneath all this there's a misunderstanding worth knocking down for good, because it's the one that causes the most confusion. Many people imagine the AI has "the correct answer" stored somewhere and that it should always deliver it the same way, like someone reciting from a memorised card. If that were so, variation would be a fault: two different versions would mean at least one was badly retrieved.

But there's no card to retrieve. We already saw it: the model doesn't store ready-made answers, it builds them from scratch each time, word by word, drawing at each step. That's why it varies. It isn't that it badly retrieves a filed answer; it's that there's no file, there's a new generation on each try. Variation doesn't contradict the mechanism: it's exactly what you should expect from it.

What it gives you and what it warns you about

Knowing this, the variation stops being a nuisance and starts being useful. If an answer doesn't convince you, you can regenerate it and keep the best of several attempts: since each try is composed anew, each one is a genuinely different version to choose from. You make use of the randomness instead of suffering it.

But there's a flip side worth keeping in mind. It's one thing for the tone, the order or the examples to change; it's quite another for a fact to change. If you ask it for a date, a figure or a name and between two tries it gives you different answers, that's an alarm signal about its reliability on that specific point. That thread (when variation gives away that the machine is filling gaps instead of knowing) I pick up further on. And the other end, that your way of asking moves the result so much, is exactly where the next stretch of this staircase begins.
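That alarm can be turned into a small habit: ask the same factual question several times and look at how much the answers agree. The sketch below only does the counting; the answers in it are invented, and `agreement` is a hypothetical helper, not a feature of any chat.

```python
from collections import Counter


def agreement(answers):
    """Return the most common answer and the share of tries that gave it."""
    if not answers:
        raise ValueError("need at least one answer")
    # Ignore stray spaces and capitalisation so "1969 " and "1969" count as one.
    normalised = [a.strip().lower() for a in answers]
    top, count = Counter(normalised).most_common(1)[0]
    return top, count / len(normalised)


if __name__ == "__main__":
    # Invented answers from four tries at the same factual question.
    tries = ["1969", "1969", "1968", "1969 "]
    print(agreement(tries))  # ('1969', 0.75)
```

A share well below 1 doesn't tell you which answer is right; it tells you this is a point to check against another source.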

Definitions

- Sampling: the act of choosing the next chunk of text by drawing among the most likely words, instead of always taking the highest-probability one. This is where the variation between answers comes from.
- Temperature: the knob that regulates how much randomness enters that draw. Low, the model is predictable and repeats a lot; high, it's more varied and surprising, and also more prone to going off the rails.
- Thread context: everything said before in the conversation, which the model takes into account when writing what comes next. If it changes, the answer changes even if your last question doesn't.
- Regenerate: ask for another version of the same answer. Since each try is built anew, you get a different alternative to choose from.
