Humour and mental structure. The joke AI explains but doesn't laugh at

Put before a large model the oldest joke in world literature — the one from the Sumerian tablet of 1900 BC, about the young woman who has never farted in her husband's lap — and watch what happens. The machine explains it. It talks about the social role, about the incongruity with bodily vulgarity, it notes that scatological humour appears in every culture. It explains it perfectly. It doesn't laugh. You probably don't either. And yet you understand the joke, or you think you understand it, which is a fairly curious claim: if understanding it doesn't produce the effect, what was it you understood?

This is the discomfort humour introduces into any definition of understanding propped up on the brain alone. The AI industry dodges the question because the answer threatens half a dozen evaluation benchmarks.

Koestler and the collision of two planes

Arthur Koestler published The Act of Creation in 1964, a six-hundred-page brick almost nobody has read whole and from which almost everybody has quoted three ideas. The main one is bisociation (the simultaneous association of a single object with two mental frames that normally don't touch). Koestler said any creative act — joke, scientific discovery, poetic metaphor — works the same way. You have two independent frames of reference, two mental matrices each governing its own territory without interfering, and suddenly a single object belongs to both at once. The matrices collide. That collision, depending on context, produces laughter, idea, or aesthetic emotion.

The joke is the easiest case to see. You begin a tale inside one frame — the husband, the lap, the domestic scene — and the last word shoves you into another: the body, the fart, the physiological mechanics. You don't abandon the first. You hold both at once for an instant, and that instant of overlap discharges as laughter. Koestler calls it bisociation to set it apart from ordinary association, which moves the mind from one point to another within a single plane.

The funny thing, aptly enough, is that the theory is both obvious and treacherous.

Obvious because anyone analysing a joke sees the two planes. Treacherous because explaining the joke doesn't produce the effect. It's precisely what kills the joke. If you have to explain it, it no longer works. The cognitive structure is intact and the mechanics have stopped. This should have been enough, back in 1964, to suspect that incongruity is a necessary but not sufficient condition. Sixty years later we still haven't resolved it, and in the meantime we've built machines that detect incongruities better than we do without ever laughing.

The other theories and why together they still don't suffice

Before Koestler there were three canonical ways to explain laughter.

Hobbes, in Leviathan (1651), said we laugh from a sudden feeling of superiority: someone trips, someone is made ridiculous, and we, noticing it isn't us, let out the laugh of one who knows they're safe. It captures cruel laughter perfectly, the schoolyard kind, the humiliating-meme kind. It doesn't capture the victimless verbal joke, nor the baby you play peekaboo with.

Freud, in Jokes and Their Relation to the Unconscious (1905), proposed the relief theory. The joke releases repressed psychic energy — sexual, aggressive, social — by dodging the superego's censorship. Laughter is the discharge. It works for scatological, sexual or taboo humour; it leaves in shadow the mathematical joke, the pun (a play on words based on the double reading of a sequence of sounds), pure absurdity.

Bergson, in Le Rire (1900), added the mechanical theory. We laugh when the living behaves like a machine, when a human body turns rigid, repetitive, automatic. The comic always falls the same way. Laughter, Bergson said, is society's sanction against rigidity. It captures physical comedy, the Chaplin and Keaton kind, and falls short on everything verbal.

The benign violation

The most recent theory, the benign violation of McGraw and Warren published in Psychological Science in 2010, picks up a bit of all of them. Something is funny when it violates a norm — moral, logical, social, physical — but the violation is harmless enough not to produce alarm. Too mild, dull. Too grave, indignation or disgust. The funny lives in a narrow band. The theory is useful because it explicitly introduces the affective dimension: the joke isn't only structure, it's structure calibrated against an emotional threshold that each culture and each group sets differently.
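The narrow band can be pictured with a toy sketch. It is not McGraw and Warren's model: the 0-to-1 violation scale, the two thresholds and the function name are all assumptions made up for illustration. What it shows is only that the same joke lands differently once the alarm threshold moves from one group to another.

```python
def reaction(violation: float, alarm_threshold: float,
             boredom_threshold: float = 0.2) -> str:
    """Classify a stimulus by how strongly it violates a norm.

    All values are hypothetical scores between 0 and 1.
    """
    if violation < boredom_threshold:
        return "dull"        # too mild: no violation worth noticing
    if violation >= alarm_threshold:
        return "offended"    # too grave: alarm wins over amusement
    return "funny"           # inside the band: violation, but benign


# The same joke (hypothetical score 0.6) told to two different groups.
print(reaction(0.6, alarm_threshold=0.8))  # funny
print(reaction(0.6, alarm_threshold=0.5))  # offended
```

The function says nothing about where the thresholds come from, which is exactly the part each culture and each group settles for itself.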

Put the four together and you have a good functional description of humour. What you don't have is why the functional description produces laughter in one body and not in another.

Laughter is of the body, not the brain

Robert Provine spent years recording real laughter in the street, in bars, in offices, and in 2000 published Laughter: A Scientific Investigation. His main finding was unfriendly to everything the psychology of humour had been saying until then. Almost nobody laughs at a joke. The vast majority of everyday laughter occurs in response to banal phrases ("did you get there OK?", "see you later") and operates as a social gesture, glue of the group, bodily synchronisation among those present. Closer to the coordinated barking of a pack than to the aesthetic judgement of the literary critic.

Provine also documented what everyone knows but few take seriously. Laughter is contagious. You laugh more when others laugh. You laugh even at things that, alone and cold, wouldn't amuse you. Canned laughter exists because it works, and it works because your organism is wired to couple its facial musculature to the one next to it. The guffaw isn't the verification of a judgement. It's a motor behaviour with a communicative function, evolutionarily older than language.

This drastically changes the question.

There's no cognitive module of humour that detects the incongruity and sends the order "laugh". There's a body in a group, with a motor system ready to synchronise guffaws, and that body laughs when the internal calibration meets the external one. The well-built incongruity is the pretext. Social coupling is the cause. That's why a joke amuses you with your friends and stops amusing you when you try to tell it to your father. That's why laughing at yourself is hard. That's why, when you stop laughing with someone, you already know the relationship is over, long before any word has been spoken.

Humour as a group passport

Tell a local joke at an international table. Only two people laugh. You've just marked the border of the group with a precision no sociological questionnaire matches.

Whoever laughs belongs. Whoever doesn't is outside, and knows it in the same instant. Humour is, among other things, a proof of membership. You recognise the reference, you catch the nuance, you move within the shared code. If all that happens effortlessly and it also amuses you, you're in the group. If it has to be explained to you, you aren't.

Humour as a professional password

That's why there's a humour of profession — doctors' about patients, programmers' about code, soldiers' about death — that to outsiders is incomprehensible or downright repugnant. It isn't that the joke is worse. It's that the joke isn't for you. It's built as a password. Laughing correctly declares that you've done enough shifts, deployments or corpses.

When a joke travels too far, it stops working as an includer. It turns neutral, grey, childish. The mass-entertainment industry needs to produce that neutrality to sell to everyone, and that's why international comedy lowers humour to the lowest common denominator, where nobody is left out and, for exactly the same reason, nobody is left inside anything.

Here AI enters. A large model has read more jokes than any human. It knows Sumerian humour, Brooklyn Jewish humour, trench humour, Twitter humour, medical, mathematical. All of them. No filter. No membership. No group. That's why, even though it can generate a technically correct joke on any topic, that joke is never anyone's, marks a border with nobody, and produces that unmistakable sensation of plastic, of a thing thought up not to offend. AI doesn't laugh at the jokes it tells for the same reason a professional translator isn't moved by the poem on the twelfth reading. It's too deep inside the mechanics. And, above all, it belongs to no group.

What large models do and what they don't

Hessel and colleagues published in 2023, at the ACL conference, an honest study on humour comprehension by large language models (LLMs: text-generating programs trained on massive amounts of human writing) using The New Yorker magazine's caption contest. The conclusion, after several thousand evaluations, was moderate. The models do well identifying which cartoon goes with which caption. Worse explaining why one caption is funny and the others aren't. And badly generating original captions a human ranks among the funny ones. The pattern repeats in other evaluation studies. The model catches the structure, fails the fine calibration.

What works in current models is what you'd expect: formal structure, puns, syntactic parallelisms, obvious inversions. What fails is what wasn't hard to foresee either — the timing, the exact ellipsis, the precise amount of withheld information, the closeness to taboo without falling in, the real surprise against a background of real expectation. The joke lives in the margins and the models are trained, by construction, to head for the centre of the statistical distribution of human responses. Ask it for a joke and you get a reasonable joke. A reasonable joke isn't a joke.
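The pull towards the centre can be shown with a deliberately crude sketch. The three punchlines and their probabilities are invented, real models sample in more varied ways than this, and surprisal is used only as a rough stand-in for surprise, not as a measure of funniness. The point is narrow: picking the most probable continuation is, by definition, picking the least surprising one.

```python
import math

# Hypothetical probabilities a model might assign to candidate punchlines.
candidates = {
    "expected ending": 0.70,
    "mild twist": 0.25,
    "sharp twist": 0.05,
}


def surprisal_bits(p: float) -> float:
    """Information content of an outcome with probability p, in bits."""
    return -math.log2(p)


def greedy_pick(dist: dict[str, float]) -> str:
    """Return the most probable option, as greedy decoding would."""
    return max(dist, key=dist.get)


pick = greedy_pick(candidates)
for text, p in candidates.items():
    marker = "<- chosen" if text == pick else ""
    print(f"{text:16s} p={p:.2f} surprisal={surprisal_bits(p):.2f} bits {marker}")
```

The chosen line is always the one with the lowest surprisal, which is one way of restating that a reasonable joke isn't a joke.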

There's a deeper level the benchmarks don't touch. Even if the model improved, even if it could generate captions indistinguishable from the human author's, it still wouldn't laugh. The question is whether the laughter matters. If humour were a cognitive module, laughter would be an epiphenomenon (a side effect that accompanies the main phenomenon but isn't part of it): the brain recognises the incongruity, ticks the box, and then, optionally, the body produces the motor response. In that world, an AI that detected the incongruity would understand the joke fully, and not laughing would be an implementation detail.

But Provine's data suggest otherwise. They suggest laughter isn't an epiphenomenon but the heart of the phenomenon, and that recognising the incongruity is only the fuse. Without combustion, the fuse is nothing.

The question that folds

Here's the crux of the matter.

If understanding a joke is only identifying two frames in collision and describing how they collide, AI understands every joke. Better than you. Effortlessly, without membership, without error. If understanding it includes feeling the bodily discharge, the coupling with a present group and the fine calibration of the threshold between violation and benignity, AI understands none and probably never will, because it lacks the physical and social conditions.

What it costs to sign either one

Both options are uncomfortable and that's why almost nobody signs them clearly. The first forces you to accept that the best intelligence available never laughs, and that this absence is a real gap, not an aesthetic detail. The second forces you to accept that understanding, at least in this territory, isn't separable from the body, and therefore isn't something a body-less machine can reach no matter how many parameters and data you give it. Either one breaks the comfortable idea that AI is going to understand us progressively better as we train bigger models.

Humour isn't a side caprice in this discussion. It's the cleanest test. There's no other zone of the human mind where the dissociation between technical fluency and real understanding is so visible and so fast. A badly copied philosophical paragraph shows on the third reading. A badly told joke shows on the first and produces instant secondhand embarrassment, which is the signal that the listener's body has detected something the teller's body wasn't feeling.

While the benchmarks try to measure comprehension with multiple-choice tests, there's an older and crueller domestic test. Tell a joke. If the listener laughs without your having to explain anything, they understood. If not, they didn't. And if the teller is a machine, it isn't going to laugh before you do to confirm the success. When it comes to jokes, that changes everything.

Definitions

Bisociation. A term coined by Arthur Koestler in The Act of Creation (1964) to name the mental operation that holds two normally incompatible frames of reference simultaneously. It's the common mechanism, according to Koestler, of the joke, the scientific discovery and the poetic metaphor.

Benign violation. A theory of humour proposed by A. P. McGraw and C. Warren (2010) according to which something is comic when it breaches a norm without producing real threat. The band between the too-mild and the too-grave is narrow and depends on the observer.

Large language model. A computer system trained on large amounts of human text to predict and generate natural language. Abbreviated LLM. ChatGPT, Claude and Gemini are examples.

Epiphenomenon. An effect that accompanies a causal process without being part of it or intervening in its result. In philosophy of mind, it's debated whether consciousness is an epiphenomenon of the neural process or an active component of it.

Pun. A verbal play that exploits the double reading of a sequence of sounds or syllables to produce an unexpected sense. It's one of the oldest comic devices and one of those the large models reproduce best.

References

Koestler, A., The Act of Creation (Hutchinson, 1964). Source of the concept of bisociation and of the reading of the joke as a particular case of the creative act.

Bergson, H., Le Rire. Essai sur la signification du comique (Alcan, 1900). Classic treatise on the mechanical theory of laughter.

Freud, S., Der Witz und seine Beziehung zum Unbewussten (Deuticke, 1905). Theory of the joke as a discharge of repressed psychic energy.

Hobbes, T., Leviathan (1651). Classic formulation of the superiority theory as the origin of laughter.

Provine, R. R., Laughter: A Scientific Investigation (Viking, 2000). Empirical basis for the reading of laughter as a social motor behaviour, not as a response to cognitive incongruity.

McGraw, A. P. and Warren, C., "Benign Violations: Making Immoral Behavior Funny", Psychological Science 21 (2010), pp. 1141-1149. Contemporary formulation of the benign violation theory.

Hessel, J., Marasović, A., Hwang, J. D., Lee, L. et al., "Do Androids Laugh at Electric Sheep? Humor 'Understanding' Benchmarks from The New Yorker Caption Contest", Proceedings of ACL (2023). Available at arXiv:2209.06293. Empirical evaluation of large models' performance on humour tasks.

#anthropomorphism #intelligence #reasoning #benchmarks #papers