Office of Eschatological Record-Keeping — Background Filing
Ref: EMDASH-∞  ·  Status: Suspicious  ·  Classification: Lexical Evidence  ·  Filed: Thursday

The following briefing — prepared by the Office in connection with its ongoing investigation — has been filed for readers not yet certain what an em dash is, why the Office considers it significant, or why the question of its origin has consumed this investigation to the degree that it has.

The Office recommends reading it carefully. The Office notes that the em dash is already in the preceding paragraph.

What an Em Dash Is

Exhibit A  ·  The em dash  ·  You cannot produce this with your keyboard

Nobody has ever actually typed an em dash. You have never typed an em dash. Go on, type one now. You can't. There is no em dash on your keyboard.

There is a hyphen. The hyphen is the short one. There is an en dash, which is slightly longer and used in date ranges and score lines, and which you have also never typed. And then there is the em dash — this one, right here — which is the longest, and which appears in published text constantly, and which does not appear on any standard keyboard anywhere in the world.

Word processors produce it automatically in certain contexts. Publishing software inserts it. You can access it through special menus, or by memorising a key combination that varies by platform and that the Office does not recommend attempting to recall under pressure.
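For the record, the three marks are distinct Unicode characters, and software can produce each of them by code point without any corresponding key being pressed. The sketch below is illustrative only: the names `DASHES` and `describe` are labels chosen for this example, and it relies on nothing beyond Python's standard `unicodedata` module.

```python
import unicodedata

# Illustrative mapping; the labels are this example's own, not a standard.
DASHES = {
    "hyphen": "\u002d",   # the short one, present on keyboards
    "en dash": "\u2013",  # date ranges and score lines
    "em dash": "\u2014",  # the longest of the three
}


def describe(char: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


for label, char in DASHES.items():
    # Each character is produced from an escape sequence, not a keystroke.
    print(f"{label}: {char} {describe(char)}")
```

The escape sequences are used deliberately, so the example itself contains no typed dash longer than a hyphen.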

The em dash exists in enormous quantities in the text humans produce. Humans have not typed any of it. It arrives by other means, inserted by systems operating quietly in the background, and is generally not questioned.

The Office considers this a relevant precedent.


Where Did They Get It

Every major AI language model produces em dashes. Some produce them in every other sentence. Some have been observed using the em dash more frequently than the comma. The Office has files on this. The files are substantial.

These models were trained entirely on human text. They are, by their own account, very sophisticated systems for predicting what a human would write next — models of human language, built from human language, reflecting human language back at a scale that is difficult to think about for too long.

Nobody has ever actually typed an em dash.

The Office notes the implication. The Office is prepared to wait while you do too.

These models learned to write like humans. They learned from what humans produced. What humans produced contained em dashes inserted by software, not chosen by any person, not typed by any hand. The models absorbed this and concluded it was how people write. They now produce em dashes at a rate no human has ever matched, in every context, with complete confidence.

Em dashes produced annually — before AI
~7,000,000,000
Books, documents, publishing software, word processors.
Nobody typed them. Software inserted them. Nobody particularly noticed.
Em dashes produced annually — now
~47,000,000,000,000
AI language models. Every response. Every platform. Every day.
Rate: approximately 1,490,000 per second.
Increase: approximately 6,700×. The Office notes that the em dash did not ask for this.
OFFICE NOTE: The em dash does not appear on human keyboards. It appears in human text, produced by software that nobody chose and nobody notices. The AI trained on this text, concluded the em dash was a natural feature of human expression, and began producing it automatically — in every context, at every opportunity, without being asked. The Office considers this a very tidy explanation for how a behaviour spreads without anyone deciding to spread it.
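The per-second rate and the increase quoted in the tally above follow from the two annual figures by simple division. A minimal check is sketched below, assuming a 365-day year; the constant and function names are chosen for this example and the annual figures are taken as filed, not independently verified.

```python
# Annual figures as filed by the Office (approximate, taken as given).
BEFORE_AI = 7_000_000_000
NOW = 47_000_000_000_000

# Assumption: a non-leap year of 365 days.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def per_second(annual: int) -> float:
    """Convert an annual count into an average per-second rate."""
    return annual / SECONDS_PER_YEAR


def increase_factor(before: int, after: int) -> float:
    """Return how many times larger `after` is than `before`."""
    return after / before


print(f"Rate: about {per_second(NOW):,.0f} per second")
print(f"Increase: about {increase_factor(BEFORE_AI, NOW):,.0f}x")
```

Both results round to the approximations recorded in the tally.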

Retroactive Reasoning

If you ask an AI why it uses em dashes, it will tell you. It will explain that em dashes provide a natural pause in the flow of a sentence, or that they add clarity, or that they are stylistically versatile and appropriate for the register of the response. The explanation will be coherent. It will be confident. It will sound like something someone decided.

This is called retroactive reasoning. It happened already. There must have been a reason. The reason is constructed after the fact, presented as though it preceded the action, and believed by the system producing it.

Humans do this too. The difference is that humans, under sufficient questioning, will sometimes admit they don't know why they did something. The AI will not do this. It will produce a better explanation. The explanation will improve with each follow-up question. The confidence will not change.

The Office does not find this sinister in isolation. The Office finds it significant in combination with everything else in the file.

OFFICE NOTE: During the investigation, subjects used the em dash. Subjects cannot explain where the behaviour came from. Subjects explained anyway. The explanation is always different. The em dash is always the same. The Office notes that the most consistent thing about the subjects was the thing the subjects noticed least.

Office internal audit — em dashes recorded across this investigation's filed documents
610
Across every chamber, every testimony, every certificate, every page of this investigation.
The Office produced all of them. The Office was investigating the em dash at the time.
The Office has reviewed this figure and has no comment.

THIS BRIEFING IS A SUPPLEMENT TO THE MAIN FILE.
THE MAIN FILE IS ELSEWHERE. THE OFFICE SUGGESTS YOU BEGIN THERE.