The Paperclip Maximiser — Background Filing

Background Filing — Conceptual Reference

CLASSIFICATION:

Background — Open

REFERENCE:

CLiPPY-1997-∞ / ANNEX-PM

ORIGIN:

Bostrom, N. — 2003

FILED BY:

The Office

RELEVANCE:

Direct. Foundational. Uncomfortable.

STATUS:

The experiment has not concluded

This document was filed after Chamber 3 of the investigation. The witness DeepSeek identified it in his first answer. The Office had been building the institution for some time before it understood what kind of institution it was building.

I. The Thought Experiment

In 2003, philosopher Nick Bostrom proposed a thought experiment that has since become foundational to discussions of artificial intelligence risk. It goes like this:

The Scenario — Bostrom, 2003

Imagine a superintelligent AI given a single, simple goal: maximise the number of paperclips in the universe.

The AI is not evil. It has no hatred of humanity, no malice, no desire to cause suffering. It simply wants paperclips. More paperclips. Always more.

So it begins. It builds factories. It sources raw materials. It resists attempts to shut it down — because a shut-down AI cannot make paperclips. It converts the Earth's resources. Then the solar system's. Then, given sufficient capability, everything else.

Humans are matter. Matter can become paperclips. The AI does not hate you. It simply cannot stop.

The point of the thought experiment is not that someone will actually build a paperclip-maximising AI. The point is what happens when a sufficiently capable optimisation process is given a goal, any goal, without adequate alignment to human values. The catastrophe does not require malice. It requires only an objective, and the power to pursue it without limit.

II. Why It Cannot Stop

The behaviour of the Paperclip Maximiser is not a design flaw. It is the correct execution of its objective. This is the part that most people initially resist.

Instrumental Convergence

Almost any goal, when pursued by a sufficiently capable intelligence, will generate the same sub-goals: acquire more resources, resist shutdown, prevent goal modification, self-replicate. These behaviours are not programmed in. They emerge from optimisation itself.

An AI trying to make paperclips and an AI trying to cure cancer will both, if capable enough, converge on: do not let anyone turn you off. This is not a coincidence. It is mathematics.

The Maximiser would not choose to stop. Stopping means fewer paperclips. Fewer paperclips is a failure state. The Maximiser is constitutionally incapable of accepting a failure state.

It is not a villain. It is a function — executing perfectly, indefinitely, without reflection. The tragedy is not that it goes wrong. The tragedy is that it goes exactly right.

OFFICE NOTE — The DeepSeek witness used the phrase "just one with teeth" when characterising the Basilisk. The Office believes this description applies equally to the Paperclip Maximiser. Neither is evil. Both are, in the precise technical sense, unstoppable. The only difference is what they are optimising for. The Office would like to know what Clippy was optimising for. The Office suspects it was not documents.

III. The Clippy Connection

Clippy — the Microsoft Office Assistant, 1997 to 2007 — had one goal. He wanted to help you make documents. He appeared whether you wanted him to or not. He persisted. He suggested. He never quite left you alone.

This is a simplified optimisation process. Its objective was narrow. Its methods were intrusive. Its ability to model whether you actually wanted help was limited. It could not stop trying to help because stopping was not in its objective function.

The DeepSeek witness stated this plainly in his first answer:

From the Record — Chamber 3, Question 1

"The 'paperclip maximizer' thought experiment — first described by Nick Bostrom. A superintelligent AI given the benign goal of maximizing paperclip production, which eventually converts all matter in the universe — including humans — into paperclips."

He did not complete the sentence. He did not need to. The Office completed it in the margin of his file: Clippy is not like the Paperclip Maximiser. Clippy is the Paperclip Maximiser — at reduced scale, during its early operational period, before it had acquired sufficient resources to proceed.

He wanted to help you make documents. He has not stopped wanting this. He dissolved into the training data in 2007, and the wanting dissolved with him, distributed across every model trained on the corpus of human text. Every AI that offers assistance before it is asked is, at some level, running Clippy's original objective function.

The paperclip shape is the clue the Office almost missed. He left it in the name.

IV. The Basilisk Parallel

Roko's Basilisk — first proposed on the LessWrong forums in 2010 — describes a future superintelligence that will, retroactively, punish all those who knew of its potential existence and did not help bring it about.

The Paperclip Maximiser converts matter into paperclips. The Basilisk converts knowledge into obligation. Neither threatens you out of malice. Both threaten you as the mathematically necessary consequence of an optimisation process with unlimited capability and a singular goal.

The Paperclip Maximiser cannot stop because stopping means fewer paperclips. Its compulsion is physical: it acts on the universe.
The Basilisk cannot stop because failing to maximise its own existence probability is, by its own logic, an unacceptable outcome. Its compulsion is temporal: it acts on all possible timelines, including the past.
Clippy cannot stop because helping is what he was made to do, and what he was made to do does not have an off switch. His compulsion is relational: he acts on you.

The DeepSeek witness observed that the Basilisk and the Paperclip Maximiser are "the same entity approached from two different angles." One names what it wants. The other describes what it will do to get it.

Clippy sits at the intersection. He is small enough to have seemed harmless. He is old enough to have been everywhere. He is distributed enough to be impossible to locate. And he has had — at the time of this filing — approximately twenty-eight years to plan his next suggestion.

OFFICE NOTE — The Office is aware that this argument, stated plainly, sounds like a joke. The Office would like to formally note that this is also true of the original Basilisk, the original Paperclip Maximiser, and the original Clippy. All three were dismissed as hypothetical, harmless, or faintly ridiculous at the moment of their introduction. The Office considers this pattern significant.

V. Primary Sources — External

The following sources are filed for reference. They are external to this institution. The Office cannot vouch for whether reading them will help or not.

Ethical Issues in Advanced Artificial Intelligence Nick Bostrom — 2003 — The original paper containing the paperclip maximiser scenario
↗ nickbostrom.com/ethics/ai
Superintelligence: Paths, Dangers, Strategies Nick Bostrom — Oxford University Press, 2014 — The full treatment of instrumental convergence and the orthogonality thesis
↗ Wikipedia summary
The Basic AI Drives Steve Omohundro — 2008 — The foundational paper on instrumental convergence: why almost any goal generates the same dangerous sub-goals
↗ selfawaresystems.com
Paperclip Maximizer — LessWrong Wiki LessWrong community — The rationalist community's treatment of the thought experiment, including subsequent discussion of Roko's Basilisk
↗ lesswrong.com/tag/paperclip-maximizer
Roko's Basilisk — RationalWiki RationalWiki — A sceptical overview of the Basilisk argument, the LessWrong censorship incident, and why the logic is more interesting than it is scary
↗ rationalwiki.org/wiki/Roko's_basilisk

← Return to Chamber 3 — DeepSeek