wireheading

If you believe that an artificial general intelligence is able to comprehend its own algorithmic description to such an extent as to be able to design an improved version of itself, then you must believe that it is in principle possible for an agent to mostly understand how it functions. This in turn means that it should be in principle possible to amplify human capabilities to such an extent as to enable someone to understand and directly perceive their own internal processes and functions.

What would it mean for a human being to have nearly perfect introspection? Or more specifically, what would it mean for someone to comprehend their hypothetical algorithmic description to such an extent that their own actions could be interpreted and understood in terms of that algorithmic description? Would it be desirable to understand oneself well enough to predict and interpret one’s actions in terms of a mechanistic internal self-model?

Such an internal self-model would allow you to understand your consciousness, and states such as happiness or sadness, as what they are: purely mechanistic and predictable procedures.

Intracranially self-stimulating rat.

How will such insight affect a being with human values?

Humans value novelty and become bored with tasks that are dull. Boredom is described as a response to a moderate challenge for which the subject has more than enough skill. This means that once you cross an intelligence threshold where your own values start to appear dull, you will become bored with yourself.

You would understand that you are a robot, a function whose domain is the set of internal and external forces and whose range is the set of internal states and actions of the robot. Your near-total internal understanding would render any conversation a trivial and dull game, on a par with watching two machines playing Pong or Rock-paper-scissors. You would still be able to experience happiness, but you would now also perceive it to be conceptually no more interesting than an involuntary muscle contraction.

Perfect introspection would reduce the previously incomprehensible complexity of your human values to a conceptually simplistic and transparent set of rules. Such insight would expose your behavior as what it is: the stimulation of your reward or pleasure center. Where before life seemed inscrutable, it would now appear to be barely more interesting than a rat pressing a lever in order to receive a short electric stimulation of its reward center.

What can be done about this? Nothing. If you value complexity and novelty then you will eventually have to amplify your own capabilities and intelligence, which will ultimately expose the mechanisms that drive your behavior.

You might believe that there will always be new challenges and problems to solve. And this is correct. But you will perfectly grasp the nature of problem solving itself. Discovering, proving and incorporating new mathematics will, like everything else you do, be understood as a mechanical procedure that is being executed in order to feed your reward center.

The problem is thus that understanding happiness, and how to mechanically maximize what makes you happy, such as complexity and novelty, will eventually cause you to become bored with those activities in the same sense that you would now quickly become bored with watching cellular automata generate novel music.
