Smarter and smarter, then magic happens…

There are already applications that can parse natural language commands in order to perform actions such as answering questions or making recommendations. Two examples are Apple’s Siri and IBM Watson.

Present-day software such as IBM Watson is often able to understand what humans mean and do what humans mean. In other cases, in which software such as Siri recognizes that it does not understand a natural language command, it will disclose that it is unable to understand what is meant and wait for further input.

Those applications are far from perfect and still make a lot of mistakes, because they are not intelligent enough. Software is, however, constantly being improved to be better at understanding what humans mean and doing what humans mean. In other words, each generation of software is a little bit more intelligent.

Nevertheless, some people conjecture a sudden transition from mostly well-behaved systems, each generation of which is smarter and better at understanding and doing what humans mean, to superintelligent systems that understand what humans mean perfectly but, in contrast to all previous software generations, do not do what humans mean. Instead, those systems are said to be motivated to act in catastrophic ways, causing human extinction or worse.

More precisely,

(1) Present-day software is better than previous software generations at understanding and doing what humans mean.

(2) There will be future generations of software which will be better than the current generation at understanding and doing what humans mean.

(3) If there is better software, there will be even better software afterwards.

(4) Magic happens.

(5) Software will be superhumanly good at understanding what humans mean but catastrophically worse than all previous generations at doing what humans mean.

Or, put differently,

(1) Intelligence is an extendible method that enables software to satisfy human preferences.

(2) If human preferences can be satisfied by an extendible method, humans have the capacity to extend the method.

(3) Extending the method that satisfies human preferences will yield software that is better at satisfying human preferences.

(4) Magic happens.

(5) There will be software that can satisfy all human preferences perfectly but which will instead satisfy orthogonal preferences, causing human extinction.

Conclusion: What those people conjecture either does not follow from the available evidence or requires an intermediate step vague enough that one can derive from it any conclusion one wishes.

What will instead happen is the following. Suppose there exists a software_1 that, to a limited extent, can understand and do what humans mean. Let us stipulate that this software is only narrowly intelligent and that increasing and broadening its intelligence (quantitatively and qualitatively) will improve its ability to understand and do what humans mean (in my opinion an uncontroversial assumption, as progress in artificial intelligence has so far led to a simultaneous increase in the ability of autonomous systems to satisfy human preferences). Let us further stipulate that for n > 0, software_n+1 is created using software_n and is more intelligent than the previous generation (another seemingly uncontroversial assumption, as software is constantly used to create better software).

(1) For all n > 0, if a software_n exists then it can be used to construct software_n+1.

(2) If for all n there exists a software_n, there will be software that can understand and do everything humans mean it to do.

Conclusion: Increasing the ability of software to understand and do what humans mean leads to an increase in the capacity to design software that is better at understanding and doing what humans mean.
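
The inductive step behind premise (1) can be written out explicitly. This is only a sketch: the predicate E(n), read as "software_n exists", and the measure I(n), read as "the intelligence of software_n", are notation introduced here for illustration and are not part of the original argument.

```latex
\begin{align*}
&\text{Starting point:}    && E(1) \\
&\text{Premise (1):}       && \forall n \ge 1:\; E(n) \rightarrow E(n+1) \\
&\text{By induction:}      && \forall n \ge 1:\; E(n) \\
&\text{Stipulation:}       && \forall n \ge 1:\; I(n+1) > I(n)
\end{align*}
```

Premise (2) then moves from "software_n exists for every n" to the existence of software that can understand and do everything humans mean it to do. In this sketch that step is taken as a premise, as it is in the argument itself, and is not derived from the lines above.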

Further reading: AIs, Goals, and Risks

Addendum:

(1) The abilities of systems are part of human preferences, as humans intend to give systems certain capabilities and, as a prerequisite for building such systems, have to succeed at implementing their intentions.

(2) Error detection and prevention is such a capability.

(3) Something that is not better than humans at preventing errors is not an existential risk.

(4) Without a dramatic increase in the capacity to detect and prevent errors it will be impossible to create something that is better than humans at preventing errors.

(5) A dramatic increase in the human capacity to detect and prevent errors is incompatible with the creation of something that constitutes an existential risk as a result of human error.
