Related to: Distilling the “dumb superintelligence” argument
To steelman: to figure out even better arguments for your opponents’ positions while arguing with them, and to engage those stronger arguments rather than only their actual arguments, their weakest arguments (weak-manning), or caricatures of their arguments (straw-manning). [source]
Someone called Xagor et Xavier again commented on one of my posts with a better and more concise formulation of some of my arguments. If that person believes those arguments to be flawed (I do not know whether they do), that would increase my confidence that I am wrong, since in order to rephrase my arguments more clearly they obviously have to understand what I am arguing. But at the same time I am also confident that much smarter people than me, especially experts, could think of much stronger arguments against the case outlined by some AI risk advocates.
My own attempt at steelmanning the arguments of AI risk advocates can be found in my primer on risks from AI.
In this post I attempt to improve upon the refinement of the “dumb superintelligence” argument outlined in my last post.
Argument: Fully intended behavior is a very small target to hit.
(1) General intelligence is a very small target to hit, requiring a very small margin of error.
(2) Intelligently designed systems do not behave intelligently as a result of unintended consequences.
(3) By steps 1 and 2, for an AI to be able to outsmart humans, humans will have to intend to make an AI capable of outsmarting them and succeed at encoding their intention of making it outsmart them.
(4) Intelligence is instrumentally useful because it enables a system to hit smaller targets in larger and less structured spaces.
(5) In order to take over the world a system will have to be able to hit a lot of small targets in very large and unstructured spaces.
(6) The intersection of the sets of “AIs in mind design space” and “the first probable AIs to be expected in the near future” contains almost exclusively those AIs that will be designed by humans.
(7) By step 6, what an AI is meant to do will very likely originate from humans.
(8) It is easier to create an AI that applies its intelligence generally than to create an AI that only uses its intelligence selectively.
(9) An AI equipped with the capabilities required by step 5, given steps 7 and 8, will very likely not be confused about what it is meant to do if it was not meant to be confused.
(10) Therefore the intersection of the sets of “AIs designed by humans” and “dangerous AIs” contains almost exclusively those AIs which are deliberately designed to be dangerous by malicious humans.
Software such as Mathematica will not casually prove the Riemann hypothesis if it has not been programmed to do so. Given intelligently designed software, world states in which the Riemann hypothesis is proven will not be achieved unintentionally, because unintended consequences are chaotic in nature rather than purposeful.
As the intelligence of a system increases, the precision of the input necessary to make the system do what humans mean it to do decreases. For example, systems such as IBM Watson or Apple’s Siri do what humans mean them to do when fed with a wide range of natural language inputs, whereas less intelligent systems such as compilers or Google Maps need very specific inputs in order to satisfy human intentions. Increasing the intelligence of Google Maps would enable it to satisfy human intentions by parsing less specific commands.
For an AI to misinterpret what it is meant to do, it would have to selectively suspend its ability to derive exact meaning from fuzzy meaning, which is a significant part of general intelligence. This would require its creators to restrict their AI and specify an alternative way for it to learn what it is meant to do (which takes additional, intentional effort). An AI that does not know what it is meant to do, and which is not allowed to use its intelligence to learn what it is meant to do, would have to choose its actions from an infinite set of possible actions. Such a poorly designed AI will either (a) not do anything at all or (b) not be able to decide what to do before the heat death of the universe, given limited computational resources. Such a poorly designed AI will not even be able to decide whether trying to acquire unlimited computational resources is instrumentally rational, because it will be unable to decide whether the actions required to acquire those resources might be instrumentally irrational from the perspective of what it is meant to do.
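The decision-theoretic point above can be sketched in code. This is only a toy illustration, and every name in it (the `choose_action` function, the example actions, the utility function) is hypothetical, introduced here for illustration: an agent can only rank actions relative to some goal specification, and with no goal specification even a finite action set offers no basis for choice, let alone an infinite one.

```python
# Toy sketch (hypothetical, not from the original post): action selection
# is only well-defined relative to a goal specification. An agent with a
# utility function can rank actions; an agent without one has no basis
# for preferring any action over any other.

def choose_action(actions, utility=None):
    """Pick the highest-utility action, or report that no choice is possible."""
    if utility is None:
        # No goal specification: every ranking of the actions is equally
        # arbitrary, so the agent cannot decide what to do.
        return None
    return max(actions, key=utility)

actions = ["prove_riemann", "fetch_coffee", "do_nothing"]

# With some (illustrative, arbitrary) goal, choice is mechanical:
print(choose_action(actions, utility=len))   # ranks by an arbitrary criterion

# Without any goal, the agent cannot choose at all:
print(choose_action(actions))                # None
```

The point is not that real AIs would be written this way, but that “decide what to do” is undefined until something plays the role of the `utility` argument, and that role is filled by what the AI is meant to do.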