Is an Intelligence Explosion a Disjunctive or Conjunctive Event?

…an intelligence explosion may have fair probability, not because it occurs in one particular detailed scenario, but because, like the evolution of eyes or the emergence of markets, it can come about through many different paths and can gather momentum once it gets started. Humans tend to underestimate the likelihood of such “disjunctive” events, because they can result from many different paths (Tversky and Kahneman 1974). We suspect the considerations in this paper may convince you, as they did us, that this particular disjunctive event (intelligence explosion) is worthy of consideration.

It seems to me that all the ways in which we disagree have more to do with philosophy (how to quantify uncertainty; how to deal with conjunctions; how to act in consideration of low probabilities) […] we are not dealing with well-defined or -quantified probabilities. Any prediction can be rephrased so that it sounds like the product of indefinitely many conjunctions. It seems that I see the “SIAI’s work is useful scenario” as requiring the conjunction of a large number of questionable things […]

— Holden Karnofsky, 6/24/11 (GiveWell interview with major SIAI donor Jaan Tallinn, PDF)

Disjunctive arguments

People associated with the Singularity Institute for Artificial Intelligence (SI / SIAI) like to claim that the case for risks from AI is supported by years' worth of disjunctive lines of reasoning. This basically means that there are many independent reasons to believe that humanity could be wiped out as a result of artificial general intelligence. More precisely, it means that not all of the arguments supporting that possibility need to be true: even if all but one are false, risks from AI are still to be taken seriously.

The idea of disjunctive arguments is formalized by what is called a logical disjunction. Consider two declarative sentences, A and B. In a logical disjunction, the compound statement built from A and B is false only if both A and B are false; otherwise it is true. Truth values are usually denoted by 0 for false and 1 for true, and a disjunction is written with OR or ∨ as an infix operator. For example, if A is false (0) and B is true (1), then A∨B is still true (1), because the truth of B alone is sufficient to preserve the truth of the overall statement.

Generally there is nothing wrong with disjunctive lines of reasoning, as long as the conclusion itself is sound, and therefore possible in principle, merely awaiting at least one of several causative factors to become actual. I don’t perceive this to be the case for risks from AI. I agree that there are many ways in which artificial general intelligence (AGI) could be dangerous, but only if I accept several presuppositions about AGI that I actually dispute.

By presuppositions I mean requirements that need to be true simultaneously (in conjunction). A logical conjunction is true only if all of its operands are true. In other words, a conclusion might require all of the arguments leading up to it to be true; if any one of them is false, the conclusion fails. A conjunction is denoted by AND or ∧.
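The contrast between the two operators can be sketched in a few lines of Python (an illustration, not part of the original argument):

```python
from itertools import product

# Enumerate all truth assignments and contrast OR (disjunction) with AND (conjunction).
for a, b in product((False, True), repeat=2):
    print(f"A={int(a)} B={int(b)}  A∨B={int(a or b)}  A∧B={int(a and b)}")

# A disjunction fails only when every operand is false;
# a conjunction fails as soon as any single operand is false.
assert (False or True) and not (False or False)
assert (True and True) and not (True and False)
```

This is the asymmetry the rest of the post turns on: adding operands to a disjunction can only help the conclusion, while adding operands to a conjunction can only hurt it.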

Now consider the following prediction: <Mary is going to buy one of thousands of products in the supermarket.>

The above prediction can be framed as a disjunction: Mary is going to buy one of thousands of products in the supermarket 1.) if she is hungry, 2.) if she is thirsty, or 3.) if she needs a new coffee machine. Only one of the three arguments needs to be true for the overall conclusion to hold, that Mary is going shopping. Or so it seems.

The same prediction can be framed as a conjunction: Mary is going to buy one of thousands of products in the supermarket only 1.) if she has money, 2.) if she has some needs, and 3.) if the supermarket is open. All three factors need to be true to render the overall conclusion true.

That a prediction is framed as a disjunction does not, in and of itself, speak in favor of the possibility. I agree that it is likely that Mary will visit the supermarket if I accept the hidden presuppositions. But a prediction is at most as probable as its basic requirements. In this particular case I don’t even know whether Mary is a human or a dog, a factor that can influence the probability of the prediction dramatically.

The same is true for risks from AI. The basic argument in favor of risks from AI is that of an intelligence explosion: that intelligence can be applied to itself in an iterative process, leading to ever greater levels of intelligence. In short, artificial general intelligence will undergo explosive recursive self-improvement.

Hidden complexity

Explosive recursive self-improvement is one of the presuppositions for the possibility of risks from AI. The problem is that this and other presuppositions are largely ignored and left undefined. All of the disjunctive arguments put forth by SI try to show that there are many causative factors that could result in the development of unfriendly artificial general intelligence, and that only one of those factors needs to be true for us to be wiped out by AGI. But the whole scenario is at most as probable as the assumptions hidden in the words <artificial general intelligence> and <explosive recursive self-improvement>.

<Artificial General Intelligence> and <Explosive Recursive Self-improvement> might appear to be relatively simple and appealing concepts. But most of this superficial simplicity is a result of the vagueness of natural language descriptions. Reducing the vagueness of those concepts by being more specific, or by coming up with technical definitions of each of the words they are made up of, reveals the hidden complexity that the vagueness of the terms conceals.

If we were to define those concepts and each of their terms, we would end up with a lot of additional concepts made up of yet other words or terms. Most of those additional concepts would demand explanations of their own, made up of further speculations. If we are precise, then every declarative sentence (P#) used in the final description will have to be true simultaneously (P#∧P#). This reveals the true complexity of all the hidden presuppositions and thereby influences the overall probability: P(risks from AI) = P(P1∧P2∧P3∧P4∧P5∧P6∧…). A conclusion built from many statements that can each be false is less likely to be true, because complex arguments can fail in many different ways. You need to support each part of the argument, and you can therefore fail to support one or more of its parts, which in turn renders the overall conclusion false.
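The effect of stacking presuppositions can be made numerical with a toy calculation (the six probabilities are invented, and independence is assumed purely for simplicity):

```python
from functools import reduce
from operator import mul

# Hypothetical probabilities for six hidden presuppositions (made-up numbers).
presuppositions = [0.9, 0.8, 0.9, 0.7, 0.8, 0.9]

# Under independence, the probability of the conjunction is the product of the parts.
p_conjunction = reduce(mul, presuppositions)
print(f"P(P1∧P2∧P3∧P4∧P5∧P6) ≈ {p_conjunction:.3f}")  # ≈ 0.33
```

Even though every individual presupposition here is judged more likely true than false, their conjunction ends up less probable than a coin flip. That is the sense in which a conjunctive argument is only as strong as all of its parts together.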

To summarize: If we tried to pin down a concept like <Explosive Recursive Self-Improvement> we would end up with requirements that are strongly conjunctive.

Making numerical probability estimates

But even if SI were to thoroughly define those concepts, there would still be more to the probability of risks from AI than the underlying presuppositions and causative factors. We also have to account for our uncertainty about the very methods we used to come up with those concepts and definitions, and about our ability to make correct predictions about the future, and integrate all of it into our overall probability estimates.

Take for example the following contrived quote:

We have to take over the universe to save it by making the seed of an artificial general intelligence, that is undergoing explosive recursive self-improvement, extrapolate the coherent volition of humanity, while acausally trading with other superhuman intelligences across the multiverse.

Although contrived, the above quote comprises only actual beliefs held by people associated with SI. All of those beliefs might seem like somewhat plausible inferences and logical implications of speculations and of state-of-the-art or bleeding-edge knowledge from various fields. But should we base real-life decisions on those ideas; should we take them seriously? Should we take into account conclusions whose truth value depends on the conjunction of those ideas? And is it wise to make further inferences from those speculations?

Let’s take a closer look at the necessary top-level presuppositions to take the above quote seriously:

1: Within the lesswrong/SI community the many-worlds interpretation of quantum mechanics is proclaimed to be the rational choice among all available interpretations. Arriving at this conclusion is supposedly also a good exercise in refining the art of rationality.

2: If P(Y|X) ≈ 1, then P(X∧Y) ≈ P(X).

In other words, logical implications do not have to pay rent in future anticipations.

3: “Decision theory is the study of principles and algorithms for making correct decisions—that is, decisions that allow an agent to achieve better outcomes with respect to its goals.”

4: “Intelligence explosion is the idea of a positive feedback loop in which an intelligence is making itself smarter, thus getting better at making itself even smarter. A strong version of this idea suggests that once the positive feedback starts to play a role, it will lead to a dramatic leap in capability very quickly.”
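Presupposition 2 above can be checked with a small numeric example (the probabilities are invented for illustration):

```python
# Hypothetical numbers: if Y follows from X almost surely (P(Y|X) ≈ 1),
# then conjoining Y with X costs almost no probability relative to X alone.
p_x = 0.2            # P(X), invented
p_y_given_x = 0.99   # P(Y|X) ≈ 1, invented

p_xy = p_x * p_y_given_x  # P(X∧Y) = P(X) · P(Y|X)
print(p_xy)  # ≈ 0.198, close to P(X) = 0.2
```

This is why near-certain implications are treated as almost free: the conjunction inherits nearly the full probability of its premise.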

To take the above quote seriously you have to assign a non-negligible probability to the truth of the conjunction 1∧2∧3∧4. Here the question is not only whether our results are sound but whether the very methods we used to come up with those results are sufficiently trustworthy. Any extraordinary conclusion implied by the conjunction of various beliefs might outweigh the benefit of each individual belief if the overall conclusion is even slightly wrong.

Not enough empirical evidence

Don’t get me wrong, I think there certainly are convincing arguments in favor of risks from AI. But do arguments suffice? Nobody is an expert when it comes to intelligence. My problem is that I fear some convincing blog posts written in natural language are simply not enough.

Just imagine that all there was to climate change was someone who never studied the climate but instead wrote some essays about how it might be physically possible for humans to cause global warming. If the same person then went on to make further inferences based on the implications of those speculations, would I tell everyone to stop emitting CO2 because of that? Hardly!

Or imagine that all there was to the possibility of asteroid strikes was someone who argued that there might be big chunks of rock out there that could fall on our heads and kill us all, inferred inductively from the fact that the Earth and the Moon are also big rocks. Would I be willing to launch a billion-dollar asteroid deflection program solely based on such speculations? I don’t think so.

Luckily, in both cases, we got a lot more than some convincing arguments in support of those risks.

Another example: If there were no studies about the safety of high energy physics experiments, I might assign a 20% chance to a powerful particle accelerator destroying the universe, based on some convincing arguments put forth on a blog by someone who never studied high energy physics. We know that such an estimate would be wrong by many orders of magnitude. Yet the reason for being wrong would largely be my inability to make correct probability estimates, the result of vagueness, or a failure of the methods I employed to come up with those estimates. The reason for being wrong by many orders of magnitude would have nothing to do with the arguments in favor of the risks, as they might very well be sound given my epistemic state and the prevalent uncertainty.

I believe that mere arguments in favor of one risk do not suffice to neglect other risks that are supported by other kinds of evidence. I believe that logical implications of sound arguments should not reach out indefinitely and thereby outweigh other risks whose implications are fortified by empirical evidence. Sound arguments, predictions, speculations and their logical implications are enough to demand further attention and research, but not much more.

Logical implications

Artificial general intelligence is already an inference from what we currently believe to be true. Going a step further and drawing additional inferences from previous speculations, e.g. explosive recursive self-improvement, is in my opinion a very shaky business.

What would happen if we let logical implications of vast utilities outweigh concrete near-term problems that are grounded in empirical evidence? Insignificant inferences might exhibit hyperbolic growth in utility: 1.) there is no minimum amount of empirical evidence necessary to extrapolate the expected utility of an outcome; 2.) the extrapolation of counterfactual alternatives is unbounded, and logical implications can reach out indefinitely without ever requiring new empirical evidence.

Hidden disagreement

All of the above hints at a general problem that is the reason why I think discussions between people associated with SI, its critics, and those who try to evaluate SI won’t lead anywhere. Those discussions miss the underlying reason for most of the superficial disagreement about risks from AI, namely that there is no disagreement about risks from AI in and of itself.

There are a few people who disagree about the possibility of AGI in general, but I don’t want to touch on that subject in this post. I am trying to highlight the disagreement between SI and people who accept the notion of artificial general intelligence. With regard to those who are not skeptical of AGI, the problem becomes more obvious when you turn your attention to people like John Baez or organisations like GiveWell. Most people would sooner question their grasp of “rationality” than give five dollars to a charity that tries to mitigate risks from AI because their calculations claim it is “rational” (those who have read Eliezer Yudkowsky’s article on Pascal’s Mugging will recognize that I have slightly rephrased a statement from that post). The disagreement comes down to a general averseness to options that have a low probability of being factual, even when the stakes are high.

Nobody has so far been able to defeat arguments that resemble Pascal’s Mugging, at least not by showing that it is irrational to give in from the perspective of a utility maximizer. One can only reject them based on a strong gut feeling that something is wrong. And I think that is what many people are unknowingly doing when they argue against SI or risks from AI: they are signaling that they are unable to take such risks into account. When people doubt the reputation of those who claim that risks from AI need to be taken seriously, or say that AGI might be far off, what they mean is that risks from AI are too vague to be taken into account at this point, that nobody knows enough to make predictions about the topic right now.

When GiveWell, a charity evaluation service, interviewed SI (PDF), they hinted at the possibility that one could consider SI to be a sort of Pascal’s Mugging:

GiveWell: OK. Well that’s where I stand – I accept a lot of the controversial premises of your mission, but I’m a pretty long way from sold that you have the right team or the right approach. Now some have argued to me that I don’t need to be sold – that even at an infinitesimal probability of success, your project is worthwhile. I see that as a Pascal’s Mugging and don’t accept it; I wouldn’t endorse your project unless it passed the basic hurdles of credibility and workable approach as well as potentially astronomically beneficial goal.

This shows that a lot of people do not doubt the possibility of risks from AI but are simply not sure whether they should really concentrate their efforts on such vague possibilities.

Technically, from the standpoint of maximizing expected utility, and given the absence of other existential risks, the answer might very well be yes. But even though we believe we understand this technical viewpoint of rationality very well in principle, it also leads to problems such as Pascal’s Mugging. And it doesn’t take a true Pascal’s Mugging scenario to make people feel deeply uncomfortable with what Bayes’ Theorem, the expected utility formula, and Solomonoff induction seem to suggest one should do.
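The source of the discomfort can be made concrete with a toy expected-utility calculation (every number below is invented for illustration):

```python
# Toy Pascal's Mugging: a minuscule probability attached to an astronomical
# payoff still dominates a modest, near-certain alternative.
p_mugger, utility_mugger = 1e-12, 1e20   # invented numbers
p_charity, utility_charity = 0.9, 1e3    # invented numbers

eu_mugger = p_mugger * utility_mugger    # ≈ 1e8
eu_charity = p_charity * utility_charity # 900.0

# Naive expected-utility maximization pays the mugger.
print(eu_mugger > eu_charity)
```

However small the probability, there is always a utility large enough to make the product win, which is exactly why arguments of this shape cannot be rejected on expected-utility grounds alone.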

Again, we currently have no rational way to reject arguments that are framed as predictions of worst-case scenarios which must be taken seriously, even given a low probability of occurrence, due to the scale of the negative consequences associated with them. Many people are nonetheless reluctant to accept this line of reasoning without further evidence supporting the strong claims, and the requests for money, made by organisations such as SI.

Here is what mathematician and climate activist John Baez has to say:

Of course, anyone associated with Less Wrong would ask if I’m really maximizing expected utility. Couldn’t a contribution to some place like the Singularity Institute of Artificial Intelligence, despite a lower chance of doing good, actually have a chance to do so much more good that it’d pay to send the cash there instead?

And I’d have to say:

1) Yes, there probably are such places, but it would take me a while to find the one that I trusted, and I haven’t put in the work. When you’re risk-averse and limited in the time you have to make decisions, you tend to put off weighing options that have a very low chance of success but a very high return if they succeed. This is sensible so I don’t feel bad about it.

2) Just to amplify point 1) a bit: you shouldn’t always maximize expected utility if you only live once. Expected values — in other words, averages — are very important when you make the same small bet over and over again. When the stakes get higher and you aren’t in a position to repeat the bet over and over, it may be wise to be risk averse.

3) If you let me put the $100,000 into my retirement account instead of a charity, that’s what I’d do, and I wouldn’t even feel guilty about it. I actually think that the increased security would free me up to do more risky but potentially very good things!

All this suggests that there is a fundamental problem with the formalized version of rationality. The problem might be human nature itself: some people may be unable to accept what they should do if they want to maximize their expected utility. Or we are missing something else and our theories are flawed. Either way, to solve this problem we need to research those issues and thereby increase our confidence in the very methods used to decide what to do about risks from AI, or increase our confidence in the risks from AI directly, enough to make mitigating them look like a sensible option, a concrete and discernible problem that needs to be solved.
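Baez’s point about repeated versus one-shot bets can be sketched with a small simulation (the bet and all its numbers are invented):

```python
import random

random.seed(0)  # make the sketch reproducible

# A bet with positive expected value but a 99% chance of losing:
# win 200 with probability 0.01, lose 1 otherwise (EV = +1.01 per bet).
def bet():
    return 200 if random.random() < 0.01 else -1

# Played once, the typical outcome is a loss despite the positive EV.
one_shot = [bet() for _ in range(10_000)]
loss_fraction = sum(1 for x in one_shot if x < 0) / len(one_shot)
print(loss_fraction)  # ≈ 0.99

# Averaged over many repetitions, the result converges toward the EV.
repeated = sum(bet() for _ in range(100_000)) / 100_000
print(repeated)  # ≈ 1.01
```

When the bet can be repeated, the average reliably approaches the expected value; when it cannot, the single draw is almost certainly a loss, which is why risk aversion can be sensible for one-shot, high-stakes decisions.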

Many people perceive the whole world to be at stake already, whether due to climate change, war or engineered pathogens. Telling them about something like risks from AI, when nobody seems to have any idea about the nature of intelligence, let alone general intelligence or the possibility of recursive self-improvement, just seems like another problem, one too vague to outweigh all the other risks. Most people already feel as if a gun is pointed at their heads; telling them about superhuman monsters that might turn them into paperclips therefore requires some really good arguments to outweigh the combined risk of all the other problems.

But there are many other problems with risks from AI. To give a hint at just one example: if there were a risk that might kill us with probability .7 and another with probability .1, while our chance of solving the first was .0001 and the second .1, which one should we focus on? In other words, our decision to mitigate a certain risk should be based not only on the probability of its occurrence but also on the probability of success in solving it. But as I have written above, I believe the most pressing issue is to increase our confidence in making decisions under extreme uncertainty, or to reduce the uncertainty itself.
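The example above can be computed directly, using the numbers given in the paragraph:

```python
# Expected catastrophe averted = P(risk occurs) × P(we succeed in solving it).
risk_a = 0.7 * 0.0001  # high-probability risk, tiny chance of solving it
risk_b = 0.1 * 0.1     # lower-probability risk, much better chance of solving it

print(risk_a)  # ≈ 7e-05
print(risk_b)  # 0.01: effort on the second risk averts far more expected harm
```

Despite the first risk being seven times more probable, working on the second averts over a hundred times more expected harm, which is the point of weighing tractability alongside probability.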