Eliezer Yudkowsky

You are currently browsing articles tagged Eliezer Yudkowsky.

As we know,
There are known knowns.
There are things
We know we know.
We also know
There are known unknowns.
That is to say
We know there are some things
We do not know.
But there are also unknown unknowns,
The ones we don’t know
We don’t know.

— Donald Rumsfeld, Feb. 12, 2002, Department of Defense news briefing

Intelligence, a cornucopia?

It seems to me that those who believe into the possibility of catastrophic risks from artificial intelligence act on the unquestioned assumption that intelligence is kind of a black box, a cornucopia that can sprout an abundance of novelty. But this implicitly assumes that if you increase intelligence you also decrease the distance between discoveries.

Intelligence is no solution in itself, it is merely an effective searchlight for unknown unknowns and who knows that the brightness of the light increases proportionally with the distance between unknown unknowns? To enable an intelligence explosion the light would have to reach out much farther with each increase in intelligence than the increase of the distance between unknown unknowns. I just don’t see that to be a reasonable assumption.

Intelligence amplification, is it worth it?

It seems that if you increase intelligence you also increase the computational cost of its further improvement and the distance to the discovery of some unknown unknown that could enable another quantum leap. It seems that you need to apply a lot more energy to get a bit more complexity.

If any increase in intelligence is vastly outweighed by its computational cost and the expenditure of time needed to discover it then it might not be instrumental for a perfectly rational agent (such as an artificial general intelligence), as imagined by game theorists, to increase its intelligence as opposed to using its existing intelligence to pursue its terminal goals directly or to invest its given resources to acquire other means of self-improvement, e.g. more efficient sensors.

What evidence do we have that the payoff of intelligent, goal-oriented experimentation yields enormous advantages (enough to enable an intelligence explosion) over evolutionary discovery relative to its cost?

We simply don’t know if intelligence is instrumental or quickly hits diminishing returns.

Can intelligence be effectively applied to itself at all? How do we know that any given level of intelligence is capable of handling its own complexity efficiently? Many humans are not even capable of handling the complexity of the brain of a worm.

Humans and the importance of discovery

There is a significant difference between intelligence and evolution if you apply intelligence to the improvement of evolutionary designs:

  • Intelligence is goal-oriented.
  • Intelligence can think ahead.
  • Intelligence can jump fitness gaps.
  • Intelligence can engage in direct experimentation.
  • Intelligence can observe and incorporate solutions of other optimizing agents.

But when it comes to unknown unknowns, what difference is there between intelligence and evolution? The critical similarity is that both rely on dumb luck when it comes to genuine novelty. And where else but when it comes to the dramatic improvement of intelligence itself does it take the discovery of novel unknown unknowns?

We have no idea about the nature of discovery and its importance when it comes to what is necessary to reach a level of intelligence above our own, by ourselves. How much of what we know was actually the result of people thinking quantitatively and attending to scope, probability, and marginal impacts? How much of what we know today is the result of dumb luck versus goal-oriented, intelligent problem solving?

Our “irrationality” and the patchwork-architecture of the human brain might constitute an actual feature. The noisiness and patchwork architecture of the human brain might play a significant role in the discovery of unknown unknowns because it allows us to become distracted, to leave the path of evidence based exploration.

A lot of discoveries were made by people who were not explicitly trying to maximizing expected utility. A lot of progress is due to luck, in the form of the discovery of unknown unknowns.

A basic argument in support of risks from superhuman intelligence is that we don’t know what it could possible come up with. That is also why it is called it a “Singularity“. But why does nobody ask how a superhuman intelligence knows what it could possible come up with?

It is not intelligence in and of itself that allows humans to accomplish great feats. Even people like Einstein, geniuses who were apparently able to come up with great insights on their own, were simply lucky to be born into the right circumstances, the time was ripe for great discoveries, thanks to previous discoveries of unknown unknowns.

Evolution versus Intelligence

It is argued that the mind-design space must be large if evolution could stumble upon general intelligence and that there are low-hanging fruits that are much more efficient at general intelligence than humans are, evolution simply went with the first that came along. It is further argued that evolution is not limitlessly creative, each step must increase the fitness of its host, and that therefore there are artificial mind designs that can do what no product of natural selection could accomplish.

I agree with the above, yet given all of the apparent disadvantages of the blind idiot God, evolution was able to come up with altruism, something that works two levels above the individual and one level above society. So far we haven’t been able to show such ingenuity by incorporating successes that are not evident from an individual or even societal position.

The example of altruism provides evidence that intelligence isn’t many levels above evolution. Therefore the crucial question is, how great is the performance advantage? Is it large enough to justify the conclusion that the probability of an intelligence explosion is easily larger than 1%? I don’t think so. To answer this definitively we would have to fathom the significance of the discovery (“random mutations”) of unknown unknowns in the dramatic amplification of intelligence versus the invention (goal-oriented “research and development”) of an improvement within known conceptual bounds.

Another example is flight. Artificial flight is not even close to the energy efficiency and maneuverability of birds or insects. We didn’t went straight from no artificial flight towards flight that is generally superior to the natural flight that is an effect of biological evolution.

Dragonfly

Take for example a dragonfly. Even if we were handed the design for a perfect artificial dragonfly, minus the design for the flight of a dragonfly, we wouldn’t be able to build a dragonfly that can take over the world of dragonflies, all else equal, by means of superior flight characteristics.

It is true that a Harpy Eagle can lift more than three-quarters of its body weight while the Boeing 747 Large Cargo Freighter has a maximum take-off weight of almost double its operating empty weight (I suspect that insects can do better). My whole point is that we never reached artificial flight that is strongly above the level of natural flight. An eagle can after all catch its cargo under various circumstances like the slope of a mountain or from beneath the sea, thanks to its superior maneuverability.

Humans are biased and irrational

It is obviously true that our expert systems are better than we are at their narrow range of expertise. But that expert systems are better at certain tasks does not imply that you can effectively and efficiently combine them into a coherent agency.

The noisiness of the human brain might be one of the important features that allows it to exhibit general intelligence. Yet the same noise might be the reason that each task a human can accomplish is not put into execution with maximal efficiency. An expert system that features a single stand-alone ability is able to reach the unique equilibrium for that ability. Whereas systems that have not fully relaxed to equilibrium feature the necessary characteristics that are required to exhibit general intelligence. In this sense a decrease in efficiency is a side-effect of general intelligence. If you externalize a certain ability into a coherent framework of agency, you decrease its efficiency dramatically. That is the difference between a tool and the ability of the agent that uses the tool.

In the above sense, our tendency to be biased and act irrationally might partly be a trade off between plasticity, efficiency and the necessity of goal-stability.

Embodied cognition and the environment

Another problem is that general intelligence is largely a result of an interaction between an agent and its environment. It might be in principle possible to arrive at various capabilities by means of induction, but it is only a theoretical possibility given unlimited computational resources. To achieve real world efficiency you need to rely on slow environmental feedback and make decision under uncertainty.

AIXI is often quoted as a proof of concept that it is possible for a simple algorithm to improve itself to such an extent that it could in principle reach superhuman intelligence. AIXI proves that there is a general theory of intelligence. But there is a minor problem, AIXI is as far from real world human-level general intelligence as an abstract notion of a Turing machine with an infinite tape is from a supercomputer with the computational capacity of the human brain. An abstract notion of intelligence doesn’t get you anywhere in terms of real-world general intelligence. Just as you won’t be able to upload yourself to a non-biological substrate because you showed that in some abstract sense you can simulate every physical process.

Just imagine you emulated a grown up human mind and it wanted to become a pick up artist, how would it do that with an Internet connection? It would need some sort of avatar, at least, and then wait for the environment to provide a lot of feedback.

Therefore even if we’re talking about the emulation of a grown up mind, it will be really hard to acquire some capabilities. Then how is the emulation of a human toddler going to acquire those skills? Even worse, how is some sort of abstract AGI going to do it that misses all of the hard coded capabilities of a human toddler?

Can we even attempt to imagine what is wrong about a boxed emulation of a human toddler, that makes it unable to become a master of social engineering in a very short time?

Can we imagine what is missing that would enable one of the existing expert systems to quickly evolve vastly superhuman capabilities in its narrow area of expertise? Why haven’t we seen a learning algorithm teaching itself chess intelligence starting with nothing but the rules?

In a sense an intelligent agent is similar to a stone rolling down a hill, both are moving towards a sort of equilibrium. The difference is that intelligence is following more complex trajectories as its ability to read and respond to environmental cues is vastly greater than that of a stone. Yet intelligent or not, the environment in which an agent is embedded plays a crucial role. There exist a fundamental dependency on unintelligent processes. Our environment is structured in such a way that we use information within it as an extension of our minds. The environment enables us to learn and improve our predictions by providing a testbed and a constant stream of data.

Necessary resources for an intelligence explosion

If artificial general intelligence is unable to seize the resources necessary to undergo explosive recursive self-improvement then the ability and cognitive flexibility of superhuman intelligence in and of itself, as characteristics alone, would have to be sufficient to self-modify its way up to massive superhuman intelligence within a very short time.

Without advanced real-world nanotechnology it will be considerable more difficult for an AGI to undergo quick self-improvement. It will have to make use of existing infrastructure, e.g. buy stocks of chip manufactures and get them to create more or better CPU’s. It will have to rely on puny humans for a lot of tasks. It won’t be able to create new computational substrate without the whole economy of the world supporting it. It won’t be able to create an army of robot drones overnight without it either.

Doing so it would have to make use of considerable amounts of social engineering without its creators noticing it. But, more importantly, it will have to make use of its existing intelligence to do all of that. The AGI would have to acquire new resources slowly, as it couldn’t just self-improve to come up with faster and more efficient solutions. In other words, self-improvement would demand resources. The AGI could not profit from its ability to self-improve regarding the necessary acquisition of resources to be able to self-improve in the first place.

Therefore the absence of advanced nanotechnology constitutes an immense blow to the possibility of explosive recursive self-improvement and risks from AI in general.

One might argue that an AGI will solve nanotechnology on its own and find some way to trick humans into manufacturing a molecular assembler and grant it access to it. But this might be very difficult.

There is a strong interdependence of resources and manufacturers. The AGI won’t be able to simply trick some humans to build a high-end factory to create computational substrate, let alone a molecular assembler. People will ask questions and shortly after get suspicious. Remember, it won’t be able to coordinate a world-conspiracy, it hasn’t been able to self-improve to that point yet because it is still trying to acquire enough resources, which it has to do the hard way without nanotech.

Anyhow, you’d probably need a brain the size of the moon to effectively run and coordinate a whole world of irrational humans by intercepting their communications and altering them on the fly without anyone freaking out.

People associated with the SIAI would at this point claim that if the AI can’t make use of nanotechnology it might make use of something we haven’t even thought about. But what, magic?

Artificial general intelligence, a single break-through?

Another point to consider when talking about risks from AI is how quickly the invention of artificial general intelligence will take place. What evidence do we have that there is some principle that, once discovered, allows us to grow superhuman intelligence overnight?

If the development of AGI takes place slowly, a gradual and controllable development, we might be able to learn from small-scale mistakes while having to face other risks in the meantime. This might for example be the case if intelligence can not be captured by a discrete algorithm, or is modular, and therefore never allow us to reach a point where we can suddenly build the smartest thing ever that does just extend itself indefinitely.

To me it doesn’t look like that we will come up with artificial general intelligence quickly, but rather that we will have to painstakingly optimize our expert systems step by step over long periods of times.

Paperclip maximizers

It is claimed that an artificial general intelligence might wipe us out inadvertently while undergoing explosive recursive self-improvement to more effectively pursue its terminal goals. I think that it is unlikely that most AI designs will not hold.

I agree with the argument that any AGI that isn’t made to care about humans won’t care about humans. But I also think that the same argument applies for spatio-temporal scope boundaries and resource limits. Even if the AGI is not told to hold, e.g. compute as many digits of Pi as possible, I consider it an far-fetched assumption that any AGI intrinsically cares to take over the universe as fast as possible to compute as many digits of Pi as possible. Sure, if all of that are presuppositions then it will happen, but I don’t see that most of all AGI designs are like that. Most that have the potential for superhuman intelligence, but who are given simple goals, will in my opinion just bob up and down as slowly as possible.

Complex goals need complex optimization parameters (the design specifications of the subject of the optimization process against which it will measure its success of self-improvement).

Even the creation of paperclips is a much more complex goal than telling an AI to compute as many digits of Pi as possible.

For an AGI, that was designed to design paperclips, to pose an existential risk, its creators would have to be capable enough to enable it to take over the universe on its own, yet forget, or fail to, define time, space and energy bounds as part of its optimization parameters. Therefore, given the large amount of restrictions that are inevitably part of any advanced general intelligence, the nonhazardous subset of all possible outcomes might be much larger than that where the AGI works perfectly yet fails to hold before it could wreak havoc.

Fermi paradox

The Fermi paradox does allow for and provide the only conclusions and data we can analyze that amount to empirical criticism of concepts like that of a Paperclip maximizer and general risks from superhuman AI’s with non-human values without working directly on AGI to test those hypothesis ourselves.

If you accept the premise that life is not unique and special then one other technological civilisation in the observable universe should be sufficient to leave potentially observable traces of technological tinkering.

Due to the absence of any signs of intelligence out there, especially paper-clippers burning the cosmic commons, we might conclude that unfriendly AI could not be the most dangerous existential risk that we should worry about.

Summary

In principle we could build antimatter weapons capable of destroying worlds, but in practise it is much harder to accomplish.

There are many question marks when it comes to the possibility of superhuman intelligence, and many more about the possibility of recursive self-improvement. Most of the arguments in favor of those possibilities solely derive their appeal from being vague.

Further reading


Tags: , , , ,

It seems to me that all the ways in which we disagree have more to do with philosophy (how to quantify uncertainty; how to deal with conjunctions; how to act in consideration of low probabilities) [...] we are not dealing with well-defined or -quantified probabilities. Any prediction can be rephrased so that it sounds like the product of indefinitely many conjunctions. It seems that I see the “SIAI’s work is useful scenario” as requiring the conjunction of a large number of questionable things [...]

— Holden Karnofsky, 6/24/11 (GiveWell interview with major SIAI donor Jaan Tallinn, PDF)

Disjunctive arguments

People associated with the Singularity Institute for Artificial Intelligence (SIAI) like to claim that the case for risks from AI is supported by years worth of disjunctive lines of reasoning. This basically means that there are many reasons to believe that humanity is likely to be wiped out as a result of artificial general intelligence. More precisely it means that not all of the arguments supporting that possibility need to be true, even if all but one are false risks from AI are to be taken seriously.

The idea of disjunctive arguments is formalized by what is called a logical disjunction. Consider two declarative sentences, A and B. The truth of the conclusion (or output) that follows from the sentences A and B does depend on the truth of A and B. In the case of a logical disjunction the conclusion of A and B is only false if both A and B are false, otherwise it is true. Truth values are usually denoted by 0 for false and 1 for true. A disjunction of declarative sentences is denoted by OR or ∨ as an infix operator. For example, (A(0)∨B(1))(1), or in other words, if statement A is false and B is true then what follows is still true because statement B is sufficient to preserve the truth of the overall conclusion.

Generally there is no problem with disjunctive lines of reasoning as long as the conclusion itself is sound and therefore in principle possible yet in demand of at least one of several causative factors to become actual. I don’t perceive this to be the case for risks from AI. I agree that there are many ways in which artificial general intelligence (AGI) could be dangerous, but only if I accept several presuppositions regarding AGI that I actually dispute.

By presuppositions I mean requirements that need to be true simultaneously (in conjunction). A logical conjunction is only true if all of its operands are true. In other words, the a conclusion might require all of the arguments leading up to it to be true, otherwise it is false. A conjunction is denoted by AND or ∧.

Now consider the following prediction: <Mary is going to buy one of thousands of products in the supermarket.>

The above prediction can be framed as a disjunction: Mary is going to buy one of thousands of products in the supermarket, 1.) if she is hungry 2.) if she is thirsty 3.) if she needs a new coffee machine. Only one of the 3 given possible arguments need to be true in order to leave the overall conclusion to be true, that Mary is going shopping. Or so it seems.

The same prediction can be framed as a conjunction: Mary is going to buy one of thousands of products in the supermarket 1.) if she has money 2.) if she has some needs 3.) if the supermarket is open. All of the 3 given factors need to be true in order to render the overall conclusion to be true.

That a prediction is framed to be disjunctive does not speak in favor of the possibility in and of itself. I agree that it is likely that Mary is going to visit the supermarket if I accept the hidden presuppositions. But a prediction is only at most as probable as its basic requirements. In this particular case I don’t even know if Mary is a human or a dog, a factor that can influence the probability of the prediction dramatically.

The same is true for risks from AI. The basic argument in favor of risks from AI is that of an intelligence explosion, that intelligence can be applied to itself in an iterative process leading to ever greater levels of intelligence. In short, artificial general intelligence will undergo explosive recursive self-improvement.

Hidden complexity

Explosive recursive self-improvement is one of the presuppositions for the possibility of risks from AI. The problem is that this and other presuppositions are largely ignored and left undefined. All of the disjunctive arguments put forth by the SIAI are trying to show that there are many causative factors that will result in the development of unfriendly artificial general intelligence. Only one of those factors needs to be true for us to be wiped out by AGI. But the whole scenario is at most as probable as the assumption hidden in the words <artificial general intelligence> and <explosive recursive self-improvement>.

<Artificial General Intelligence> and <Explosive Recursive Self-improvement> might appear to be relatively simple and appealing concepts. But most of this superficial simplicity is a result of the vagueness of natural language descriptions. Reducing the vagueness of those concepts by being more specific, or by coming up with technical definitions of each of the words they are made up of, reveals the hidden complexity that is comprised in the vagueness of the terms.

If we were going to define those concepts and each of its terms we would end up with a lot of additional concepts made up of other words or terms. Most of those additional concepts will demand explanations of their own made up of further speculations. If we are precise then any declarative sentence (P#) (all of the terms) used in the final description will have to be true simultaneously (P#∧P#). And this does reveal the true complexity of all hidden presuppositions and thereby influence the overall probability, P(risks from AI) = P(P1∧P2∧P3∧P4∧P5∧P6∧…). That is because the conclusion of an argument that is made up of a lot of statements (terms) that can be false is more unlikely to be true since complex arguments can fail in a lot of different ways. You need to support each part of the argument that can be true or false and you can therefore fail to support one or more of its parts, which in turn will render the overall conclusion false.

To summarize: If we tried to pin down a concept like <Explosive Recursive Self-Improvement> we would end up with requirements that are strongly conjunctive.

Making numerical probability estimates

But even if the SIAI was going to thoroughly define those concepts, there is still more to the probability of risks from AI than the underlying presuppositions and causative factors. We also have to integrate our uncertainty about the very methods we used to come up with those concepts, definitions and our ability to make correct predictions about the future and integrate all of it into our overall probability estimates.

Take for example the following contrived quote:

We have to take over the universe to save it by making the seed of an artificial general intelligence, that is undergoing explosive recursive self-improvement, extrapolate the coherent volition of humanity, while acausally trading with other superhuman intelligences across the multiverse.

Although contrived, the above quote does only comprise actual beliefs hold by people associated with the SIAI. All of those beliefs might seem somewhat plausible inferences and logical implications of speculations and state of the art or bleeding edge knowledge of various fields. But should we base real-life decisions on those ideas, should we take those ideas seriously? Should we take into account conclusions whose truth value does depend on the conjunction of those ideas? And is it wise to make further inferences on those speculations?

Let’s take a closer look at the necessary top-level presuppositions to take the above quote seriously:

  1. The many-worlds interpretation
  2. Belief in the Implied Invisible
  3. Timeless Decision theory
  4. Intelligence explosion

1: Within the lesswrong/SIAI community the many-worlds interpretation of quantum mechanics is proclaimed to be the rational choice of all available interpretations. How to arrive at this conclusion is supposedly also a good exercise in refining the art of rationality.

2: P(Y|X) ≈ 1, then P(X∧Y) ≈ P(X)

In other words, logical implications do not have to pay rent in future anticipations.

3: “Decision theory is the study of principles and algorithms for making correct decisions—that is, decisions that allow an agent to achieve better outcomes with respect to its goals.”

4: “Intelligence explosion is the idea of a positive feedback loop in which an intelligence is making itself smarter, thus getting better at making itself even smarter. A strong version of this idea suggests that once the positive feedback starts to play a role, it will lead to a dramatic leap in capability very quickly.”

To be able to take the above quote seriously you have to assign a non-negligible probability to the truth of the conjunction of #1,2,3,4, 1∧2∧3∧4. Here the question is not not only if our results are sound but if the very methods we used to come up with those results are sufficiently trustworthy. Because any extraordinary conclusions that are implied by the conjunction of various beliefs might outweigh the benefit of each belief if the overall conclusion is just slightly wrong.

Not enough empirical evidence

Don’t get me wrong, I think that there sure are convincing arguments in favor of risks from AI. But do arguments suffice? Nobody is an expert when it comes to intelligence. My problem is that I fear that some convincing blog posts written in natural language are simply not enough.

Just imagine that all there was to climate change was someone who never studied the climate but instead wrote some essays about how it might be physical possible for humans to cause a global warming. If the same person then goes on to make further inferences based on the implications of those speculations, am I going to tell everyone to stop emitting CO2 because of that? Hardly!

Or imagine that all there was to the possibility of asteroid strikes was someone who argued that there might be big chunks of rocks out there which might fall down on our heads and kill us all, inductively based on the fact that the Earth and the moon are also a big rocks. Would I be willing to launch a billion dollar asteroid deflection program solely based on such speculations? I don’t think so.

Luckily, in both cases, we got a lot more than some convincing arguments in support of those risks.

Another example: If there were no studies about the safety of high energy physics experiments then I might assign a 20% chance of a powerful particle accelerator destroying the universe based on some convincing arguments put forth on a blog by someone who never studied high energy physics. We know that such an estimate would be wrong by many orders of magnitude. Yet the reason for being wrong would largely be a result of my inability to make correct probability estimates, the result of vagueness or a failure of the methods I employed to come up with those estimates. The reason for being wrong by many orders of magnitude would have nothing to do with the arguments in favor of the risks, as they might very well be sound given my epistemic sate and the prevalent uncertainty.

I believe that mere arguments in favor of one risk do not suffice to neglect other risks that are supported by other kinds of evidence. I believe that logical implications of sound arguments should not reach out indefinitely and thereby outweigh other risks whose implications are fortified by empirical evidence. Sound arguments, predictions, speculations and their logical implications are enough to demand further attention and research, but not much more.

Logical implications

Artificial general intelligence is already an inference made from what we currently believe to be true, going a step further and drawing further inferences from previous speculations, e.g. explosive recursive self-improvement, is in my opinion a very shaky business.

What would happen if we were going to let logical implications of vast utilities outweigh other concrete near-term problems that are based on empirical evidence? Insignificant inferences might exhibit hyperbolic growth in utility: 1.) There is no minimum amount of empirical evidence necessary to extrapolate the expected utility of an outcome. 2.) The extrapolation of counterfactual alternatives is unbounded, logical implications can reach out indefinitely without ever requiring new empirical evidence.

Hidden disagreement

All of the above hints at a general problem that is the reason for why I think that discussions between people associated with the SIAI, its critics and those who try to evaluate the SIAI, won’t lead anywhere. Those discussions miss the underlying reason for most of the superficial disagreement about risks from AI, namely that there is no disagreement about risks from AI in and of itself.

There are a few people who disagree about the possibility of AGI in general, but I don’t want to touch on that subject in this post. I am trying to highlight the disagreement between the SIAI and people who accept the notion of artificial general intelligence. With regard to those who are not skeptical of AGI the problem becomes more obvious when you turn your attention to people like John Baez organisations like GiveWell. Most people would sooner question their grasp of “rationality” than give five dollars to a charity that tries to mitigate risks from AI because their calculations claim it was “rational” (those who have read the article by Eliezer Yudkowsky on Pascal’s Mugging know that I used a statement from that post and slightly rephrased it). The disagreement all comes down to a general averseness to options that have a low probability of being factual, even given that the stakes are high.

Nobody is so far able to beat arguments that bear resemblance to Pascal’s Mugging. At least not by showing that it is irrational to give in from the perspective of a utility maximizer. One can only reject it based on a strong gut feeling that something is wrong. And I think that is what many people are unknowingly doing when they argue against the SIAI or risks from AI. They are signaling that they are unable to take such risks into account. What most people mean when they doubt the reputation of people who claim that risks from AI need to be taken seriously, or who say that AGI might be far off, what those people mean is that risks from AI are too vague to be taken into account at this point, that nobody knows enough to make predictions about the topic right now.

When GiveWell, a charity evaluation service, interviewed the SIAI (PDF), they hinted at the possibility that one could consider the SIAI to be a sort of Pascal’s Mugging:

GiveWell: OK. Well that’s where I stand – I accept a lot of the controversial premises of your mission, but I’m a pretty long way from sold that you have the right team or the right approach. Now some have argued to me that I don’t need to be sold – that even at an infinitesimal probability of success, your project is worthwhile. I see that as a Pascal’s Mugging and don’t accept it; I wouldn’t endorse your project unless it passed the basic hurdles of credibility and workable approach as well as potentially astronomically beneficial goal.

This shows that lot of people do not doubt the possibility of risks from AI but are simply not sure if they should really concentrate their efforts on such vague possibilities.

Technically, from the standpoint of maximizing expected utility, given the absence of other existential risks, the answer might very well be yes. But even though we believe to understand this technical viewpoint of rationality very well in principle, it does also lead to problems such as Pascal’s Mugging. But it doesn’t take a true Pascal’s Mugging scenario to make people feel deeply uncomfortable with what Bayes’ Theorem, the expected utility formula, and Solomonoff induction seem to suggest one should do.

Again, we currently have no rational way to reject arguments that are framed as predictions of worst case scenarios that need to be taken seriously even given a low probability of their occurrence due to the scale of negative consequences associated with them. Many people are nonetheless reluctant to accept this line of reasoning without further evidence supporting the strong claims and request for money made by organisations such as the SIAI.

Here is what mathematician and climate activist John Baez has to say:

Of course, anyone associated with Less Wrong would ask if I’m really maximizing expected utility. Couldn’t a contribution to some place like the Singularity Institute of Artificial Intelligence, despite a lower chance of doing good, actually have a chance to do so much more good that it’d pay to send the cash there instead?

And I’d have to say:

1) Yes, there probably are such places, but it would take me a while to find the one that I trusted, and I haven’t put in the work. When you’re risk-averse and limited in the time you have to make decisions, you tend to put off weighing options that have a very low chance of success but a very high return if they succeed. This is sensible so I don’t feel bad about it.

2) Just to amplify point 1) a bit: you shouldn’t always maximize expected utility if you only live once. Expected values — in other words, averages — are very important when you make the same small bet over and over again. When the stakes get higher and you aren’t in a position to repeat the bet over and over, it may be wise to be risk averse.

3) If you let me put the $100,000 into my retirement account instead of a charity, that’s what I’d do, and I wouldn’t even feel guilty about it. I actually think that the increased security would free me up to do more risky but potentially very good things!

All this shows that there seems to be a fundamental problem with the formalized version of rationality. The problem might be human nature itself, that some people are unable to accept what they should do if they want to maximize their expected utility. Or we are missing something else and our theories are flawed. Either way, to solve this problem we need to research those issues and thereby increase the confidence in the very methods used to decide what to do about risks from AI, or to increase the confidence in risks from AI directly, enough to make it look like a sensible option, a concrete and discernable problem that needs to be solved.

Many people perceive the whole world to be at stake, either due to climate change, war or engineered pathogens. Telling them about something like risks from AI, even though nobody seems to have any idea about the nature of intelligence, let alone general intelligence or the possibility of recursive self-improvement, seems like just another problem, one that is too vague to outweigh all the other risks. Most people feel like having a gun pointed to their heads, telling them about superhuman monsters that might turn them into paperclips then needs some really good arguments to outweigh the combined risk of all other problems.

But there are many other problems with risks from AI. To give a hint at just one example: if there was a risk that might kill us with a probability of .7 and another risk with .1 while our chance to solve the first one was .0001 and the second one .1, which one should we focus on? In other words, our decision to mitigate a certain risk should not only be focused on the probability of its occurence but also on the probability of success in solving it. But as I have written above I believe that the most pressing issue is to increase the confidence into making decisions under extreme uncertainty or to reduce the uncerainty itself.


Tags: , , , , ,

The term “Mind Projection Fallacy” was coined by the late great Bayesian Master, E. T. Jaynes, as part of his long and hard-fought battle against the accursed frequentists.  Jaynes was of the opinion that probabilities were in the mind, not in the environment – that probabilities express ignorance, states of partial information; and if I am ignorant of a phenomenon, that is a fact about my state of mind, not a fact about the phenomenon.

I remember (dimly, as human memories go) the first time I self-identified as a “Bayesian”. Someone had just asked a malformed version of an old probability puzzle…

You’ve probably seen the word ‘Bayesian’ used a lot on this site, but may be a bit uncertain of what exactly we mean by that.

Bayes’ theorem was the subject of a detailed article. The essay is good, but over 15,000 words long — here’s the condensed version for Bayesian newcomers like myself.

Bayes’ Theorem for the curious and bewildered; an excruciatingly gentle introduction.

Eliezer Yudkowsky’s T-Shirt

This post is elementary: it introduces a simple method of visualizing Bayesian calculations. In my defense, we’ve had other elementary posts before, and they’ve been found useful; plus, I’d really like this to be online somewhere, and it might as well be here.

Everyday use of a mathematical concept.

I recently came up with what I think is an intuitive way to explain Bayes’ Theorem…

Bayes' theorem

A law of probability that describes the proper way to incorporate new evidence into prior probabilities to form an updated probability estimate. Bayesian rationality takes its name from this theorem, as it is regarded as the foundation of consistent rational reasoning under uncertainty. A.k.a. “Bayes’s Theorem” or “Bayes’s Rule”.

Eliezer Yudkowsky is on bloggingheads.tv with the statistician Andrew Gelman.

Several different points of fascination about Bayes…

When looking further, there is however a whole crowd on the blogs that seems to see more in Bayes’s theorem than a mere probability inversion…

Bayesian statistics is a system for describing epistemological uncertainty using the mathematical language of probability.
Bayesian probability is one of the most popular interpretations of the concept of probability.

Edwin T. Jaynes was one of the first people to realize that probability theory, as originated by Laplace, is a generalization of Aristotelian logic that reduces to deductive logic in the special case that our hypotheses are either true or false. This web site has been established to help promote this interpretation of probability theory by distributing articles, books and related material. As Ed Jaynes originated this interpretation of probability theory we have a large selection of his articles, as well as articles by a number of other people who use probability theory in this way…

Bayesian statistics is so closely linked with induction that one often hears it called “Bayesian induction.” What could be more inductive than taking a prior, gathering data, updating the prior with Bayes Law, and limiting to the true distribution of some parameter?

Gelman (of the popular statistics blog) and Shalizi point that, in practice, Bayesian statistics should actually be seen as Popper-style hypothesis-based deduction. The problem is intricately linked to the “taking a prior” above.

Or, how to recognize Bayes’ theorem when you meet one making small talk at a cocktail party.

Still, I’m sure Blogger won’t mind me using their resources instead. The basic idea is that there’s a distinction between true values x and measured values y. You start off with a prior probability distribution over the true values. You then have a likelihood function, which gives you the probability P(y|x) of measuring any value y given a hypothetical true value x.

In other words, What is so special about starting with a human-generated hypothesis? Bayesian methods suggest what I think is the right answer: To get from probabilistic evidence to the probability of something requires combining the evidence with a prior expectation, a “prior probability”, and human hypothesis generation enables this requirement to be ignored with considerable practical success.

Andrew Gelman recently responded to a commenter on the Yudkowsky/Gelman diavlog; the commenter complained that Bayesian statistics were too subjective and lacked rigor.  I shall explain why this is unbelievably ironic…

Maybe this kind of Bayesian method for “proving the null” could be used to achieve a better balance.

Bayesian brain is a term that is used to refer to the ability of the nervous system to operate in situations of uncertainty in a fashion that is close to the optimal prescribed by Bayesian statistics.


—————————————

P.S.

Expect this link collection to be permanently updated.

Please post a comment if you have something to add.


Tags: , , , , , , , , , ,

Get Adobe Flash playerPlugin by wpburn.com wordpress themes