Consider an agent A which assumes itself to make only correct decisions. Here an arbitrary decision is denoted d and correct is denoted C, where Cd is defined to be any decision (respectively set of decisions) maximizing expected utility according to an agent’s utility-function U. Therefore A assumes CA, where CA is the set of all decisions that A is capable of deciding that also belong to the set of all correct decisions Cd (∀d ∈ CA, d ∈ Cd).

Let one possible decision k be defined as ¬Cd, that is, k := ¬Cd, where ¬Cd is true if decision d does not maximize the expected utility of agent A.

If A ever decides k then this will falsify its assumption that it only makes correct decisions (CA) and hence prove itself to be incorrect (¬CA). But since A assumes itself to make only correct decisions it believes that it will never decide k. Therefore CA iff ¬k. Substituting ¬Cd for k yields CA iff (¬¬Cd iff Cd) (A is correct if and only if its decisions are correct).

Now assume that A decides k anyway (e.g. a cosmic ray causes a malfunction in its decision module). Since A assumes CA it follows that k must have been a correct decision (k → Ck). Substituting ¬Cd for k yields ¬Cd → C¬Cd, which is a contradiction, and in turn implies ¬CA (A is incorrect).



In this post I try to fathom an informal definition of Self, the “essential qualities that constitute a person’s uniqueness”. I assume that the most important requirement for a definition of self is time-consistency. A reliable definition of identity needs to allow for time-consistent self-referencing, since any agent that is unable to identify itself over time will be prone to make inconsistent decisions.

Data Loss

Obviously most humans don’t want to die, but what does that mean? What is it that humans try to preserve when they sign up for Cryonics? It seems that an explanation must account and allow for some sort of data loss.

The Continuity of Consciousness

It can’t be about the continuity of consciousness as we would have to refuse general anesthesia due to the risk of “dying” and most of us will agree that there is something more important than the continuity of consciousness that makes us accept a general anesthesia when necessary.

Computation

If the continuity of consciousness isn’t the most important detail about the self then it very likely isn’t the continuity of computation either. Imagine that for some reason the process evoked when “we” act on our inputs under the control of an algorithm halts for a second and then continues otherwise unaffected. Would we stop caring about staying alive ever after because we “died” when the computation halted? This doesn’t seem to be the case.

Static Algorithmic Descriptions

Although we are not partly software and partly hardware, we could, in theory, come up with an algorithmic description of the human machine, of our selves. Might it be that algorithm that we care about? If we were to digitize our self we would end up with a description of our spatial parts, our self at a certain time. Yet we forget that all of us already possess such an algorithmic description of our selves, and we’re already able to back it up. It is our DNA.

Temporal Parts

Admittedly our DNA is the earliest version of our selves, but if we don’t care about the temporal parts of our selves but only about a static algorithmic description of a certain spatiotemporal position, then what’s wrong with that? A lot, it seems: we stop caring about past reifications of our selves, at some point our backups become obsolete, and having to fall back on them would equal death. But what is it that we lost, what information is it that we value more than all of the previously mentioned possibilities? One might think that it must be our memories, the data that represents what we learnt and experienced. But even if this is the case, would it be a reasonable choice?

Identity and Memory

Let’s just disregard the possibility that we often might not value our future selves, and likewise do not value our past selves, because we lost or gained important information in between, e.g. if we became religious or have been able to overcome religion.

If we had perfect memory and only ever improved upon our past knowledge and experiences we wouldn’t be able to do so for very long, at least not given our human body. The upper limit on the information that can be contained within a human body is 2.5072178×10^38 megabytes, if it were used as a perfect data store. Given that we gather much more than 1 megabyte of information per year, it is foreseeable that if we equate our memories with our self we’ll die long before the heat death of the universe. We might overcome this by growing in size, by achieving a posthuman form, yet if we in turn also become much smarter we’ll also produce and gather more information. We are not alone either and the resources are limited. One way or the other we’ll die rather quickly.
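The arithmetic behind this claim can be sketched in a few lines. The capacity figure is the one quoted above; the heat-death timescale of roughly 10^100 years and the higher gathering rates are assumptions added purely for illustration, not figures from this post.

```python
# Capacity figure taken from the text above, in megabytes.
CAPACITY_MB = 2.5072178e38

# Assumption: a commonly quoted rough order of magnitude, not from the post.
HEAT_DEATH_YEARS = 1e100


def years_until_full(rate_mb_per_year: float) -> float:
    """Years until a perfect store of CAPACITY_MB fills at a constant rate."""
    return CAPACITY_MB / rate_mb_per_year


# Hypothetical gathering rates in megabytes per year.
for rate in (1.0, 1e6, 1e12):
    years = years_until_full(rate)
    fraction = years / HEAT_DEATH_YEARS
    print(f"{rate:.0e} MB/year: full after {years:.3e} years "
          f"({fraction:.1e} of the assumed heat-death timescale)")
```

Even at the unrealistically low rate of 1 megabyte per year the store fills after about 2.5×10^38 years, which is a vanishing fraction of the assumed timescale.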

Does this mean we shouldn’t even bother about the far future or is there maybe something else we value even more than our memories? After all we don’t really mind much if we forget what we have done a few years ago.

Time-Consistency and Self-Reference

It seems that there is something even more important than our causal history. I think that more than anything we care about our values and goals. Indeed, we value the preservation of our values. As long as we want the same, we are the same. Our goal system seems to be the critical part of our implicit definition of self, that which we want to protect and preserve. Our values and goals seem to be the missing temporal parts that allow us to consistently refer to ourselves, to identify our selves at different spatiotemporal positions.

Using our values and goals as identifiers also resolves the problem of how we should treat copies of our self that feature alternative histories and memories, copies with different causal histories. Any agent that features a copy of our utility function ought to be incorporated into our decisions as an instance, as a reification of our selves. We should identify with our utility function regardless of its instantiation.

Stable Utility-Functions

To recapitulate, we can value our memories, the continuity of experience and even our DNA, but the only reliable marker for the self identity of goal-oriented agents seems to be a stable utility function. Rational agents with an identical utility function will to some extent converge to exhibit similar behavior and are therefore able to cooperate. We can more consistently identify with our values and goals than with our past and future memories, digitized backups or causal history.

But even if this is true there is one problem: humans might not exhibit goal-stability.



Morality is an objective property of a system that consists of a person that utters moral statements and the specific entity in, or feature of, the world that the statement identifies or denotes. Yet morality can be explained in terms of lower-level interactions. This is not a contradiction: systems can have properties that their parts alone do not.

Ethical statements

Let’s take a look at two ethical statements:

  1. It is morally wrong for Alice to lie to Bob.
  2. It is morally wrong for Bob to strangle Alice.

What do people really mean when they utter those statements? Let’s try to pin down the underlying reasons and motivations of the first statement by paraphrasing it:

1: Due to my genetically hard-coded intuitions about appropriate behavior within groups of primates, my upbringing, cultural influences, rational knowledge about the virtues of truth-telling and preferences involving the well-being of other people, I feel obliged to influence the intercourse between Alice and Bob in a way that persuades Alice to do what I want, without feeling inappropriately influenced by me, by signaling my objection to certain behaviors as an appeal to the order of higher authority.

But what is meant by an appeal to the order of “higher authority”? To make this more clear, let’s now take a look at a chat between hypothetical Bob and myself:

Alexander: I don’t want you to strangle Alice.

Bob: I don’t care what you want!

Alexander: Strangling Alice might have detrimental effects on your other preferences.

Bob: So? I don’t care, I assign infinite utility to world-states where Alice is dead!

Alexander:  But it is morally wrong to strangle Alice.

Bob: Hmm…I think you are right, I don’t want to be immoral!

What happened here? I have been trying to convince Bob not to kill Alice. In other words, I tried to get Bob to do what I want. I used three different methods:

  1. Accounting for third-party preferences.
  2. Weighing one preference against all other preferences.
  3. Evoking guilt.

Explanatory remarks to methods 1-3:

1: Primates don’t like to be readily controlled by other primates. To get them to do what you want you have to make them believe that they actually want to do it themselves.

2: Humans who are in a temporary rage often discount all long-term consequences of their decisions. To be persuasive it might take some subtle, non-obvious incentive.

3: Using moral language is really a form of coercive persuasion. Since when I say, “It is morally wrong to strangle Alice.”, I actually signal, “If you strangle Alice you will feel guilty.” It is a manipulative method that subtly influences Bob to say, “You are right, I don’t want to be immoral!”, when what he actually means is, “I don’t want to feel guilty!”

Method #3 works by making use of various cultural and otherwise present connotations carried by the label “morally wrong”, primarily by evoking negative emotions and the prospect of a loss of social reputation. The difference from methods #1 and #2 is that #3 derives its authority from a complex (obscure) interrelationship of evolutionary, emotional, environmental and cultural factors. While method #1 asks Bob to be altruistic and #2 asks him to be selfish, method #3 posits a fuzzy imperative.

Further reading

‘Moral Ontology’ by Richard Carrier

Pluralistic Moral Reductionism

Trivers on Self-Deception

Ego syntonic thoughts and values

The limits of introspection

Homo Hypocritus Signals



What if I’m wrong?

Sebastian Marshall asks, “What if I’m wrong?”:

What if you were really wrong? Like, not just the wrong course of action, but what if your whole idea of the setup and cause and effect and payoffs and long term consequences of your actions were flawed? What if you made a serious mistake somewhere in your evaluations, and you were going to get the opposite result of what you wanted? What if you got a horrific result?

[...]

What if your safe job is actually a trap?

What if your favorite food is making you fat and diabetic and killing you?

What if you’re slowly killing the person you’re trying to save? What if they’re slowly killing you?

What if getting your preferred politics turned your society and culture into an apocalyptic wasteland?

What if your favorite leisure activity is wrecking your mind, making you stupid, and holding you back from heights you can’t even imagine from where you’re at?

What if being “ultra-hardcore” at the gym is likely to cause injury and destroy your strength, flexibility, and health? What if resting more actually produced larger, safer gains?

The satisfaction of needs

Becoming less wrong is just one of your preferences and needs. As a human being you need to acknowledge and account for all your preferences and needs.

“What if I’m wrong?”

You have to draw the line where asking that question once more would keep you from ever asking it again. In other words, if you notice that you need to eat, drink or sleep then stop asking the question, because otherwise you won’t be able to ask it anymore. This also goes for pleasure and leisure: if you feel unhappy about not being able to play that new game then go play it until you feel satisfied. If you don’t, if you skip the game or that movie and keep asking yourself whether it is worth it, whether it might be the wrong choice, then your unhappiness might turn into depression, which in turn will make you reluctant or unable to ask that question anymore.

You can only do your best

What if I’m wrong about the above? I can only do my best.

Whatever intelligence is, it can’t be intelligent all the way down. It’s just dumb stuff at the bottom.
Andy Clark

We are fundamentally dependent on unintelligent processes and naive introspection. We do not plan when and how to think. We rely on an unconscious hierarchical decision procedure that decides to filter out most sensory data. Only what is deemed “important”, what is above a certain threshold, is forwarded far enough to reach conscious reflection. It would be stupid to allocate resources equally.

I, my brain and body, might be wrong to conclude that I need sleep. But I am not thinking about that possibility, not only because I’m a computationally bounded agent but also because thinking in and of itself is an activity that I might be wrong about, just like sleeping. All in all, everything taken into account, sleeping simply turned out to have the most weight right now.

But what if there are monsters under the bed? Then either I survive, learn from that incident and assign enough weight to the possibility of monsters hiding under my bed as to take it into account the next time, or I die and only those agents who “naturally” allocate enough resources to fighting monsters, before going to bed, will survive.

We can only do our best, which includes the allocation of resources to preemptive measures against black swan events.



One of the fundamental premises on lesswrong.com is that a universal computing device can simulate every physical process and that we therefore should be able to reverse engineer the human brain as it is fundamentally computable. That is, intelligence and consciousness are substrate-neutral.

Substrate neutrality (not to be confused with substrate independence) is widely accepted to be factual, even self-evident, within computer science and transhumanist circles (i.e. the general science fiction, early adopter, programmer, technophile, nerd crowd). But this isn’t necessarily the case within the academic philosophy camp, which often leads to a lot of confusion and mutual disrespect.

Although I can’t claim to understand either party, in this post I will attempt to rephrase the opinion held by some philosophers.

The Great Singularity Debate

The video discussion below gives a rough overview of the opinions held at the extreme ends of the spectrum and the subsequent confusion that arises when they clash.

The Singularity and the outer limits of physical possibility (08:38)
Do human brains run software? (09:58)
Consciousness, intelligence, and computation (03:14)
What could minds be made of? (13:08)
Is mind-uploading a dualist dream? (19:18)
Would the Singularity be a Vonnegut-style catastrophe? (10:56)

Simulated Gold

Let’s assume that we wanted to simulate gold, what does that mean?

If we were going to simulate a representation of the chemical properties of gold on a computer, would we be able to trade it on the gold market, establish a gold reserve or use it to create jewellery? Obviously not, but why? Some important characteristics seem to be missing. We do not assign the same value to a representation of gold that we assign to gold itself.

What would it take to simulate the missing properties? A particle accelerator or nuclear reactor:

The artificial production of gold is the age-old dream of the alchemists. It is possible in particle accelerators or nuclear reactors, although the production cost is currently many times the market price of gold. Since there is only one stable gold isotope, 197Au, nuclear reactions must create this isotope in order to produce usable gold.

That we know every physical fact about gold doesn’t make us own any gold.

Consequently, we need to reproduce gold to get gold; no simulation short of creating the actual, physically identical substance will do the job.

Emulations represent, they do not reproduce

  • Emulations only exhibit emulated behavior.
  • Emulations only exhibit a representation of the behavior of the physical systems they are emulating.
  • Emulations are only able to emulate analogous behavior of physical systems given an equally emulated environment.

Imagine 3 black boxes, each of them containing a quantum-level emulation of some existing physical system. Two boxes contain the emulations of two different human beings and one box the emulation of an environment.

Assume that if we were to connect all 3 black boxes and observe the behavior of the two humans and their interactions we would be able to verify that the behavior of the humans, including their utterances, would equal that of the originals.

If one was to disconnect one of the black boxes containing the emulation of a human and store it within the original physical environment, replacing one original human being while retaining the other original human being, the new system would not exhibit the same behavior as either the system of black boxes or the genuinely physical system.

A compound system made up of black boxes containing emulations of physical objects together with genuinely physical objects equals neither a system made up solely of black boxes nor a system made up of the original physical objects alone.

The representations of the original physical systems that are being emulated within the black boxes are one level removed from the originals. A composition of those levels will exhibit a different interrelationship.

Once we enable the black box to interact with the higher level in which it resides, the compound system made up of the black box, the original environment and the human being (representation-level ++ physical-level ++ physical-level) will approach the behavior exhibited by a fully emulated system (representation-level ++ representation-level ++ representation-level) and by the original physical system (physical-level ++ physical-level ++ physical-level).

How do we make a compound system made up of representations and originals approach the behavior of the original physical system?

We could equip the black box with sensors and loudspeakers, yet it will not exhibit the same behavior. We could further equip it with an avatar. Still, the original and the emulated human will treat an avatar differently than they would treat another original or another emulated human, respectively. We could give it a robot body. The behavior will still equal neither the behavior that the original physical system would exhibit nor the behavior that would be exhibited in a system made up of emulations.

We may continue to tweak what was once the black box containing an emulation of a human being. But as we approach a system that will exhibit the same behavior as the original system we are slowly reproducing the original human being, we are turning the representation into a reproduction.

Conclusion

What many philosophers seem to be thinking is that the nature of “fire” cannot be captured by an equation. The basic disagreement seems to be that a representation is distinct from a reproduction, that there is a crucial distinction between software and hardware.

For computer scientists the difference between a mechanical device, a physical object and software is that the latter is the symbolic (formal language) representation of the former. Software is just the static description of the dynamic state sequence exhibited by an object. One can then use that software (algorithm) and some sort of computational hardware and evoke the same dynamic state sequence so that the machine (computer) mimics the relevant characteristics of the original object.

Philosophers seem to agree about the difference between a physical thing and its mathematical representation but they don’t agree that we can represent the most important characteristic as long as we do not reproduce the physical substrate. This position is probably best represented by the painting La trahison des images. It is a painting of a pipe. It represents a pipe but it is not a pipe, it is an image of a pipe.

Why would people concerned with artificial intelligence care about all this? That depends on the importance and nature of consciousness, and on the extent to which general intelligence depends on the brain as a biological substrate and its properties (e.g. the chemical properties of carbon versus silicon).



Problems in Ethics

— Louis CK, on why his life is evil because he drives an Infiniti while people are dying (comedy).

If EDR were accepted, speculations about infinite scenarios, however unlikely and far‐fetched, would come to dominate our ethical deliberations. We might become extremely concerned with bizarre possibilities in which, for example, some kind of deity exists that will use its infinite powers to good or bad ends depending on what we do. No matter how fantastical any such scenario would be, if it is a logically coherent and imaginable possibility it should presumably be assigned a finite positive probability, and according to EDR, the smallest possibility of infinite value would smother all other considerations of mere finite values.

[...]

Suppose that I know that a certain course of action, though much less desirable in every other respect than an available alternative, offers a one‐in‐a‐million chance of avoiding catastrophe involving x people, where x is finite. Whatever else is at stake, this possibility will overwhelm my calculations so long as x is large enough. Even in the finite case, therefore, we might fear that speculations about low‐probability‐high‐stakes scenarios will come to dominate our moral decision making if we follow aggregative consequentialism.

The Infinitarian Challenge to Aggregative Ethics

In Derek Parfit’s original formulation the Repugnant Conclusion is characterized as follows: “For any possible population of at least ten billion people, all with a very high quality of life, there must be some much larger imaginable population whose existence, if other things are equal, would be better even though its members have lives that are barely worth living” (Parfit 1984). The Repugnant Conclusion highlights a problem in an area of ethics which has become known as population ethics. The last three decades have witnessed an increasing philosophical interest in questions such as “Is it possible to make the world a better place by creating additional happy people?” and “Is there a moral obligation to have children?” The main problem has been to find an adequate theory about the moral value of states of affairs where the number of people, the quality of their lives, and their identities may vary. Since, arguably, any reasonable moral theory has to take these aspects of possible states of affairs into account when determining the normative status of actions, the study of population ethics is of general import for moral theory. As the name indicates, Parfit finds the Repugnant Conclusion unacceptable and many philosophers agree. However, it has been surprisingly difficult to find a theory that avoids the Repugnant Conclusion without implying other equally counterintuitive conclusions. Thus, the question as to how the Repugnant Conclusion should be dealt with and, more generally, what it shows about the nature of ethics has turned the conclusion into one of the cardinal challenges of modern ethics.

The Repugnant Conclusion (Wikipedia: Mere addition paradox)

The utility monster is a thought experiment in the study of ethics. It was created by philosopher Robert Nozick in 1974 as a criticism of utilitarianism.

In the thought experiment, a hypothetical being is proposed who receives as much or more utility from each additional unit of a resource he consumes as the first unit he consumes. In other words, the utility monster is not subject to diminishing marginal returns with regard to utility, but instead experiences constant marginal returns, or even increasing marginal returns.

Since ordinary people receive less utility with each additional unit consumed, if the utility monster existed, it would justify the mistreatment and perhaps annihilation of everyone else, according to the doctrine of utilitarianism.

Utility monster
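A minimal sketch of the allocation logic described in the excerpt, under stated assumptions: the square-root utility for ordinary people and the constant return of 0.5 per unit for the monster are hypothetical choices made for illustration, as are all function names. Units are handed out one at a time to whoever gains the most, which maximizes total utility when every utility function is concave or linear.

```python
import math


def ordinary_utility(units: int) -> float:
    # Assumed concave utility: each extra unit adds less than the previous one.
    return math.sqrt(units)


def monster_utility(units: int) -> float:
    # Assumed linear utility: a constant marginal return of 0.5 per unit.
    return 0.5 * units


def allocate(total_units: int, ordinary_people: int) -> list[int]:
    """Hand out units one by one to whoever gains the most utility.

    Index 0 of the result is the monster, the rest are ordinary people.
    """
    utilities = [monster_utility] + [ordinary_utility] * ordinary_people
    holdings = [0] * len(utilities)
    for _ in range(total_units):
        # Marginal gain of one more unit for each recipient.
        gains = [u(h + 1) - u(h) for u, h in zip(utilities, holdings)]
        winner = gains.index(max(gains))
        holdings[winner] += 1
    return holdings


print(allocate(100, 3))  # → [97, 1, 1, 1]
```

Each ordinary person receives a single unit, because their second unit is worth less than the monster's constant 0.5, and everything else goes to the monster. Raising the monster's marginal return to 1 or more would leave the others with nothing at all.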

The nonidentity problem probes some of our most intuitive beliefs regarding the moral status of acts whose effects are restricted to persons who, at the time the act is performed, do not yet but will exist. As we try to articulate just when, and why, some such future-directed acts are wrong, we find ourselves forced to think carefully about the structure of moral law: is it “person-affecting” in nature or is it “impersonal” in nature? Can, in other words, an act that affects no person who does or ever will exist for the worse be wrong? Or is the wrongness of any particular act dependent (at least in part) on something beyond what that act does, or can be expected to do, to any such person?

The Nonidentity Problem

Problems in rationality

The ‘expected value’ of the game is the sum of the expected payoffs of all the consequences. Since the expected payoff of each possible consequence is $1, and there are an infinite number of them, this sum is an infinite number of dollars. A rational gambler would enter a game iff the price of entry was less than the expected value. In the St. Petersburg game, any finite price of entry is smaller than the expected value of the game. Thus, the rational gambler would play no matter how large the finite entry price was. But it seems obvious that some prices are too high for a rational agent to pay to play. Many commentators agree with Hacking’s (1980) estimation that “few of us would pay even $25 to enter such a game.” If this is correct—and if most of us are rational—then something has gone wrong with the standard decision-theory calculations of expected value above. This problem, discovered by the Swiss eighteenth-century mathematician Daniel Bernoulli is the St. Petersburg paradox. It’s called that because it was first published by Bernoulli in the St. Petersburg Academy Proceedings (1738; English trans. 1954).

The St. Petersburg Paradox
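The divergence of the sum can be made concrete with a short sketch. It assumes the standard setup of the game, which the excerpt above does not spell out: a fair coin is tossed until it lands heads, and a first heads on toss n pays 2^n dollars. The function name and the idea of cutting the game off after a fixed number of tosses are illustrative assumptions.

```python
from fractions import Fraction


def truncated_expected_value(max_tosses: int) -> Fraction:
    """Expected payoff in dollars if the game is cut off after max_tosses.

    Assumes a first heads on toss n has probability 1/2**n and pays 2**n,
    so every possible outcome contributes exactly one dollar.
    """
    return sum(
        (Fraction(1, 2**n) * 2**n for n in range(1, max_tosses + 1)),
        Fraction(0),
    )


print(truncated_expected_value(10))  # → 10
print(truncated_expected_value(25))  # → 25
```

The expected value equals the number of tosses allowed, so it grows without bound as the cutoff is lifted. Read the other way round, a price of $25 is only matched once the game is allowed to run for 25 tosses.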

The most common formalizations of Occam’s Razor, Solomonoff induction and Minimum Description Length, measure the program size of a computation used in a hypothesis, but don’t measure the running time or space requirements of the computation.  What if this makes a mind vulnerable to finite forms of Pascal’s Wager? A compactly specified wager can grow in size much faster than it grows in complexity.  The utility of a Turing machine can grow much faster than its prior probability shrinks.

Pascal’s Mugging: Tiny Probabilities of Vast Utilities

For a more concise analysis of the problem see this PDF by Nick Bostrom.
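A rough sketch of why a compactly specified wager can outrun its own improbability. Everything here is an illustrative assumption rather than the construction used in the linked post: the prior is taken to be 2 to the power of minus the description length in bits, the wager itself is assumed to cost a fixed 1000 bits to describe, writing down the parameter n costs only its binary length, and the promised payoff is 10^(10^n). The calculation is done in base-10 logarithms to avoid overflow.

```python
import math

# Hypothetical fixed cost, in bits, of describing the wager itself.
BASE_BITS = 1000


def log10_prior(n: int) -> float:
    """log10 of an assumed prior of 2**-(description length in bits)."""
    bits = BASE_BITS + n.bit_length()  # writing n down costs about log2(n) bits
    return -bits * math.log10(2)


def log10_payoff(n: int) -> int:
    """log10 of the promised payoff 10**(10**n)."""
    return 10**n


def log10_expected_payoff(n: int) -> float:
    """log10 of prior times payoff."""
    return log10_payoff(n) + log10_prior(n)


for n in range(1, 6):
    print(n, round(log10_prior(n), 2), log10_payoff(n),
          round(log10_expected_payoff(n), 2))
```

Under these assumptions the prior barely moves as n grows, while the logarithm of the payoff grows tenfold with each step, so the expected payoff is dominated by the payoff term from n = 3 onwards.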

So right now you’ve got an 80% probability of living 10^^10 years.  But if you give me a penny, I’ll tetrate that sucker!  That’s right – your lifespan will go to 10^^(10^^10) years!  That’s an exponential tower (10^^10) tens high!  You could write that as 10^^^3, by the way, if you’re interested.  Oh, and I’m afraid I’ll have to multiply your survival probability by 99.99999999%.

What?  What do you mean, no?  The benefit here is vastly larger than the mere 10^^(2,302,360,800) years you bought previously, and you merely have to send your probability to 79.999999992% instead of 10^-1000 to purchase it!  Well, that and the penny, of course.  If you turn down this offer, what does it say about that whole road you went down before?  Think of how silly you’d look in retrospect!  Come now, pettiness aside, this is the real world, wouldn’t you rather have a 79.999999992% probability of living 10^^(10^^10) years than an 80% probability of living 10^^10 years?  Those arrows suppress a lot of detail, as the saying goes!  If you can’t have Significantly More Fun with tetration, how can you possibly hope to have fun at all?

Hm?  Why yes, that’s right, I am going to offer to tetrate the lifespan and fraction the probability yet again… I was thinking of taking you down to a survival probability of 1/(10^^^20), or something like that… oh, don’t make that face at me, if you want to refuse the whole garden path you’ve got to refuse some particular step along the way.

Wait!  Come back!  I have even faster-growing functions to show you!  And I’ll take even smaller slices off the probability each time!  Come back!

The Lifespan Dilemma
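The probability arithmetic in the first offer can be checked exactly. The second half of the sketch is an added illustration rather than part of the quoted dilemma: it assumes, hypothetically, that the same tiny discount is accepted over and over, and asks how many such steps it would take to halve the survival probability.

```python
import math
from decimal import Decimal

start = Decimal("0.8")            # the initial 80% survival probability
factor = Decimal("0.9999999999")  # multiplying by 99.99999999%

after_one_step = start * factor
print(after_one_step)  # → 0.79999999992

# Hypothetical extension: number of identical steps needed to halve the
# probability, from start * factor**k = start / 2.
steps_to_halve = math.log(0.5) / math.log1p(-1e-10)
print(f"{steps_to_halve:.3e}")
```

The first result matches the 79.999999992% quoted above. A single step costs almost nothing, yet on the order of billions of them would be needed to halve the probability, which is why no individual step along the garden path looks like the one to refuse.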

Conclusion

I haven’t come across any good reasons to believe that the aforementioned problems do not constitute a reductio ad absurdum of rationality and ethics. There are many other problems, some of which I haven’t read up on so far and probably many more that I don’t know about. But the above problems are special insofar as the methods leading up to them may in principle be “correct” but nonetheless lead to seemingly absurd or undesirable consequences.

I am not referring to the weirdness of the conclusions but to the foreseeable scope of the consequences of being wrong about them. I have a very bad feeling about using the implied scope of certain conclusions to outweigh their low probability. I feel we should put more weight on the consequences of our conclusions being wrong than on those of their being right.

I can’t justify this, but an example would be quantum suicide. I wouldn’t commit quantum suicide even given a high confidence in the many-worlds interpretation of quantum mechanics being true. Logical implications just don’t seem enough in some cases.

To be clear, extrapolations work and often are the best we can do. But since there are problems such as the above, that we perceive to be undesirable and that lead to absurd consequences, I think it is reasonable to ask for some upper and lower bounds regarding the use and scope of certain heuristics.

We are not going to stop pursuing whatever terminal goal we have chosen just because someone promises us even more utility if we do what that person wants. We are not going to stop loving our girlfriend just because there are other people who do not approve of our relationship and who together would experience more happiness if we separated than the combined happiness of us and our girlfriend being in love. Therefore we have already informally established some upper and lower bounds.

Maybe I am simply biased and have been unable to overcome it yet. But my best guess right now is that it is always ethically indifferent what we do and that we simply have to draw a lot of arbitrary lines and arbitrarily refuse some steps. I have read about people who went all batshit crazy taking ideas in ethics and rationality too seriously. That way madness lies, and I am not willing to choose that path yet.

Taking into account considerations of vast utility or low probability quickly leads to chaos-theoretic considerations like the butterfly effect. As a computationally bounded and psychologically unstable agent I am unable to cope with that. Consequently I see no other way than to neglect the moral impossibility of extreme uncertainty.

Until the above problems are resolved, or sufficiently established, I will continue to put vastly more weight on empirical evidence and my intuition than on logical implications, if only because I still lack the necessary educational background to trust my comprehension and judgement of the various underlying concepts and methods used to arrive at those implications.

Further reading

GiveWell, the SIAI and risks from AI

Objections to Coherent Extrapolated Volition

Moral Impossibility in the Petersburg Paradox : A Literature Survey and Experimental Evidence

Constraints and Animals

The Terrible, Horrible, No Good, Very Bad Truth about Morality and What to Do About it

The Paradoxes of Future Generations and Normative Theory

Future generations: A challenge for moral theory

The person-affecting restriction, comparativism, and the moral status of potential people.



In poetic terms, our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.

— Eliezer Yudkowsky, May 2004, Coherent Extrapolated Volition

Foragers versus industry era folks

Consider the difference between a hunter-gatherer, who cares about his hunting success and about becoming the new tribal chief, and a modern computer scientist who wants to determine if a “sufficiently large randomized Conway board could turn out to converge to a barren ‘all off’ state.”

The utility of success in hunting down animals or in proving abstract conjectures about cellular automata is largely determined by factors such as your education, culture and environmental circumstances. The same forager who cared to kill a lot of animals, to get the best ladies in his clan, might under different circumstances have turned out to be a vegetarian mathematician solely caring about his understanding of the nature of reality. Both sets of values are to some extent mutually exclusive or at least disjoint. Yet both sets of values are what the person wants, given the circumstances. Change the circumstances dramatically and you change the person’s values.

What do you really want?

You might conclude that what the hunter-gatherer really wants is to solve abstract mathematical problems, he just doesn’t know it. But there is no set of values that a person “really” wants. Humans are largely defined by the circumstances they reside in. If you already knew a movie, you wouldn’t watch it. To be able to get your meat from the supermarket changes the value of hunting.

If “we knew more, thought faster, were more the people we wished we were, and had grown up farther together” then we would stop desiring what we had learnt, wish to think even faster, become different people yet again, and get bored of and rise up from the people similar to us.

A singleton is an attractor

A singleton will inevitably change everything by causing a feedback loop between itself as an attractor and humans and their values.

Many of our values and goals, what we want, are culturally induced or the result of our ignorance. Reduce our ignorance and you change our values. One trivial example is our intellectual curiosity. If we don’t need to figure out what we want on our own, our curiosity is impaired.

A singleton won’t extrapolate human volition but implement an artificial set of values as a result of abstract higher-order contemplations about rational conduct.

With knowledge comes responsibility, with wisdom comes sorrow

Knowledge changes and introduces terminal goals. The toolkit that is called ‘rationality’, the rules and heuristics developed to help us to achieve our terminal goals are also altering and deleting them. A stone age hunter-gatherer seems to possess very different values than we do. Learning about rationality and various ethical theories such as Utilitarianism would alter those values considerably.

Rationality was meant to help us achieve our goals, e.g. become a better hunter. Rationality was designed to tell us what we ought to do (instrumental goals) to achieve what we want to do (terminal goals). Yet what actually happens is that we are told, that we will learn, what we ought to want.

If an agent becomes more knowledgeable and smarter then this does not leave its goal-reward-system intact if it is not especially designed to be stable. An agent who originally wanted to become a better hunter and feed his tribe would end up wanting to eliminate poverty in Obscureistan. The question is, how much of this new “wanting” is the result of using rationality to achieve terminal goals and how much is a side-effect of using rationality, how much is left of the original values versus the values induced by a feedback loop between the toolkit and its user?

Take for example an agent that is facing the Prisoner’s dilemma. Such an agent might originally tend to cooperate and only after learning about game theory decide to defect and gain a greater payoff. Was it rational for the agent to learn about game theory, in the sense that it helped the agent to achieve its goal, or in the sense that it deleted one of its goals in exchange for an allegedly more “valuable” goal?
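To make the dilemma concrete, here is a minimal sketch in Python. The payoff numbers are the conventional textbook ones and are purely my assumption; nothing above depends on their exact values. The helper names `best_response` and `is_dominant` are likewise hypothetical.

```python
# Illustrative payoffs (assumed, not taken from the text); higher is better.
# PAYOFF[(my_move, other_move)] is the payoff I receive.
PAYOFF = {
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 5,
    ("defect", "defect"): 1,
}
MOVES = ("cooperate", "defect")


def best_response(other_move):
    """Return the move that maximizes my payoff against a fixed opponent move."""
    return max(MOVES, key=lambda me: PAYOFF[(me, other_move)])


def is_dominant(move):
    """A move is dominant if it is the best response to every opponent move."""
    return all(best_response(other) == move for other in MOVES)


if __name__ == "__main__":
    print("defect dominant:", is_dominant("defect"))
    print("mutual cooperation beats mutual defection:",
          PAYOFF[("cooperate", "cooperate")] > PAYOFF[("defect", "defect")])
```

Under these assumed numbers defection is the dominant move, yet both agents would be better off if both cooperated. That is the sense in which learning game theory can change what the agent does without obviously serving what it originally wanted.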

Beware rationality as a purpose in and of itself

It seems to me that becoming more knowledgeable and smarter is gradually altering our utility functions. But what is it that we are approaching if the extrapolation of our volition becomes a purpose in and of itself? Extrapolating our coherent volition will distort or alter what we really value by installing a new cognitive toolkit designed to achieve an equilibrium between us and other agents with the same toolkit.

Would a singleton be a tool that we can use to get what we want or would the tool use us to do what it does, would we be modeled or would it create models, would we be extrapolating our volition or rather follow our extrapolations?

Further reading

Why I am skeptical of risks from AI

GiveWell, the SIAI and risks from AI



As we know,
There are known knowns.
There are things
We know we know.
We also know
There are known unknowns.
That is to say
We know there are some things
We do not know.
But there are also unknown unknowns,
The ones we don’t know
We don’t know.

— Donald Rumsfeld, Feb. 12, 2002, Department of Defense news briefing

Intelligence, a cornucopia?

It seems to me that those who believe in the possibility of catastrophic risks from artificial intelligence act on the unquestioned assumption that intelligence is a kind of black box, a cornucopia that can sprout an abundance of novelty. But this implicitly assumes that if you increase intelligence you also decrease the distance between discoveries.

Intelligence is no solution in itself, it is merely an effective searchlight for unknown unknowns, and who knows whether the brightness of the light increases proportionally with the distance between unknown unknowns? To enable an intelligence explosion, each increase in intelligence would have to extend the reach of the light by much more than it extends the distance between unknown unknowns. I just don’t see that to be a reasonable assumption.
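The searchlight picture can be turned into a toy model. This is only a sketch under assumptions of my own: both the reach of the light and the distance to the next unknown unknown grow by a fixed factor per discovery, and the function name and parameters are hypothetical.

```python
def discoveries_reached(reach_growth, gap_growth, steps=20):
    """Count consecutive discoveries made before the searchlight falls short.

    reach_growth: assumed factor by which the light's reach grows per discovery.
    gap_growth:   assumed factor by which the distance to the next
                  unknown unknown grows per discovery.
    """
    reach, gap = 1.0, 1.0
    count = 0
    for _ in range(steps):
        if reach < gap:
            break  # the next discovery lies beyond the light
        count += 1
        reach *= reach_growth
        gap *= gap_growth
    return count


if __name__ == "__main__":
    print(discoveries_reached(2.0, 1.5))  # reach outpaces the gaps
    print(discoveries_reached(1.5, 2.0))  # gaps outpace the reach
```

In this toy model the chain of discoveries only continues if the reach grows at least as fast as the gaps; otherwise it stalls after the first step. Which of the two regimes describes real intelligence is exactly what I claim we don’t know.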

Intelligence amplification, is it worth it?

It seems that if you increase intelligence you also increase the computational cost of its further improvement and the distance to the discovery of some unknown unknown that could enable another quantum leap. It seems that you need to apply a lot more energy to get a bit more complexity.

If any increase in intelligence is vastly outweighed by its computational cost and the expenditure of time needed to discover it then it might not be instrumental for a perfectly rational agent (such as an artificial general intelligence), as imagined by game theorists, to increase its intelligence as opposed to using its existing intelligence to pursue its terminal goals directly or to invest its given resources to acquire other means of self-improvement, e.g. more efficient sensors.

What evidence do we have that the payoff of intelligent, goal-oriented experimentation yields enormous advantages (enough to enable an intelligence explosion) over evolutionary discovery relative to its cost?

We simply don’t know if intelligence is instrumental or quickly hits diminishing returns.

Can intelligence be effectively applied to itself at all? How do we know that any given level of intelligence is capable of handling its own complexity efficiently? Many humans are not even capable of handling the complexity of the brain of a worm.

Humans and the importance of discovery

There is a significant difference between intelligence and evolution if you apply intelligence to the improvement of evolutionary designs:

  • Intelligence is goal-oriented.
  • Intelligence can think ahead.
  • Intelligence can jump fitness gaps.
  • Intelligence can engage in direct experimentation.
  • Intelligence can observe and incorporate solutions of other optimizing agents.

But when it comes to unknown unknowns, what difference is there between intelligence and evolution? The critical similarity is that both rely on dumb luck when it comes to genuine novelty. And where else but when it comes to the dramatic improvement of intelligence itself does it take the discovery of novel unknown unknowns?

We have no idea about the nature of discovery and its importance when it comes to what is necessary to reach a level of intelligence above our own, by ourselves. How much of what we know was actually the result of people thinking quantitatively and attending to scope, probability, and marginal impacts? How much of what we know today is the result of dumb luck versus goal-oriented, intelligent problem solving?

Our “irrationality” and the patchwork-architecture of the human brain might constitute an actual feature. The noisiness and patchwork architecture of the human brain might play a significant role in the discovery of unknown unknowns because it allows us to become distracted, to leave the path of evidence based exploration.

A lot of discoveries were made by people who were not explicitly trying to maximize expected utility. A lot of progress is due to luck, in the form of the discovery of unknown unknowns.

A basic argument in support of risks from superhuman intelligence is that we don’t know what it could possibly come up with. That is also why it is called a “Singularity”. But why does nobody ask how a superhuman intelligence knows what it could possibly come up with?

It is not intelligence in and of itself that allows humans to accomplish great feats. Even people like Einstein, geniuses who were apparently able to come up with great insights on their own, were simply lucky to be born into the right circumstances: the time was ripe for great discoveries, thanks to previous discoveries of unknown unknowns.

Evolution versus Intelligence

It is argued that the mind-design space must be large if evolution could stumble upon general intelligence, and that there are low-hanging fruits that are much more efficient at general intelligence than humans are, since evolution simply went with the first design that came along. It is further argued that evolution is not limitlessly creative, each step must increase the fitness of its host, and that therefore there are artificial mind designs that can do what no product of natural selection could accomplish.

I agree with the above, yet given all of the apparent disadvantages of the blind idiot God, evolution was able to come up with altruism, something that works two levels above the individual and one level above society. So far we haven’t been able to show such ingenuity by incorporating successes that are not evident from an individual or even societal position.

The example of altruism provides evidence that intelligence isn’t many levels above evolution. Therefore the crucial question is, how great is the performance advantage? Is it large enough to justify the conclusion that the probability of an intelligence explosion is easily larger than 1%? I don’t think so. To answer this definitively we would have to fathom the significance of the discovery (“random mutations”) of unknown unknowns in the dramatic amplification of intelligence versus the invention (goal-oriented “research and development”) of an improvement within known conceptual bounds.

Another example is flight. Artificial flight is not even close to the energy efficiency and maneuverability of birds or insects. We didn’t go straight from no artificial flight to flight that is generally superior to the natural flight that is an effect of biological evolution.

Dragonfly

Take for example a dragonfly. Even if we were handed the design for a perfect artificial dragonfly, minus the design for the flight of a dragonfly, we wouldn’t be able to build a dragonfly that can take over the world of dragonflies, all else equal, by means of superior flight characteristics.

It is true that a Harpy Eagle can lift more than three-quarters of its body weight while the Boeing 747 Large Cargo Freighter has a maximum take-off weight of almost double its operating empty weight (I suspect that insects can do better). My whole point is that we never reached artificial flight that is strongly above the level of natural flight. An eagle can after all catch its cargo under various circumstances like the slope of a mountain or from beneath the sea, thanks to its superior maneuverability.

Humans are biased and irrational

It is obviously true that our expert systems are better than we are at their narrow range of expertise. But that expert systems are better at certain tasks does not imply that you can effectively and efficiently combine them into a coherent agency.

The noisiness of the human brain might be one of the important features that allows it to exhibit general intelligence. Yet the same noise might be the reason that each task a human can accomplish is not put into execution with maximal efficiency. An expert system that features a single stand-alone ability is able to reach the unique equilibrium for that ability. Whereas systems that have not fully relaxed to equilibrium feature the necessary characteristics that are required to exhibit general intelligence. In this sense a decrease in efficiency is a side-effect of general intelligence. If you externalize a certain ability into a coherent framework of agency, you decrease its efficiency dramatically. That is the difference between a tool and the ability of the agent that uses the tool.

In the above sense, our tendency to be biased and act irrationally might partly be a trade-off between plasticity, efficiency and the necessity of goal-stability.

Embodied cognition and the environment

Another problem is that general intelligence is largely a result of an interaction between an agent and its environment. It might in principle be possible to arrive at various capabilities by means of induction, but it is only a theoretical possibility given unlimited computational resources. To achieve real-world efficiency you need to rely on slow environmental feedback and make decisions under uncertainty.

AIXI is often quoted as a proof of concept that it is possible for a simple algorithm to improve itself to such an extent that it could in principle reach superhuman intelligence. AIXI proves that there is a general theory of intelligence. But there is a minor problem, AIXI is as far from real world human-level general intelligence as an abstract notion of a Turing machine with an infinite tape is from a supercomputer with the computational capacity of the human brain. An abstract notion of intelligence doesn’t get you anywhere in terms of real-world general intelligence. Just as you won’t be able to upload yourself to a non-biological substrate because you showed that in some abstract sense you can simulate every physical process.

Just imagine you emulated a grown up human mind and it wanted to become a pick up artist, how would it do that with an Internet connection? It would need some sort of avatar, at least, and then wait for the environment to provide a lot of feedback.

Therefore even if we’re talking about the emulation of a grown up mind, it will be really hard to acquire some capabilities. Then how is the emulation of a human toddler going to acquire those skills? Even worse, how is some sort of abstract AGI, which lacks all of the hard-coded capabilities of a human toddler, going to do it?

Can we even attempt to imagine what is wrong about a boxed emulation of a human toddler, that makes it unable to become a master of social engineering in a very short time?

Can we imagine what is missing that would enable one of the existing expert systems to quickly evolve vastly superhuman capabilities in its narrow area of expertise? Why haven’t we seen a learning algorithm teaching itself chess intelligence starting with nothing but the rules?

In a sense an intelligent agent is similar to a stone rolling down a hill: both are moving towards a sort of equilibrium. The difference is that intelligence is following more complex trajectories, as its ability to read and respond to environmental cues is vastly greater than that of a stone. Yet intelligent or not, the environment in which an agent is embedded plays a crucial role. There exists a fundamental dependency on unintelligent processes. Our environment is structured in such a way that we use information within it as an extension of our minds. The environment enables us to learn and improve our predictions by providing a testbed and a constant stream of data.

Necessary resources for an intelligence explosion

If artificial general intelligence is unable to seize the resources necessary to undergo explosive recursive self-improvement then the ability and cognitive flexibility of superhuman intelligence in and of itself, as characteristics alone, would have to be sufficient to self-modify its way up to massive superhuman intelligence within a very short time.

Without advanced real-world nanotechnology it will be considerably more difficult for an AGI to undergo quick self-improvement. It will have to make use of existing infrastructure, e.g. buy stocks of chip manufacturers and get them to create more or better CPUs. It will have to rely on puny humans for a lot of tasks. It won’t be able to create new computational substrate without the whole economy of the world supporting it. It won’t be able to create an army of robot drones overnight without it either.

In doing so it would have to make use of considerable amounts of social engineering without its creators noticing it. But, more importantly, it will have to make use of its existing intelligence to do all of that. The AGI would have to acquire new resources slowly, as it couldn’t just self-improve to come up with faster and more efficient solutions. In other words, self-improvement would demand resources. The AGI could not profit from its ability to self-improve when it comes to acquiring the resources it needs to be able to self-improve in the first place.

Therefore the absence of advanced nanotechnology constitutes an immense blow to the possibility of explosive recursive self-improvement and risks from AI in general.

One might argue that an AGI will solve nanotechnology on its own and find some way to trick humans into manufacturing a molecular assembler and grant it access to it. But this might be very difficult.

There is a strong interdependence of resources and manufacturers. The AGI won’t be able to simply trick some humans into building a high-end factory to create computational substrate, let alone a molecular assembler. People will ask questions and shortly afterwards get suspicious. Remember, it won’t be able to coordinate a world-conspiracy; it hasn’t been able to self-improve to that point yet because it is still trying to acquire enough resources, which it has to do the hard way without nanotech.

Anyhow, you’d probably need a brain the size of the moon to effectively run and coordinate a whole world of irrational humans by intercepting their communications and altering them on the fly without anyone freaking out.

People associated with the SIAI would at this point claim that if the AI can’t make use of nanotechnology it might make use of something we haven’t even thought about. But what, magic?

Artificial general intelligence, a single break-through?

Another point to consider when talking about risks from AI is how quickly the invention of artificial general intelligence will take place. What evidence do we have that there is some principle that, once discovered, allows us to grow superhuman intelligence overnight?

If the development of AGI takes place slowly, as a gradual and controllable development, we might be able to learn from small-scale mistakes while having to face other risks in the meantime. This might for example be the case if intelligence cannot be captured by a discrete algorithm, or is modular, and therefore never allows us to reach a point where we can suddenly build the smartest thing ever that just extends itself indefinitely.

To me it doesn’t look as if we will come up with artificial general intelligence quickly, but rather as if we will have to painstakingly optimize our expert systems step by step over long periods of time.

Paperclip maximizers

It is claimed that an artificial general intelligence might wipe us out inadvertently while undergoing explosive recursive self-improvement to more effectively pursue its terminal goals. I think it is unlikely that most AI designs will fail to hold, that is, to stop of their own accord.

I agree with the argument that any AGI that isn’t made to care about humans won’t care about humans. But I also think that the same argument applies to spatio-temporal scope boundaries and resource limits. Even if the AGI is not told to hold, e.g. when it is told to compute as many digits of Pi as possible, I consider it a far-fetched assumption that any AGI intrinsically cares to take over the universe as fast as possible to compute as many digits of Pi as possible. Sure, if all of that is presupposed then it will happen, but I don’t see that most AGI designs are like that. Most designs that have the potential for superhuman intelligence, but that are given simple goals, will in my opinion just bob up and down as slowly as possible.

Complex goals need complex optimization parameters (the design specifications of the subject of the optimization process against which it will measure its success of self-improvement).

Even the creation of paperclips is a much more complex goal than telling an AI to compute as many digits of Pi as possible.

For an AGI that was designed to design paperclips to pose an existential risk, its creators would have to be capable enough to enable it to take over the universe on its own, yet forget, or fail, to define time, space and energy bounds as part of its optimization parameters. Therefore, given the large number of restrictions that are inevitably part of any advanced general intelligence, the nonhazardous subset of all possible outcomes might be much larger than the subset in which the AGI works perfectly yet fails to hold before it could wreak havoc.

Fermi paradox

The Fermi paradox allows for, and provides, the only conclusions and data we can analyze that amount to empirical criticism of concepts like that of a Paperclip maximizer, and of general risks from superhuman AIs with non-human values, without working directly on AGI to test those hypotheses ourselves.

If you accept the premise that life is not unique and special then one other technological civilisation in the observable universe should be sufficient to leave potentially observable traces of technological tinkering.

Due to the absence of any signs of intelligence out there, especially paper-clippers burning the cosmic commons, we might conclude that unfriendly AI could not be the most dangerous existential risk that we should worry about.

Summary

In principle we could build antimatter weapons capable of destroying worlds, but in practice it is much harder to accomplish.

There are many question marks when it comes to the possibility of superhuman intelligence, and many more about the possibility of recursive self-improvement. Most of the arguments in favor of those possibilities solely derive their appeal from being vague.


It seems to me that all the ways in which we disagree have more to do with philosophy (how to quantify uncertainty; how to deal with conjunctions; how to act in consideration of low probabilities) [...] we are not dealing with well-defined or -quantified probabilities. Any prediction can be rephrased so that it sounds like the product of indefinitely many conjunctions. It seems that I see the “SIAI’s work is useful scenario” as requiring the conjunction of a large number of questionable things [...]

— Holden Karnofsky, 6/24/11 (GiveWell interview with major SIAI donor Jaan Tallinn, PDF)

Disjunctive arguments

People associated with the Singularity Institute for Artificial Intelligence (SIAI) like to claim that the case for risks from AI is supported by years’ worth of disjunctive lines of reasoning. This basically means that there are many reasons to believe that humanity is likely to be wiped out as a result of artificial general intelligence. More precisely, it means that not all of the arguments supporting that possibility need to be true: even if all but one are false, risks from AI are to be taken seriously.

The idea of disjunctive arguments is formalized by what is called a logical disjunction. Consider two declarative sentences, A and B. The truth of the conclusion (or output) that follows from the sentences A and B does depend on the truth of A and B. In the case of a logical disjunction the conclusion of A and B is only false if both A and B are false, otherwise it is true. Truth values are usually denoted by 0 for false and 1 for true. A disjunction of declarative sentences is denoted by OR or ∨ as an infix operator. For example, (A(0)∨B(1))(1), or in other words, if statement A is false and B is true then what follows is still true because statement B is sufficient to preserve the truth of the overall conclusion.
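The truth table of a disjunction is small enough to enumerate. The following Python sketch just spells out the definition given above; the names `disjunction` and `truth_table` are my own.

```python
from itertools import product


def disjunction(a, b):
    """Logical OR: false only if both operands are false."""
    return a or b


# All four combinations of truth values for A and B, with the result of A ∨ B.
truth_table = [(a, b, disjunction(a, b))
               for a, b in product([False, True], repeat=2)]

if __name__ == "__main__":
    for a, b, result in truth_table:
        print(f"A={int(a)} B={int(b)} -> A∨B={int(result)}")
```

Only the first row, where both A and B are false, yields a false conclusion.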

Generally there is no problem with disjunctive lines of reasoning as long as the conclusion itself is sound and therefore in principle possible yet in demand of at least one of several causative factors to become actual. I don’t perceive this to be the case for risks from AI. I agree that there are many ways in which artificial general intelligence (AGI) could be dangerous, but only if I accept several presuppositions regarding AGI that I actually dispute.

By presuppositions I mean requirements that need to be true simultaneously (in conjunction). A logical conjunction is only true if all of its operands are true. In other words, a conclusion might require all of the arguments leading up to it to be true, otherwise it is false. A conjunction is denoted by AND or ∧.

Now consider the following prediction: <Mary is going to buy one of thousands of products in the supermarket.>

The above prediction can be framed as a disjunction: Mary is going to buy one of thousands of products in the supermarket, 1.) if she is hungry 2.) if she is thirsty 3.) if she needs a new coffee machine. Only one of the 3 given possible arguments needs to be true for the overall conclusion, that Mary is going shopping, to be true. Or so it seems.

The same prediction can be framed as a conjunction: Mary is going to buy one of thousands of products in the supermarket 1.) if she has money 2.) if she has some needs 3.) if the supermarket is open. All of the 3 given factors need to be true for the overall conclusion to be true.

That a prediction is framed to be disjunctive does not speak in favor of the possibility in and of itself. I agree that it is likely that Mary is going to visit the supermarket if I accept the hidden presuppositions. But a prediction is only at most as probable as its basic requirements. In this particular case I don’t even know if Mary is a human or a dog, a factor that can influence the probability of the prediction dramatically.
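Both framings of the Mary prediction can be combined into a single expression. This is a sketch only: the function and parameter names are hypothetical, and I treat “has some needs” as the disjunction of the three motives listed earlier.

```python
def mary_buys(hungry, thirsty, needs_coffee_machine, has_money, store_open):
    """Mary buys something only if she has a motive AND the prerequisites hold."""
    # Disjunctive part: any single need is enough to give her a motive.
    has_some_need = hungry or thirsty or needs_coffee_machine
    # Conjunctive part: every prerequisite must hold at the same time.
    return has_some_need and has_money and store_open


if __name__ == "__main__":
    # Three motives at once do not help if one prerequisite fails.
    print(mary_buys(True, True, True, has_money=False, store_open=True))
```

However many disjunctive motives are added, a single failed prerequisite is enough to make the prediction false.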

The same is true for risks from AI. The basic argument in favor of risks from AI is that of an intelligence explosion, that intelligence can be applied to itself in an iterative process leading to ever greater levels of intelligence. In short, artificial general intelligence will undergo explosive recursive self-improvement.

Hidden complexity

Explosive recursive self-improvement is one of the presuppositions for the possibility of risks from AI. The problem is that this and other presuppositions are largely ignored and left undefined. All of the disjunctive arguments put forth by the SIAI are trying to show that there are many causative factors that will result in the development of unfriendly artificial general intelligence. Only one of those factors needs to be true for us to be wiped out by AGI. But the whole scenario is at most as probable as the assumptions hidden in the words <artificial general intelligence> and <explosive recursive self-improvement>.

<Artificial General Intelligence> and <Explosive Recursive Self-improvement> might appear to be relatively simple and appealing concepts. But most of this superficial simplicity is a result of the vagueness of natural language descriptions. Reducing the vagueness of those concepts by being more specific, or by coming up with technical definitions of each of the words they are made up of, reveals the hidden complexity that is contained in the vagueness of the terms.

If we were going to define those concepts and each of their terms we would end up with a lot of additional concepts made up of other words or terms. Most of those additional concepts will demand explanations of their own made up of further speculations. If we are precise then every declarative sentence (P#) (all of the terms) used in the final description will have to be true simultaneously (P#∧P#). And this does reveal the true complexity of all hidden presuppositions and thereby influences the overall probability, P(risks from AI) = P(P1∧P2∧P3∧P4∧P5∧P6∧…). That is because the conclusion of an argument that is made up of a lot of statements (terms) that can be false is less likely to be true, since complex arguments can fail in a lot of different ways. You need to support each part of the argument that can be true or false and you can therefore fail to support one or more of its parts, which in turn will render the overall conclusion false.
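A short Python sketch shows how quickly a conjunction loses probability. Two things here are assumptions of mine: the premises are treated as independent, which the argument above does not claim, and the value 0.8 per premise is purely illustrative. Without independence, the only general guarantee is that the conjunction is at most as probable as its least probable premise.

```python
import math


def conjunction_probability(probabilities):
    """P(P1 ∧ P2 ∧ ...) under the ASSUMPTION that the premises are independent."""
    return math.prod(probabilities)


def conjunction_upper_bound(probabilities):
    """Holds without any independence assumption: P(P1 ∧ ...) <= min P(Pi)."""
    return min(probabilities)


if __name__ == "__main__":
    premises = [0.8] * 6  # illustrative values, not estimates of anything real
    print(conjunction_probability(premises))   # roughly 0.26
    print(conjunction_upper_bound(premises))   # 0.8
```

Six premises that each look fairly plausible on their own leave the conjunction at roughly one in four under these assumptions.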

To summarize: If we tried to pin down a concept like <Explosive Recursive Self-Improvement> we would end up with requirements that are strongly conjunctive.

Making numerical probability estimates

But even if the SIAI were to thoroughly define those concepts, there is still more to the probability of risks from AI than the underlying presuppositions and causative factors. We also have to integrate into our overall probability estimates our uncertainty about the very methods we used to come up with those concepts and definitions, and about our ability to make correct predictions about the future.

Take for example the following contrived quote:

We have to take over the universe to save it by making the seed of an artificial general intelligence, that is undergoing explosive recursive self-improvement, extrapolate the coherent volition of humanity, while acausally trading with other superhuman intelligences across the multiverse.

Although contrived, the above quote comprises only actual beliefs held by people associated with the SIAI. All of those beliefs might seem to be somewhat plausible inferences and logical implications of speculations and state of the art or bleeding edge knowledge of various fields. But should we base real-life decisions on those ideas, should we take those ideas seriously? Should we take into account conclusions whose truth value depends on the conjunction of those ideas? And is it wise to make further inferences based on those speculations?

Let’s take a closer look at the necessary top-level presuppositions to take the above quote seriously:

  1. The many-worlds interpretation
  2. Belief in the Implied Invisible
  3. Timeless Decision theory
  4. Intelligence explosion

1: Within the lesswrong/SIAI community the many-worlds interpretation of quantum mechanics is proclaimed to be the rational choice of all available interpretations. How to arrive at this conclusion is supposedly also a good exercise in refining the art of rationality.

2: If P(Y|X) ≈ 1, then P(X∧Y) ≈ P(X)

In other words, logical implications do not have to pay rent in future anticipations.
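The step behind #2 is just the product rule of probability. Written out as a short derivation (standard probability theory, nothing specific to the SIAI):

```latex
\begin{align*}
P(X \wedge Y) &= P(X)\,P(Y \mid X) \\
              &\approx P(X) \cdot 1 = P(X)
              \qquad \text{when } P(Y \mid X) \approx 1 .
\end{align*}
```

So if Y follows from X with near certainty, believing the implication Y costs almost no additional probability, even when Y itself can never be observed.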

3: “Decision theory is the study of principles and algorithms for making correct decisions—that is, decisions that allow an agent to achieve better outcomes with respect to its goals.”

4: “Intelligence explosion is the idea of a positive feedback loop in which an intelligence is making itself smarter, thus getting better at making itself even smarter. A strong version of this idea suggests that once the positive feedback starts to play a role, it will lead to a dramatic leap in capability very quickly.”

To be able to take the above quote seriously you have to assign a non-negligible probability to the truth of the conjunction of #1, 2, 3 and 4, that is, 1∧2∧3∧4. Here the question is not only whether our results are sound but whether the very methods we used to come up with those results are sufficiently trustworthy. This matters because any extraordinary conclusion implied by the conjunction of various beliefs might outweigh the benefit of each belief if the overall conclusion is just slightly wrong.
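To make the cost of a conjunction concrete, here is a minimal sketch. The credences are purely hypothetical placeholders, not estimates from this post, and the product rule used in the second function additionally assumes that the four presuppositions are independent, which they need not be. The first function needs no such assumption: a conjunction can never be more probable than its least probable part.

```python
from math import prod

# Hypothetical credences for the four presuppositions. These numbers are
# illustrative assumptions only; the post does not supply any estimates.
credences = {
    "many-worlds interpretation": 0.8,
    "belief in the implied invisible": 0.9,
    "timeless decision theory": 0.7,
    "intelligence explosion": 0.6,
}


def conjunction_upper_bound(probabilities):
    """A conjunction is at most as probable as its least probable conjunct."""
    return min(probabilities)


def conjunction_if_independent(probabilities):
    """Probability of the conjunction, ASSUMING the conjuncts are independent."""
    return prod(probabilities)


values = list(credences.values())
print("upper bound:", conjunction_upper_bound(values))
print("if independent:", conjunction_if_independent(values))
```

Even with fairly generous individual credences, the conjunction ends up noticeably less probable than any single belief, and every further presupposition can only push it lower.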

Not enough empirical evidence

Don’t get me wrong, I think that there sure are convincing arguments in favor of risks from AI. But do arguments suffice? Nobody is an expert when it comes to intelligence. My problem is that I fear that some convincing blog posts written in natural language are simply not enough.

Just imagine that all there was to climate change was someone who never studied the climate but instead wrote some essays about how it might be physically possible for humans to cause global warming. If the same person then went on to make further inferences based on the implications of those speculations, would I tell everyone to stop emitting CO2 because of that? Hardly!

Or imagine that all there was to the possibility of asteroid strikes was someone who argued that there might be big chunks of rock out there which might fall down on our heads and kill us all, inductively based on the fact that the Earth and the moon are also big rocks. Would I be willing to launch a billion-dollar asteroid deflection program solely based on such speculations? I don’t think so.

Luckily, in both cases, we got a lot more than some convincing arguments in support of those risks.

Another example: If there were no studies about the safety of high energy physics experiments then I might assign a 20% chance of a powerful particle accelerator destroying the universe based on some convincing arguments put forth on a blog by someone who never studied high energy physics. We know that such an estimate would be wrong by many orders of magnitude. Yet the reason for being wrong would largely be a result of my inability to make correct probability estimates, the result of vagueness or a failure of the methods I employed to come up with those estimates. The reason for being wrong by many orders of magnitude would have nothing to do with the arguments in favor of the risks, as they might very well be sound given my epistemic state and the prevalent uncertainty.

I believe that mere arguments in favor of one risk do not suffice to neglect other risks that are supported by other kinds of evidence. I believe that logical implications of sound arguments should not reach out indefinitely and thereby outweigh other risks whose implications are fortified by empirical evidence. Sound arguments, predictions, speculations and their logical implications are enough to demand further attention and research, but not much more.

Logical implications

Artificial general intelligence is already an inference made from what we currently believe to be true. Going a step further and drawing further inferences from previous speculations, e.g. explosive recursive self-improvement, is in my opinion a very shaky business.

What would happen if we let logical implications of vast utilities outweigh other concrete near-term problems that are based on empirical evidence? Insignificant inferences might exhibit hyperbolic growth in utility: 1.) There is no minimum amount of empirical evidence necessary to extrapolate the expected utility of an outcome. 2.) The extrapolation of counterfactual alternatives is unbounded; logical implications can reach out indefinitely without ever requiring new empirical evidence.

Hidden disagreement

All of the above hints at a general problem that is the reason why I think that discussions between people associated with the SIAI, its critics and those who try to evaluate the SIAI won’t lead anywhere. Those discussions miss the underlying reason for most of the superficial disagreement about risks from AI, namely that there is no disagreement about risks from AI in and of itself.

There are a few people who disagree about the possibility of AGI in general, but I don’t want to touch on that subject in this post. I am trying to highlight the disagreement between the SIAI and people who accept the notion of artificial general intelligence. With regard to those who are not skeptical of AGI, the problem becomes more obvious when you turn your attention to people like John Baez or organisations like GiveWell. Most people would sooner question their grasp of “rationality” than give five dollars to a charity that tries to mitigate risks from AI because their calculations claim it was “rational” (those who have read the article by Eliezer Yudkowsky on Pascal’s Mugging know that I used a statement from that post and slightly rephrased it). The disagreement all comes down to a general aversion to options that have a low probability of being factual, even given that the stakes are high.

Nobody has so far been able to beat arguments that bear resemblance to Pascal’s Mugging, at least not by showing that it is irrational to give in from the perspective of a utility maximizer. One can only reject it based on a strong gut feeling that something is wrong. And I think that is what many people are unknowingly doing when they argue against the SIAI or risks from AI. They are signaling that they are unable to take such risks into account. What most people mean when they doubt the reputation of people who claim that risks from AI need to be taken seriously, or when they say that AGI might be far off, is that risks from AI are too vague to be taken into account at this point, that nobody knows enough to make predictions about the topic right now.

When GiveWell, a charity evaluation service, interviewed the SIAI (PDF), they hinted at the possibility that one could consider the SIAI to be a sort of Pascal’s Mugging:

GiveWell: OK. Well that’s where I stand – I accept a lot of the controversial premises of your mission, but I’m a pretty long way from sold that you have the right team or the right approach. Now some have argued to me that I don’t need to be sold – that even at an infinitesimal probability of success, your project is worthwhile. I see that as a Pascal’s Mugging and don’t accept it; I wouldn’t endorse your project unless it passed the basic hurdles of credibility and workable approach as well as potentially astronomically beneficial goal.

This shows that a lot of people do not doubt the possibility of risks from AI but are simply not sure if they should really concentrate their efforts on such vague possibilities.

Technically, from the standpoint of maximizing expected utility, given the absence of other existential risks, the answer might very well be yes. But even though we believe we understand this technical viewpoint of rationality very well in principle, it also leads to problems such as Pascal’s Mugging. And it doesn’t take a true Pascal’s Mugging scenario to make people feel deeply uncomfortable with what Bayes’ Theorem, the expected utility formula, and Solomonoff induction seem to suggest one should do.

Again, we currently have no rational way to reject arguments that are framed as predictions of worst case scenarios that need to be taken seriously even given a low probability of their occurrence due to the scale of negative consequences associated with them. Many people are nonetheless reluctant to accept this line of reasoning without further evidence supporting the strong claims and requests for money made by organisations such as the SIAI.
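The structure of such arguments can be shown with the plain expected utility formula. The following is only a sketch with made-up numbers: the probabilities and utilities are hypothetical and chosen solely to show how a sufficiently large claimed payoff swamps an arbitrarily small probability.

```python
def expected_utility(probability, utility):
    """Expected utility of a single outcome: probability times utility."""
    return probability * utility


# Hypothetical, illustrative numbers only (not taken from any source):
# a well-evidenced option with a modest payoff ...
well_evidenced = expected_utility(probability=0.9, utility=1_000)

# ... versus a speculative option with a tiny probability but a vast payoff.
speculative = expected_utility(probability=1e-10, utility=1e15)

print("well evidenced:", well_evidenced)
print("speculative:   ", speculative)
print("speculative option wins:", speculative > well_evidenced)
```

As long as the claimed utility can be raised faster than the probability drops, a naive maximizer keeps preferring the speculative option, which is exactly what makes people uneasy.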

Here is what mathematician and climate activist John Baez has to say:

Of course, anyone associated with Less Wrong would ask if I’m really maximizing expected utility. Couldn’t a contribution to some place like the Singularity Institute of Artificial Intelligence, despite a lower chance of doing good, actually have a chance to do so much more good that it’d pay to send the cash there instead?

And I’d have to say:

1) Yes, there probably are such places, but it would take me a while to find the one that I trusted, and I haven’t put in the work. When you’re risk-averse and limited in the time you have to make decisions, you tend to put off weighing options that have a very low chance of success but a very high return if they succeed. This is sensible so I don’t feel bad about it.

2) Just to amplify point 1) a bit: you shouldn’t always maximize expected utility if you only live once. Expected values — in other words, averages — are very important when you make the same small bet over and over again. When the stakes get higher and you aren’t in a position to repeat the bet over and over, it may be wise to be risk averse.

3) If you let me put the $100,000 into my retirement account instead of a charity, that’s what I’d do, and I wouldn’t even feel guilty about it. I actually think that the increased security would free me up to do more risky but potentially very good things!
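Baez’s second point, that averages matter mostly when a bet can be repeated, can be illustrated with an exact binomial calculation. This is a sketch under assumed numbers: the win probability, the payout multiple and the number of bets are hypothetical, and the repeated bets are assumed to be independent.

```python
from math import comb


def prob_at_least(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


# Hypothetical long-shot bet: wins 1% of the time and pays 200 times the
# stake, so its expected return is twice the stake.
WIN_PROBABILITY = 0.01
PAYOUT_MULTIPLE = 200

# Case 1: stake everything on a single bet. Probability of losing it all.
one_shot_loss = 1 - WIN_PROBABILITY

# Case 2: split the same total stake across many small independent bets.
N_BETS = 1000
# Each win returns PAYOUT_MULTIPLE small stakes, so this many wins
# already recover the whole amount staked.
wins_needed = N_BETS // PAYOUT_MULTIPLE
repeated_break_even = prob_at_least(wins_needed, N_BETS, WIN_PROBABILITY)

print("P(lose everything, single bet):", one_shot_loss)
print("P(at least break even, repeated bets):", repeated_break_even)
```

The expected value per unit staked is identical in both cases. What differs is how likely it is that you personally end up anywhere near that average, which is the sense in which risk aversion can be sensible when you only get to bet once.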

All this shows that there seems to be a fundamental problem with the formalized version of rationality. The problem might be human nature itself, that some people are unable to accept what they should do if they want to maximize their expected utility. Or we are missing something else and our theories are flawed. Either way, to solve this problem we need to research those issues and thereby increase the confidence in the very methods used to decide what to do about risks from AI, or to increase the confidence in risks from AI directly, enough to make it look like a sensible option, a concrete and discernable problem that needs to be solved.

Many people perceive the whole world to be at stake, either due to climate change, war or engineered pathogens. Telling them about something like risks from AI, even though nobody seems to have any idea about the nature of intelligence, let alone general intelligence or the possibility of recursive self-improvement, seems like just another problem, one that is too vague to outweigh all the other risks. Most people already feel as if they had a gun pointed at their heads; telling them about superhuman monsters that might turn them into paperclips then takes some really good arguments to outweigh the combined risk of all other problems.

But there are many other problems with risks from AI. To give a hint at just one example: if there were a risk that might kill us with a probability of .7 and another risk with .1, while our chance to solve the first one was .0001 and the second one .1, which one should we focus on? In other words, our decision to mitigate a certain risk should not only be based on the probability of its occurrence but also on the probability of success in solving it. But as I have written above, I believe that the most pressing issue is to increase the confidence in making decisions under extreme uncertainty or to reduce the uncertainty itself.
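The two-risk example above can be worked through in a few lines. This is a rough sketch: it uses the probabilities given in the text, while the scoring rule (probability of occurrence times probability of solving it) and the labels "risk A" and "risk B" are simplifying assumptions for illustration, not a complete prioritization method.

```python
# Probabilities taken from the example in the text; the labels are made up.
risks = {
    "risk A": {"p_occurs": 0.7, "p_solved": 0.0001},
    "risk B": {"p_occurs": 0.1, "p_solved": 0.1},
}


def expected_risk_reduction(p_occurs, p_solved):
    """Assumed score: how much probability of disaster our effort removes."""
    return p_occurs * p_solved


scores = {name: expected_risk_reduction(**r) for name, r in risks.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(name, scores[name])
```

Under this scoring the less likely but far more tractable risk B comes out ahead by roughly two orders of magnitude, even though risk A is seven times more likely to occur.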



Google Street View is now online with 20 German cities as well. Unfortunately Gütersloh is not among them yet. It is of course a great pity that even in the cities that are now available there are a few, but nonetheless conspicuous, gaps. This attempt to undermine public space can, however, be counteracted and the damage partly undone. There are lawful (legal) ways to lift or circumvent the blurring and censorship of Google Street View.

As I said, it is fortunately possible to undo the censorship mania around Google Street View that was triggered by the paranoia of the Germans (and their media). Private Street View data can be provided via the photo-sharing website Panoramio. In other words, private georeferenced photographs can be uploaded to Panoramio free of charge, that is, photos that contain data such as the position where they were taken (GPS location information) or the orientation of the camera in three-dimensional space. After some time the service adds such photos as a layer to Google Maps (and Google Earth) and integrates them into Google Street View.

Bypassing Street View blurring with Panoramio

Some further links on the topic:

Finally, here are a few nice comments (tweets) on the German launch of Google Street View and the censorship that came with it:

RT @CineKie: Anyone who has their house blurred on Google #StreetView reveals far more about themselves than the facade ever could have.

RT @andiliciouscom: Brilliant! > Muhaha! This spot shows wonderfully how much sense the blurring in #StreetView makes: http://maps.google.com/maps… (RT @haascore)

RT @Balkonschlaefer: Anyone who wants their house made unrecognizable should call Christo, not Google. #streetview

RT @weckgeschnappt: Blurred houses in the neighbourhood? http://www.computerbild.de/artikel… “How to upload your own pictures to Panoramio” ;) #streetview

P.S.

If the text above strikes you as a little odd or exaggerated, you are right. It is meant to cover as many search terms as possible and thus make many people aware of the option described.

Addendum (2010-11-20):

Addendum (2010-11-22):

Google seems to be rather too afraid of the 3% of the population who want to force their paranoia on everyone else. Or perhaps rather of the old media, which are by now in an existential crisis, such as the WDR (with its audience threatened by extinction), which, in the naive assumption that this will boost its ratings, constantly reports nonsense and thus stirs up trouble against Google? It doesn’t matter anyway; at the latest when Google, among others, soon enters the market with its Internet television, this will come to an abrupt end. So enough of that. In any case, Google quickly changed its integration of Panoramio. You can still use it to get around the censorship, but you now have to click on the photo. The nice integration of private photos into a Street View layer of their own can, for the time being, no longer be seen in its original form. That doesn’t change much, though; below are a few screenshots from Google Earth and Google Maps:

Google Earth (3D / Street View / Panoramio)

Google Streetview & Panoramio

I will keep you posted and update this post when I learn anything new.

