You are currently browsing articles tagged charity.

Cause of this post: the following passage (source),

In November of 2012 I set a goal for myself: find the most x-risk reducing role I can fill. At first I thought it would be by working directly with MIRI, but after a while it became clear that I could contribute more by simply donating. So my goal became: find the highest paying job, so I can donate lots of money to CFAR and MIRI.

Motivation for writing this post: Unclear. Possibly an attempt to remove cognitive load. Further assessment of the underlying motivation is estimated to be more resource expensive than writing the post itself. Future posts are not expected to be triggered by similar motivations. Therefore the expected value of investing the aforementioned resources to further analysis of the underlying motivations is deemed to be unproductive. Everything said so far might partly be rationalization in order to not having to think about the motivation in more detail. At this point further meta evaluation is expected to lead to an infinite regress.

Work put into this post: Quick mind dump.

Epistemic state: Perplexed.


Here is what freaks me out. There are certain very complex issues. For example: (1) what economic model best resembles observed data (2) whether the practical benefits of researching lab-made viruses outweigh the risks of an accidental or deliberate release of a lab-created flu strain (3) the expected value of geoengineering.

For someone to decide #1, and to be confident enough of their ability to judge economic models to subsequently adopt one as a role model in shaping the world, I would at least expect such a person to have studied economics for several years. And even then, based on the complexity of the problem and the frequent failure of experts, calculations of the expected value of taking your model seriously enough to draw action relevant conclusions from it seem to be highly error prone.

Deciding #2 seems to be much more difficult. Studying epidemiology doesn’t seem to be nearly enough to decide what to do in this case. You would need a very good and robust model of applied ethics, rationality and somehow be able to obtain, understand and analyze all the data necessary to evaluate the risk. Which includes such diverse fields as statistics, lab safety, data security and social dynamics. It appears to be nearly impossible for one person to arrive at a definitive conclusion of what to do in this case.

When it comes to #3, a low model uncertainty and an action relevant expected value calculation seem utterly out of reach of any single person. Geoengineering is a very complex climatological, technological, political and ethical issue with far-reaching consequences.

So what about friendly AI? The rationale underlying this issue is an incredibly complex yet vague conjecture about artificial general intelligence, a subject that nobody understands, involving ideas from highly controversial and unsolved fields such as ethics and rationality.

If someone says that they are going to donate lots of money to an organization concerned with researching supposedly <existential risks> associated with <artificial general intelligence> (more here) that is conjectured to be undergoing an <intelligence explosion>, at some unknown point in future, focusing on ensuring some unknown definition of <friendliness>, how likely is it that the person is doing so based on an evidence based and robust expected value calculation?

Almost all of the information available on the underlying issues concerning friendly AI research and the alleged importance of researching the subject have been written by the same people who are asking for money, while the few available opinions of third-party experts are not very favorable. Could anyone have acquired a sufficiently strong grasp of (1) artificial general intelligence (2) ethics (3) rationality, at this point in time, to be confident enough to decide to significantly alter their life by looking for a high paying job in order to support that cause by donating lots of money? I don’t see that at all.

Tags: , ,

“… pointing out that something scary is possible, is a very different thing from having an argument that it’s likely.”

— Ben Goertzel, The Scary Idea (and Why I Don’t Buy It)




I’m wary of using inferences derived from reasonable but unproven hypothesis as foundations for further speculative thinking and calls for action. Although AI risk advocates[1] do a good job on stating reasons to justify their mission and monetary support, they do neither substantiate their initial premises, to an extent that would allow an outsider to draw action-relevant conclusions, nor do they clarify their predictions in a concise and systematic way. Nevertheless predictions are being made, such as that there is a high likelihood of humanity’s demise given that we develop superhuman artificial general intelligence without first defining mathematically how to prove and guarantee its benevolence. But those predictions are not sufficiently supported, no decision procedure is provided on how to arrive at those conclusions and be sufficiently confided of their correctness. This I believe is unsatisfactory, it lacks transparency and does not allow a reassessment. This is not to say that they are wrong to make predictions, but that although those ideas can very well serve as an urge to caution they are not compelling without further substantiation.

AI risk advocates have to set themselves apart from works of science fiction and actually provide some formal analysis of what we know, what conclusions can be drawn and how they relate to predictions about risks associated with artificial general intelligence. There needs to be a risks benefits analysis that shows why AI risk mitigation is the best charitable cause and a way to reassess the results yourself.


AI risk advocates have created a highly complicated framework of speculations to support and reinforce each other.[2]

Although I can follow much of the reasoning and arguments, I’m currently unable to judge their overall credence. Are the conclusions justified? Are the arguments based on firm ground? Would their arguments withstand a critical inspection or examination by a third party, peer review? Are their estimations reflectively stable? How large is the model uncertainty? There is too much vagueness involved to tell.

Are AI risk advocates able to analyse the reasoning that led them to research friendly AI in the first place, or at least substantiate their estimations with other kinds of evidence than a coherent internal logic?

I’m concerned that, although consistently so, AI risk advocates and their supporters are largely updating on fictional evidence.

This post is meant to inquire about the foundations of their basic premises. Are they creating models to treat subsequent models or are their propositions based on fact?

Most of their arguments are based on a few conjectures and presuppositions about the behavior, drives and motivations of intelligent machines[3] and the use of probability and utility calculations to legitimate action.[4]

Explosive recursive self-improvement[5] is one of those presuppositions. The problem is that this and other presuppositions are largely ignored and left undefined. All of the disjunctive arguments put forth by AI risk advocates are trying to show that there are many causative factors that will result in the development of unfriendly[6] artificial general intelligence. Only one of those factors needs to be true for us to be wiped out by an artificial general intelligence. But the whole scenario is at most as probable as the assumption hidden in the words <artificial general intelligence> and <explosive recursive self-improvement>.

<Artificial General Intelligence> and <Explosive Recursive Self-improvement> might appear to be relatively simple and appealing concepts. But most of this superficial simplicity is a result of the vagueness of natural language descriptions. Reducing the vagueness[7] of those concepts by being more specific, or by coming up with technical definitions[8] of each of the words they are made up of, reveals the hidden complexity[9] that is comprised in the vagueness of the terms.

If we were going to define those concepts, and each of its terms, we would end up with a lot of additional concepts made up of other words or terms. Most of those additional concepts will demand explanations of their own, which will in turn result in even more speculation. If we are precise then any declarative sentence used in the final description will have to be true simultaneously. And this does reveal the true complexity of all hidden presuppositions and thereby influence the overall probability. That is because the conclusion of an argument that is made up of a lot of statements (terms) that can be false is more unlikely to be true, since complex arguments can fail in a lot of different ways. You need to support each part of the argument that can be true or false and you can therefore fail to support one or more of its parts, which in turn will render the overall conclusion false.

If the cornerstone of your argumentation, if one of your basic tenets is the likelihood of explosive recursive self-improvement, although a valid speculation, you are already in over your head with debt. Debt in the form of other kinds of evidence.

I am not to saying that it is a false hypothesis, that it is not even wrong, but that you cannot base a whole movement and a huge framework of further inference and supportive argumentation on such premises, on ideas that are themselves not based on firm ground.

The concept of an intelligence explosion, which is itself a logical implication, should not be used to make further inferences and estimations without additional evidence.

The gist of the matter is that a coherent and consistent framework of sound argumentation based on unsupported inference is nothing more than its description implies. It is fiction.

What I ask for

I would like to see AI risk advocates, or someone who is convinced of the scary idea[10][11][12][13], to publish a paper that states concisely and mathematically (and with possible extensive references if necessary) the decision procedure that led they to devote their life[14][15] to the development of friendly artificial intelligence.[16] I want them to state numeric probability estimates[17] and exemplify their chain of reasoning, how they came up with those numbers and not others by way of sober and evidence backed calculations.[18] I would like to see a precise and compelling review of the methodologies AI risk advocates use to arrive at their conclusions.

Concisely, the paper should account for the following issues and uncertainties:

  • The possibility that superhuman AI (artificial (general) intelligence) is too far away to be considered a risk at this time.
  • The possibility that the capability of AI will improve slowly enough for humans to adapt due various small-scale disasters.
  • The possibility that humans are able to create a provably safe environment to reliable contain any AI and thereby impede uncontrollable self-improvement.
  • The possibility that humans will merge with superhuman tools and become competitive to AI.
  • A comparison with other existential risks[19] and how risks from artificial intelligence[20] outweigh them.
  • Show that AI risk mitigation the best charitable cause and does not increase AI risks.[21]
  • Potential negative consequences of slowing down research on artificial intelligence (a risks and benefits analysis).[22][23]
  • The likelihood of a gradual and controllable development versus the likelihood of an intelligence explosion.[24]
  • The likelihood of unfriendly AI versus friendly AI as the outcome of practical AI research.[25]
  • The ability of superhuman intelligence and cognitive flexibility as characteristics alone to constitute a serious risk given the absence of enabling technologies like advanced nanotechnology.[26]
  • The feasibility of “provably non-dangerous AI”.
  • The disagreement of the overwhelming majority of scientists working on artificial intelligence.[27]
  • That some highly intelligent people who are aware of the position of AI risk advocates do not accept it.[28][29][30][31][32][33][34]
  • Possible conclusions that can be drawn from the Fermi paradox[35] regarding risks associated with superhuman AI versus other potential risks ahead.[36][37]

The paper should further answer the following questions and taboo “intelligence”[38] in doing so:

  • How is an AI going to become a master of dark arts[39] and social engineering[40] in order to persuade and deceive humans?
  • How is an AI going to coordinate a large scale conspiracy or deception, given its initial resources, without making any suspicious mistakes along the way?
  • How is an AI going to hack the Internet to acquire more computational resources?
  • Are those computational resources that can be hacked applicable to improve the general intelligence of an AI?
  • Does throwing more computational resources at important problems, like building new and better computational substrates, allow an AI to come up with better architectures so much faster as to outweigh the expenditure of obtaining those resources, without hitting diminishing returns?
  • Does an increase in intelligence vastly outweigh its computational cost and the expenditure of time needed to discover it?
  • How can small improvements replace conceptual revolutions that require the discovery of unknown unknowns?
  • How does an AI brute-force the discovery of unknown unknowns?
  • Is an agent of a given level of intelligence capable of handling its own complexity efficiently?
  • How is an AI going to predict how improvements, respectively improved versions of itself, are going to act, to ensure that its values are preserved?
  • How is an AI going to solve important problems without real-world experimentation and slow environmental feedback?
  • How is an AI going to build new computational substrates and obtain control of those resources without making use of existing infrastructure?
  • How is an AI going to cloak its actions, i.e. its energy consumption etc.?
  • How is an AI going to stop humans from using its own analytic and predictive algorithms in the form of expert systems to analyze and predict its malicious intentions?
  • How is an AI going to protect itself from human counter strikes given the fragility of the modern world and its infrastructure, e.g. without some sort of shellproof internal power supply?

In addition I would like the paper to include and lay out a formal and systematic summary of what AI risk advocates expect researchers who work on artificial general intelligence to do and why they should do so. I would like to see a clear logical argument for why people working on artificial general intelligence should listen to what AI risk advocates have to say.

“A first step is to ask people what it would take to get them to change their mind. If they refuse to give a straight answer, they can’t be taken seriously.” — John Baez

What would it take to increase my confidence that solving friendly AI is the most important problem humanity faces right now and that everyone should either actively work to solve it or contribute money to that particular cause?

To answer that question I will elaborate on some of the above points:

1.) Evidence that the invention of artificial general intelligence is likely to happen within 50-100 years from now.[41]

In other words, show that superhuman AI is not too far away to be considered a risk at this time.

For example:

  • The existence of a robot that could navigate autonomously in a real-world environment and survive real-world threats and attacks with approximately the skill of C. elegans.
  • A machine that can quickly learn to play Go[42] on its own, unassisted by humans, and beat the best human players.

2.) Evidence that the development of artificial general intelligence will take place quickly rather than gradually and slowly.

In other words, show that the capability of artificial general intelligence will improve quickly enough that humans won’t be able to adapt or learn from their mistakes due various small-scale disasters.

For example:

  • A theorem that there likely exists a information theoretically simple, physically and economically realizable, algorithm that can be improved to self-improve explosively.
  • Prove that there likely are no strongly diminishing intelligence returns for additional compute power.[43]

3.) Prove that other problems or existential risks like global warming or advanced molecular nanotechnology are not more likely to wipe us out before the advent of advanced artificial general intelligence.

For example:

  • Show that advanced molecular nanotechnology does not come first, either by being easier than artificial general intelligence or due to it being a prerequisites for an advanced artificial general intelligence to be invented.

4.) Provide an outline of how an artificial intelligence is going to overpower humanity without filling in any gaps by conjecturing some sort of highly speculative technological magic.

In other words, show how an artificial general intelligence is going to create (or acquire) resources, empowering technologies or civilisatory support.

5.) Provide an outline of how current research is supposed to lead from well-behaved and fine-tuned systems to systems that stop to work correctly in a highly complex and unbounded way.

In other words, show that dangerous recursive self-improvement is the default outcome of the creation of artificial general intelligence.

For example:

  • Show how something like expected utility maximization would actually work out in practice.
  • Conclusive evidence that current research will actually lead to the creation of superhuman AI designs equipped with the relevant drives that are necessary to disregard any explicit or implicit spatio-temporal scope boundaries and resource limits.

6.) Prove that trying to solve friendly AI is decreasing rather than increasing the probability of a negative utility outcome.

For example:

  • Prove that getting friendly AI almost but not quite right won’t be worse than an artificial general intelligence that was not explicitly designed to protect human values.

7.) Provide conclusive evidence that there is anything medium-probable that we can do to mitigate the risks associated with artificial general intelligence.

In other words, show that contributing money can make a difference at this time.

Further Reading

Notes and References



[3] Bostrom on Superintelligence and Orthogonality

[4] A reply to John Baez



[7] The “no self-defeating object” argument, and the vagueness paradox

[8] The Advantages of Being Technical


[10] If anyone who is actively trying to build advanced artificial general intelligence succeeds, we’re highly likely to cause an involuntary end to the human race.

[11] SL4 comment



[14] Video Q&A with Eliezer Yudkowsky

[15] Eliezer Yudkowsky’s advice for Less Wrong readers who want to help save the human race.


[17] “Stop taking the numbers so damn seriously, and think in terms of subjective probability distributions, discard your mental associates between numbers and absolutes, and my choice to say a number, rather than a vague word that could be interpreted as a probability anyway, makes sense. Working on, one of the things I appreciated the most were experts with the intelligence to make probability estimates, which can be recorded, checked, and updated with evidence, rather than vague statements like “pretty likely”, which have to be converted into probability estimates for Bayesian updating anyway. Futurists, stick your neck out! Use probability estimates rather than facile absolutes or vague phrases that mean so little that you are essentially hedging yourself into meaninglessness anyway.” — Michael Anissimov ( mailing list, 2010-07-11)

[18] Asteroid Deflection as a Public Good

[19] Say you believe that unfriendly AI will wipe us out with a probability of 60% and that there is another existential risk that will wipe us out with a probability of 10% even if unfriendly AI turns out to be no risk or in all possible worlds where it comes later. Both risks have the same utility x (if we don’t assume that an unfriendly AI could also wipe out aliens etc.). Thus .6x > .1x. But if the probability of solving friendly AI = A to the probability of solving the second risk = B is A ≤ (1/6)B then the expected utility of mitigating friendly AI is at best equal to the other existential risk because .6Ax ≤ .1Bx.

Consider that one order of magnitude more utility could easily be outweighed or trumped by an underestimation of the complexity of friendly AI.

So how hard is it to solve friendly AI?

Take for example Pascal’s mugging, if you can’t solve it then you need to implement a hack that is largely based on human intuition. Therefore, in order to estimate the possibility of solving friendly AI one needs to account for the difficulty in solving all sub-problems.

Consider that we don’t even know “how one would start to research the problem of getting a hypothetical AGI to recognize humans as distinguished beings.”


[21] By trying to solve friendly AI, AI risk advocates have to think about a lot of issues related to AI in general and might have to solve problems that will make it easier to create artificial general intelligence.

It is far from being clear that organisations working on the problem of AI risks are able to protect their findings against intrusion, betrayal, industrial or espionage.

There further are several possibilities by which AI risk advocates could actually cause a direct increase in negative utility.

1.) Friendly AI is incredible hard and complex. Complex systems can fail in complex ways. Agents that are an effect of evolution have complex values. To satisfy complex values you need to meet complex circumstances. Therefore any attempt at friendly AI, which is incredible complex, is likely to fail in unforeseeable ways. A half-baked, not quite friendly, AI might create a living hell for the rest of time, increasing negative utility dramatically.

2.) Humans are not provably friendly. Given the power to shape the universe certain organisations might fail to act altruistic and deliberately implement an AI with selfish motives or horrible strategies.

[22] Could being overcautious be itself an existential risk that might significantly outweigh the risk(s) posed by the subject of caution? Suppose that most civilizations err on the side of caution. This might cause them to either evolve much slower, so that the chance of a fatal natural disaster to occur before sufficient technology is developed to survive it rises to 100%, or stops them from evolving at all for being unable to prove something being sufficiently safe before trying it and thus never taking the necessary steps to become less vulnerable to naturally existing existential risks. 

[23] Why safety is not safe


[25] Implicit Constraints of Practical Goals: intelligence probably implies benevolence (See also: The Fallacy of Dumb Superintelligence)



[28] Thoughts on the Singularity Institute (SI)




[32] John Baez, What To Do?

[33] Pascal’s scams

[34] How far can AI jump?


[36] SIA says AI is no big threat

[37] The Fermi paradox does allow for and provide the only conclusions and data we can analyze that amount to empirical criticism of concepts like that of a Paperclip maximizer and general risks from superhuman AI’s with non-human values without working directly on artificial general intelligence to test those hypothesis ourselves. If you accept the premise that life is not unique and special then one other technological civilization in the observable universe should be sufficient to leave potentially observable traces of technological tinkering. Due to the absence of any signs of intelligence out there, especially paper-clippers burning the cosmic commons, we might conclude that unfriendly AI could not be the most dangerous existential risk that we should worry about.

[38] If you are unable to answer those questions other than by invoking intelligence as some sort of magic that makes all problems disappear, the scenario that you envision is nothing more than pure fantasy!

You can’t estimate the probability and magnitude of the advantage an AI will have if you are using something that is as vague as the concept of “intelligence”.

Here is a case that bears some similarity and which might shed light on what I am trying to explain:

At his recent keynote speech at the New York Television Festival, former Star Trek writer and creator of the re-imagined Battlestar Galactica Ron Moore revealed the secret formula to writing for Trek.

He described how the writers would just insert “tech” into the scripts whenever they needed to resolve a story or plot line, then they’d have consultants fill in the appropriate words (aka technobabble) later.

“It became the solution to so many plot lines and so many stories,” Moore said. “It was so mechanical that we had science consultants who would just come up with the words for us and we’d just write ‘tech’ in the script. You know, Picard would say ‘Commander La Forge, tech the tech to the warp drive.’ I’m serious. If you look at those scripts, you’ll see that.”

Moore then went on to describe how a typical script might read before the science consultants did their thing:

La Forge: “Captain, the tech is overteching.”

Picard: “Well, route the auxiliary tech to the tech, Mr. La Forge.”

La Forge: “No, Captain. Captain, I’ve tried to tech the tech, and it won’t work.”

Picard: “Well, then we’re doomed.”

“And then Data pops up and says, ‘Captain, there is a theory that if you tech the other tech … ‘” Moore said. “It’s a rhythm and it’s a structure, and the words are meaningless. It’s not about anything except just sort of going through this dance of how they tech their way out of it.”

The use of “intelligence” is as misleading and dishonest in evaluating risks from AI as the use of “tech” in Star Trek.



[41] How far is AGI?


[43] “…there could be non-linear complexity constrains meaning that even theoretically optimal algorithms experience strongly diminishing intelligence returns for additional compute power.” Q&A with Shane Legg on risks from AI

Tags: , , ,

(I originally published this article on, 07 November 2011)

The following is a clipping of a documentary about transhumanism that I recorded when it aired on Arte, September 22 2011.

At the beginning and end of the video Luke Muehlhauser and Michael Anissimov give a short commentary.

Download here: German, French (ask for HD download link). Should play with VLC player.

Sadly, the people who produced the show seemed to be somewhat confused about the agenda of the Singularity Institute. At one point they seem to be saying that the SIAI believes into “the good in the machines”, adding “how naive!”, while the next sentence talks about how the SIAI tries to figure out how to make machines respect humans.

Here is the original part of the clip that I am talking about:

In San Francisco glaubt eine Vereinigung ehrenamtlicher junger Wissenschaftler dennoch an das Gute im Roboter. Wie naiv! Hier im Singularity Institute, dass Kontakte zu den großen Unis wie Oxford hat, zerbricht man sich den Kopf darüber, wie man zukünftigen Formen künstlicher Intelligenz beibringt, den Menschen zu respektieren.

Die Forscher kombinieren Daten aus Informatik und psychologischen Studien. Ihr Ziel: Eine Not-to-do-Liste, die jedes Unternehmen bekommt, das an künstlicher Intelligenz arbeitet.

My translation:

In San Francisco however, a society of young voluntary scientists believes in the good in robots. How naive! Here at the Singularity Institute, which has a connection to big universities like Oxford, they think about how to teach future artificial intelligences to respect humans.

I am a native German speaker by the way, maybe someone else who speaks German can make more sense of it (and is willing to translate the whole clip).

Tags: ,