existential risks


Cause of this post: the following passage (source),

In November of 2012 I set a goal for myself: find the most x-risk reducing role I can fill. At first I thought it would be by working directly with MIRI, but after a while it became clear that I could contribute more by simply donating. So my goal became: find the highest paying job, so I can donate lots of money to CFAR and MIRI.

Motivation for writing this post: Unclear. Possibly an attempt to remove cognitive load. Further assessment of the underlying motivation is estimated to be more resource-intensive than writing the post itself, and future posts are not expected to be triggered by similar motivations. Therefore investing the aforementioned resources in further analysis of the underlying motivation is deemed unproductive. Everything said so far might partly be rationalization in order to avoid having to think about the motivation in more detail. At this point further meta-evaluation is expected to lead to an infinite regress.

Work put into this post: Quick mind dump.

Epistemic state: Perplexed.


Here is what freaks me out. There are certain very complex issues. For example: (1) what economic model best resembles observed data (2) whether the practical benefits of researching lab-made viruses outweigh the risks of an accidental or deliberate release of a lab-created flu strain (3) the expected value of geoengineering.

For someone to decide #1, and to be confident enough of their ability to judge economic models to subsequently adopt one as a guide to shaping the world, I would at least expect such a person to have studied economics for several years. And even then, given the complexity of the problem and the frequent failure of experts, calculations of the expected value of taking your model seriously enough to draw action-relevant conclusions from it seem highly error-prone.

Deciding #2 seems to be much more difficult. Studying epidemiology doesn’t seem to be nearly enough to decide what to do in this case. You would need a very good and robust model of applied ethics and rationality, and you would somehow have to obtain, understand and analyze all the data necessary to evaluate the risk, which spans such diverse fields as statistics, lab safety, data security and social dynamics. It appears nearly impossible for one person to arrive at a definitive conclusion about what to do in this case.

When it comes to #3, a low model uncertainty and an action relevant expected value calculation seem utterly out of reach of any single person. Geoengineering is a very complex climatological, technological, political and ethical issue with far-reaching consequences.

So what about friendly AI? The rationale underlying this issue is an incredibly complex yet vague conjecture about artificial general intelligence, a subject that nobody understands, involving ideas from highly controversial and unsolved fields such as ethics and rationality.

If someone says that they are going to donate lots of money to an organization concerned with researching supposedly <existential risks> associated with <artificial general intelligence> (more here) that is conjectured to undergo an <intelligence explosion> at some unknown point in the future, focusing on ensuring some unknown definition of <friendliness>, how likely is it that the person is doing so based on an evidence-based and robust expected-value calculation?

Almost all of the information available on the underlying issues concerning friendly AI research, and on the alleged importance of researching the subject, has been written by the same people who are asking for money, while the few available opinions of third-party experts are not very favorable. Could anyone have acquired a sufficiently strong grasp of (1) artificial general intelligence (2) ethics (3) rationality, at this point in time, to be confident enough to significantly alter their life by looking for a high-paying job in order to support that cause by donating lots of money? I don’t see how.
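The fragility of such an expected-value calculation can be made concrete. If the case for the donation rests on a conjunction of claims, each granted a generous probability, the joint probability collapses quickly. A minimal sketch, with purely illustrative numbers (none of them are estimates from this post):

```python
# Hypothetical probabilities for the claims the donation argument depends on.
# All numbers are illustrative, not estimates from this post.
claims = {
    "AGI is feasible": 0.8,
    "an intelligence explosion follows": 0.5,
    "the default outcome is catastrophic": 0.5,
    "'friendliness' can be defined and implemented": 0.5,
    "this organization's research makes the difference": 0.3,
}

# Treating the claims as independent, the case requires all of them
# to hold, so the probabilities multiply.
joint = 1.0
for claim, p in claims.items():
    joint *= p

print(f"joint probability of the full conjunction: {joint:.3f}")
```

Even with each claim rated at coin-flip odds or better, the conjunction lands at 3%, and the calculation is only as robust as its weakest, most speculative factor.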


Yet another article about existential risks repeating the usual cached thoughts:

People who worry about these things often say that the main threat may come from accidents involving “dumb optimizers” – machines with rather simple goals (producing IKEA furniture, say) that figure out that they can improve their output astronomically by taking control of various resources on which we depend for our survival. Nobody expects an automated furniture factory to do philosophy. Does that make it less dangerous? (Would you bet your grandchildren’s lives on the matter?)

First of all, we are computationally bounded and cannot afford to take into account highly specific, conjunctive, non-evidence-backed speculations about possible bad outcomes. And even if that were feasible, it does not work out in practice.

Anyway, the above quote again exemplifies the dangers of jumping to conclusions. Some sort of black box full of technological magic is conjectured; from it, unwarranted assumptions are inferred, which are then used to draw action-relevant conclusions.

To correctly estimate risks associated with artificial intelligence it is important to take into account real world research and development processes and to pinpoint specific failure modes. It is important to narrow down on how specifically an artificial intelligence is supposed to behave in a catastrophic way by taking apart the mode of operation of the magic black box and the assumptions hidden in words such as <artificial general intelligence> and <explosive recursive self-improvement>. It is important to show how specifically it is possible to arrive at such a scenario by avoiding quantum leaps in thinking about complex scenarios and to instead approach those scenarios incrementally to locate the alleged tipping-point where a well-behaved system starts to act in a catastrophic yet highly complex and intelligent way.

How many different scenarios can you come up with where an artificial intelligence causes an extinction type event if you have to do so in an incremental fashion and have to take into account the real-world research and development process leading up to such a system?

Don’t just assume vague ideas such as <explosive recursive self-improvement>; try to approach the idea in a piecewise fashion. Start out with some narrow AI such as IBM Watson or Apple’s Siri, or from scratch if you like, and add various hypothetical self-improvement capabilities, but avoid quantum leaps. Try to locate the point at which those systems start acting in an unbounded fashion, possibly influencing the whole world in a catastrophic way. And if you manage to locate such a tipping point, take it apart even further: start over and take even smaller steps, be more specific. How exactly did your well-behaved expert system end up being an existential risk?

The purpose is to break free from recalling old conclusions and to start thinking for yourself in a concrete and specific fashion rather than participating in furious handwaving.

At some point software is going to write new, unique and better software all by itself. But that will not happen overnight; there will be a complex developmental and evolutionary process leading up to that outcome. Only by conjecturing the outcome independently of its origin can you imagine software that makes better software irrespective of human intention.

Tags: ,

There are no alien programs. No programs are generated from random noise. All current software obeys human commands, directly or indirectly; those commands are either hardcoded or entered later.

Mistakes are made. Sometimes a program does something that was not intended. Often such failures result in a crash or some other obstruction of the program’s own workings. Yet software is constantly improved.

If software weren’t constantly improved to be better at doing what humans intend it to do, would we ever reach a level of sophistication where software could work well enough to outsmart us? To do so it would have to work as intended along a huge number of dimensions.

Avoid quantum leaps, be specific

Imagine some hypothetical railroad management system that “keeps the trains running”. A so-called expert system; a narrow artificial intelligence. It keeps the trains on schedule. It checks that no two trains interfere with each other. It analyzes data from sensors attached to the trains that scan the rails for possible weaknesses or other defects. It even uses cameras to watch railroad crossings for possible obstructions. It further accepts input from the train personnel about possible delays or emergencies.

Now suppose the railway company wanted to improve the system and hired an artificial general intelligence (AGI) researcher to do the job.

To detect what exactly might cause the system to behave badly, and to avoid making unwarranted assumptions or ascribing human behavior to the process, we’ll assume that the system is improved incrementally rather than being replaced all at once.

The first upgrade is a replacement of the current mainframe with a sophisticated supercomputer. For now this upgrade has no effect, since the software hasn’t been changed other than being adapted to the new computational infrastructure.

The next upgrade concerns the input system that allows the train personnel to submit delays and emergencies. The previous input method was to press one of two buttons, one for a delay and one for an emergency. The buttons have been replaced by a microphone that feeds into a sophisticated natural-language interpretation module, which parses any delay or emergency message uttered in natural language and, upon detection, returns the same data that would have been returned if someone had pressed the buttons instead.

Further upgrades include, for example, an advanced visual pattern-recognition module that uses the camera feeds to detect possibly dangerous humans inside the trains or near the rails and notifies the railway police, and a drone armada that roams the railroad stations to provide service information to humans and watch for security breaches.

At some point the hired AGI researcher decides it is time to implement something more sophisticated. The program will be able to simulate other possible timetables and look for improvements based on previous delays, ticket sales and data from its other sensors, such as service requests from people at the railroad stations. If it finds an improved timetable it can autonomously decide to use the new timetable, test it against the real world and make further improvements.


I think you can see where this is going. You can add further upgrades until the system reaches human or even superhuman capabilities. At some point it would make sense to call it an artificial general intelligence.
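The upgrade path above can be restated as ordinary software engineering: each step replaces or adds one bounded module with specified inputs and outputs. A minimal sketch (all module names are hypothetical); the exercise is to name the specific step at which catastrophic behavior could enter:

```python
# The railroad system's upgrade path, restated as ordinary software
# engineering. All module names are hypothetical.

modules = [
    "scheduler",         # keeps the trains on schedule
    "collision_check",   # ensures no two trains interfere
    "rail_sensors",      # scans the rails for weaknesses and defects
    "crossing_cameras",  # watches railroad crossings for obstructions
    "two_button_input",  # delay/emergency buttons for the personnel
]

def upgrade(modules, old, new):
    # Swap one component for another with the same inputs and outputs.
    modules[modules.index(old)] = new

# Upgrade 1 (the new mainframe) changes no module at all.
# Upgrade 2: the speech interface returns the same data the buttons did.
upgrade(modules, "two_button_input", "speech_interface")
# Further upgrades add bounded components one at a time.
modules.append("pattern_recognition")  # notifies the railway police
modules.append("service_drones")       # station information and security
modules.append("timetable_optimizer")  # simulates and tests timetables

# At which upgrade() or append() does unbounded, catastrophic behavior
# enter? That is the question the thought experiment poses.
print(modules)
```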

I will spare myself writing out further upgrades here. But feel free to continue, as long as you are not making any unwarranted, vague, unspecific leaps.

The fog of vagueness

It is incredibly easy to simply conjecture that turning any system into, or replacing it with, an artificial general intelligence will cause it to go berserk and kill all humans, kill all aliens in the observable universe, hack the matrix to prevent the simulator gods from shutting down the simulation, or give in to the first Pascal’s mugger offering to let it “keep the trains running” forever. But once you have to come up with a concrete scenario and outline specifically how that is supposed to happen, you’ll notice that you never actually reach such a tipping point, as long as you do not deliberately design the system to behave that way.

The only way you can arrive at a scenario where an artificial general intelligence kills all humans is by being vague and unspecific, by ignoring real-world development processes, and by using natural language to describe some sort of fantasy scenario that invokes lots of technological magic.

Don’t be fooled by AI risk advocates hiding behind vague assertions. What those people do is cherry-pick certain science-fictional capabilities of a conjectured artificial intelligence while completely disregarding the developmental stages and evolutionary processes leading up to such an intelligence.

Vagueness Explosion

Take for example the original idea of an intelligence explosion (emphasis mine):

Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultra-intelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion,” and the intelligence of man would be left far behind.

— I.J. Good, “Speculations Concerning the First Ultraintelligent Machine”

The whole argument is worthless rubbish because it is unspecific and vague to an extent that allows one to draw completely unwarranted, non-evidence-based conclusions.

Others are better than me at explaining what is wrong here, so I’ll quote:

More generally, many of the objects demonstrated to be impossible in the previous posts in this series can appear possible as long as there is enough vagueness.  For instance, one can certainly imagine an omnipotent being provided that there is enough vagueness in the concept of what “omnipotence” means; but if one tries to nail this concept down precisely, one gets hit by the omnipotence paradox.  Similarly, one can imagine a foolproof strategy for beating the stock market (or some other zero sum game), as long as the strategy is vague enough that one cannot analyse what happens when that strategy ends up being used against itself.  Or, one can imagine the possibility of time travel as long as it is left vague what would happen if one tried to trigger the grandfather paradox.  And so forth.  The “self-defeating” aspect of these impossibility results relies heavily on precision and definiteness, which is why they can seem so strange from the perspective of vague intuition.

— Terence Tao, “The “no self-defeating object” argument, and the vagueness paradox”

Let’s try to restate I.J. Good’s original idea without some of the vagueness:

Let there be something that can far surpass all the activities of any man. Since design is one of these activities, something better could design even better; there would then unquestionably be an “explosion,” and man would be left far behind.

At best we’re left with a tautology. Nothing more specific can be drawn from the argument than that something that is better is better. No conclusions about the nature of that something can be drawn: not whether it is logically possible at all, not whether it is physically possible, let alone economically realizable. And even if it is possible in all of those senses, the idea does not provide any insight into how likely an explosion is, when it will occur, what its nature will be, or how it will happen. We don’t even know how that initial something that is better is supposed to be created in the first place.
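One way to see what the argument leaves unspecified: whether repeated self-improvement "explodes" depends entirely on the assumed returns per design cycle, a parameter Good's argument never names. A toy model with purely illustrative numbers:

```python
# Toy model: capability after repeated self-improvement cycles.
# Whether an "explosion" occurs depends entirely on the assumed
# returns per cycle. All numbers are illustrative.

def run(gain, cycles=50, capability=1.0):
    # gain(n) is the multiplicative improvement achieved in cycle n.
    for n in range(cycles):
        capability *= gain(n)
    return capability

# Constant returns: every cycle yields a fixed 20% gain.
constant = run(lambda n: 1.2)

# Diminishing returns: gains shrink quadratically with each cycle.
diminishing = run(lambda n: 1 + 0.2 / (n + 1) ** 2)

print(f"constant returns after 50 cycles:    {constant:.1f}")
print(f"diminishing returns after 50 cycles: {diminishing:.3f}")
```

Constant returns give exponential growth; rapidly diminishing returns give a plateau barely above the starting point. The argument itself provides no way to decide which regime applies.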

Yet it is possible to take that tautology, extend it indefinitely, and use it to infer further speculative conclusions. And if someone has doubts, you can just repeat that something that is better is better, and the gullible will follow you in droves. But don’t get any more specific, or the emptiness of your claims will be revealed.




Link: edge.org/responses/q2013

When EDGE asked what we should be worried about, apparently some people used the opportunity to state that we should not worry about artificial intelligence:

“The Singularity”: There’s No There There

Bruce Sterling

Science Fiction and Fantasy Writers of America

So, as a Pope once remarked, “Be not afraid.” We’re getting what Vinge predicted would happen without a Singularity, which is “a glut of technical riches never properly absorbed.” There’s all kinds of mayhem in that junkyard, but the AI Rapture isn’t lurking in there. It’s no more to be fretted about than a landing of Martian tripods.

Super-A.I.s Won’t Rule The World (Unless They Get Culture First)

Andy Clark

Philosopher and Cognitive Scientist, University of Edinburgh; Author: Supersizing the Mind: Embodiment, Action, and Cognitive Extension

The last decades have seen fantastic advances in machine learning and robotics. These are now coupled with the availability of huge and varied databases, staggering memory capacities, and ever-faster and funkier processors. But despite all that, we should not fear that our Artificial Intelligences will soon match and then rapidly outpace human understanding, turning us into their slaves, toys, pets or puppets.

For we humans benefit from one gigantic, and presently human-specific, advantage. That advantage is the huge yet nearly invisible mass of gradually accrued cultural practices and innovations that tweak and pummel the inputs that human brains receive. Those gradually accrued practices are, crucially, delicately keyed to the many initial biases, including especially biases for sociality, play and exploration, installed by the much slower processes of biological evolution. In this way a slowly accumulated mass of well-matched cultural practices and innovations ratchets up human understanding.

But there are also those that do worry:

Life As We Know It

Max Tegmark

Physicist, MIT; Researcher, Precision Cosmology; Scientific Director, Foundational Questions Institute

…if there’s even a 1% chance that there’ll be a singularity in our lifetime, I think a reasonable precaution would be to spend at least 1% of our GDP studying the issue and deciding what to do about it. Yet we largely ignore it, and are curiously complacent about life as we know it getting transformed. What we should be worried about is that we’re not worried.

Unknown Unknowns

Gary Marcus

Cognitive Scientist; Author, Guitar Zero: The New Musician and the Science of Learning

The truth is that we simply don’t know enough about the potential of biotechnology, nanotechnology, or future iterations of artificial intelligence to calculate what their risks are; compelling arguments have been made that in principle any of the three could lead to human extinction. These risks may prove manageable, but I don’t think we can manage them if we don’t take them seriously. In the long run, biotech, nanotech and AI are probably significantly more likely to help the species, by increasing productivity and limiting disease, than they are to destroy it. But we need to invest more in figuring out exactly what the risks are, and to better prepare for them. Right now, the US spends more than $2.5 billion a year studying climate change, but (by my informal reckoning) less than 1% of that total studying the risk of biotech, nanotech, and AI.

And most interestingly:

We Are In Denial About Catastrophic Risks

Martin Rees

Former President, The Royal Society; Emeritus Professor of Cosmology & Astrophysics, University of Cambridge; Master, Trinity College; Author, From Here to Infinity

I’m worried that by 2050 desperate efforts to minimize or cope with a cluster of risks with low probability but catastrophic consequences may dominate the political agenda.


Imagine a world that was using prediction markets to determine the best policies to adopt.

Scenario 1: 

Suppose that a company like Google, or another big global player, was interested in starting a major artificial general intelligence project.

Would such a player be interested in seeing their project being hampered, slowed down or even disabled by some safety policy? Possibly not. What would be the result in a world run by prediction markets? Such players would possibly bet a lot of money that artificial general intelligence research is safe and that unfriendly AI is unlikely.

Scenario 2:

Suppose some AI risk advocate went ahead and bet that unfriendly AI is more likely than friendly AI. What incentive would critics have to bet against them, hoping to win rather than lose money?

At no point would the advocates agree that the prediction has been falsified, because the technological singularity is always near: the next generation of AI might always turn out to be unfriendly.

The prediction is undecidable because those who made the prediction do not anticipate any difference between world states where they are right versus world states where they are wrong. There is no possibility to update on evidence.
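The incentive failure can be put in expected-value terms: a contract only pays out when it resolves, so if the probability of resolution within any horizon is near zero, the expected return on betting against the claim is negligible and the stake just sits locked up. A minimal sketch with hypothetical numbers:

```python
# Expected annualized return on a prediction-market contract, as a
# function of how likely the contract is to actually resolve.
# All numbers are hypothetical.

def annualized_return(payout_if_right, stake, p_resolves_per_year):
    # Profit arrives only if the contract resolves within the year;
    # otherwise the stake remains locked in the market.
    expected_profit = p_resolves_per_year * (payout_if_right - stake)
    return expected_profit / stake

# An ordinary contract that resolves within a year about half the time.
decidable = annualized_return(payout_if_right=1.0, stake=0.5,
                              p_resolves_per_year=0.5)

# An "unfriendly AI" contract whose advocates never concede resolution.
undecidable = annualized_return(payout_if_right=1.0, stake=0.5,
                                p_resolves_per_year=0.001)

print(f"ordinary contract:    {decidable:.1%} expected annual return")
print(f"undecidable contract: {undecidable:.2%} expected annual return")
```

With a near-zero chance of resolution, even a critic who is certain they are right earns effectively nothing per year on locked-up capital, so the market offers them no incentive to correct the price.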


When it comes to policies related to existential risks, the technological singularity or other futuristic and undecidable hypotheticals, the incentive provided by prediction markets, money, is rendered moot.

Since prediction markets break down on important issues such as existential risks, using them might actually be detrimental. Not only will such markets turn out to be a huge money sink when it comes to undecidable predictions; the owners of those markets might even favor such predictions, since they allow the market to take in ever more money without the risk of ever having to pay anyone out.

Most importantly, rich and possibly biased people and corporations could manipulate such markets by betting huge amounts of money on undecidable predictions, thereby shifting policies arbitrarily.


I am not against prediction markets; they are a great idea. I am just pointing out some possible problems, and especially cautioning against very unspecific, far-out predictions that offer no way to update on evidence short of the final validation of the prediction itself.

