Interview with Michael Littman on AI risks

This is a follow-up interview with professor of computer science Michael Littman[1][2] about artificial intelligence and the possible risks associated with it.

The Interview

Q1: You have been an academic in AI for more than 25 years, during which time you mainly worked on reinforcement learning.[3][4][5] What are you currently working on and what are your plans for the future?

Michael Littman: My first paper, which I worked on with Dave Ackley in 1989, was called “Learning from natural selection in an artificial environment”. Recently, I’ve started to come back to the question we looked at in that paper—essentially, what should a learning algorithm try to optimize so that the resulting behavior is as “fit” as possible? Most reinforcement-learning research doesn’t make a distinction between the agent’s reward function and its actual task, but Satinder Singh[6] and his colleagues recently provided some evidence that it is conceptually useful to separate these two ideas and ask how to create a reward function that encourages an agent to excel at a task other than the one literally specified by the reward function.

In a way, it is a similar question to the control problem[7], but in a much less sinister context—we need a way of telling machines what we want them to do. I’m focused on end users, people without significant programming experience, and am looking at combinations of inverse reinforcement learning, good interface design, and more natural programming models that are easy to pick up. My collaborators and I are looking at these questions in the context of programming household devices (lights and thermostats) as well as with robots.

Q2: In a previous interview[8] you wrote that P(human extinction caused by badly done AI | badly done AI) is epsilon. You also voiced some skepticism about friendly AI[9] (a machine superintelligence that stably optimizes for humane values). Now that you have read Nick Bostrom’s book[10], ‘Superintelligence: Paths, Dangers, Strategies’, have you learnt something that changed your opinion, or caused you to interpret the questions differently?

Michael Littman: I was very impressed with Nick Bostrom’s book. It’s exquisitely thought out and I found the scope (in terms of coverage of micro and macro scales in both space and time) truly remarkable. That being said, I do not find the central premise—that we are in the process of bringing the ominous owl on the book’s cover into our midst—compelling. Note that I didn’t voice skepticism about friendly AI but about *provably* friendly AI. I’d argue that you can’t prove things about the real world, only about abstractions.

Q3: What is the current level of awareness of Nick Bostrom’s work and arguments within the field of AI, and do you recommend that people working to advance artificial intelligence read his book?

Michael Littman: My guess is that the engagement of most AI researchers is at the level of friends and colleagues alerting them to the highly public statements of notable individuals like Musk (“summoning the demon”)[11] and Gates (“I don’t understand why some people are not concerned”)[12]. I think the field is well aware of the idea of the singularity, but not familiar with the subtleties and the depth of Bostrom’s work in this context. That being said, I do not think mainstream AI research is seriously dabbling with the idea of recursive self-improvement[13] and, as such, Bostrom’s book seems like a pretty significant departure from their core interests and direction.

Q4: In an email you wrote that you believe the main disagreement between you and Nick Bostrom et al. to be whether an intelligence explosion[14][15][16][17][18][19][20][21][22][23] is a non-negligible consequence of AI research. In 2011 you wrote that the probability of a human-level artificial general intelligence (AGI) self-modifying its way up to massive superhuman intelligence in less than 5 years is essentially zero. (Addendum: In a previous interview he also wrote that P(superhuman intelligence within < 5 years | human-level AI running at human-level speed equipped with a 100 Gigabit Internet connection) = 1%, possibly misinterpreting the question I cited as P(superhuman intelligence within < 5 years).) Some people would call you overconfident.[24][25] Can you elaborate on the reasons underlying your estimate?

Michael Littman: I find your use of the word “overconfident” there to be quite interesting. I’m very interested in the problem of AGI and would love to be a part of the community that brings it about. An overconfident person, to me, would be someone who believes he or she can solve this problem in 5 years. More to your point, though, I don’t see massive superhuman intelligence as something that is meaningful outside a specific cultural context. The development of what we might call massive superhuman intelligence will be an evolutionary process involving changes in the social, physical, and intellectual fabric on which our society is built. Changes like that take time.

Q5: Elon Musk has recently donated $10M to keep AI beneficial.[26] Consider someone whose goal is to maximize how much good they do[27], where “good” is defined as improving the world in order to reduce suffering and help humanity flourish. Do you believe that donating money in order to reduce risks associated with artificial intelligence (not just extinction-type risks) might currently be an effective way to accomplish this goal?

Michael Littman: As you know, a number of my colleagues (including my dissertation advisor and many other colleagues for whom I have tremendous respect) signed an open letter[28] hosted by the Future of Life Institute calling for more attention to reducing risks associated with AI. I’ve followed up with a few of them and the most prevalent attitude is that AI, like all technologies, carries significant risks to society. At that level, I agree wholeheartedly that keeping technologists and scientists tuned in to the societal impacts of their work is exceedingly important. So, yes, I feel that supporting research on societal impacts of technology—including artificial intelligence—is a good investment for good.

However, if the risks we’re talking about are of the type detailed in Bostrom’s book—human-independent AI competing directly with humanity for control of our destiny—I don’t think that should be a high priority.

Q6: In another email you wrote that your personal takeaway from all this is to work harder to understand what intelligence *is*. What do you think about using, for example, Hutter’s specification of AIXI[29] as a model for AGI? Or, asked more generally, do you think it is possible to work on AGI safety, or a formal definition of it, without researching and advancing AGI at the same time?

Michael Littman: I think the idea of seriously studying AGI safety in the absence of an understanding of AGI is futile. At a high level, raising awareness and scoping out possibilities is fine. But proposing specific mechanisms for combatting this amorphous threat is a bit like trying to engineer airbags before we’ve thought of the idea of cars. Safety has to be addressed in context and the context we’re talking about is still absurdly speculative.

Q7: D. Scott Phoenix, co-founder of the A.I. startup Vicarious, recently wrote[30] that artificial superintelligence isn’t something that will be created suddenly or by accident. He further wrote that there will be a long iterative process of learning how these systems can be created and the best way to ensure that they are safe. What probability do you assign to the possibility that he is wrong, that either human-level or superhuman AGI will appear too quickly for us to ensure its safety if we don’t start working on the problem right now? Note that this question pertains to whether the initial invention or emergence of AGI will take us by surprise, rather than to the speed of its subsequent improvement or self-improvement.

Michael Littman: I agree with the perspective that it’s a long iterative process. I believe that the very notion of what we think intelligence *is* and what it is *for* will evolve significantly through this process. I think we’ll look back on this time much as we look back on earlier times, stunned at the naivety of our working hypotheses and surprised by our obliviousness to the fact that what we now take as a given is not only not given, but flat out wrong. If people are comfortable claiming that we know enough about intelligence today to extrapolate what superintelligence would be, it would be my turn to use the word “overconfident”.

See also

Recent commentary on AI risks by experts and others

Earlier commentary on AI risks

[7] The control problem: how to keep future superintelligences under control. Some AI risk advocates claim that rather than trying to limit what an AI can do, we have to engineer its motivation system in such a way that it would choose not to do harm. One of the reasons underlying this claim is that a superintelligent AI would probably break free from any bonds we construct.

[14] Intelligence Explosion Microeconomics

[15] Intelligence Explosion: Evidence and Import

[16] Why an Intelligence Explosion is Probable

[17] Can Intelligence Explode?

[18] The Singularity: A Philosophical Analysis

[19] Cascades, Cycles, Insight…

[20] …Recursion, Magic

[21] Recursive Self-Improvement

[22] Hard Takeoff

[23] Permitted Possibilities, & Locality

[24] Suppose that near certainty in your ability to assess a set of propositions corresponds to a 1-in-a-million chance of being wrong about any particular proposition. This means that, given a million similar statements, you would have to be correct (on average) about 999,999 of them while being wrong only once. Can you possibly be this accurate?
