Slate Magazine has published an article on Roko’s basilisk.
I especially like the following quote, which captures the problem very well:
I worry less about Roko’s Basilisk than about people who believe themselves to have transcended conventional morality.
Wait, you thought the problem was Roko’s basilisk? Not at all. It’s just one crazy thought experiment that no sane person takes seriously. The real problem, the problem that Roko’s basilisk highlights, is the hazardous mindset propagated by LessWrong: a mindset that fosters such crazy ideas.
You might at this point object that LessWrong does not take Roko’s basilisk seriously. Maybe most members don’t. Yet all of the premises leading up to Roko’s basilisk are propagated by LessWrong.
So why do I call LessWrong’s mindset dangerous? Easy! Consider a world in which everyone adopted the mindset promoted by LessWrong, a mindset that in essence boils down to an awfully naive mix of consequentialism and expected utility maximization, in conjunction with a belief in the implied invisible (logical implications). In such a world, people would make decisions influenced by the following beliefs:
(1) The ability to shut up and multiply, to trust the math even when it feels wrong, is a key rationalist skill.
(2) It is a moral imperative to cause harm to a minority if the expected benefit for the majority is large enough.
(3) If a future galactic civilization could hypothetically depend on your current decisions, then you need to account for its expected value in your calculations and draw action-relevant conclusions from them (see the sketch below).
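To make concrete why I find belief (3) so troubling, here is a minimal back-of-the-envelope sketch in Python. The numbers are my own illustrative stand-ins, not anything LessWrong or MIRI has published; the point is only that once an astronomically large payoff enters the calculation, a naive expected-value comparison is dominated by it no matter how tiny the attached probability is.

```python
# Illustrative numbers only (my own stand-ins, not taken from LessWrong or MIRI):
# a naive expected-value comparison between a concrete action and a speculative
# one involving a hypothetical galactic civilization.

p_certain, lives_certain = 1.0, 1_000        # save 1,000 lives for sure
p_galactic, lives_galactic = 1e-15, 1e30     # a 1-in-10^15 shot at 10^30 future lives

ev_certain = p_certain * lives_certain       # 1,000 expected lives
ev_galactic = p_galactic * lives_galactic    # 10^15 expected lives

# The speculative option "wins" by twelve orders of magnitude, so under naive
# expected-value maximization every concrete concern is swamped by the speculation.
print(ev_certain, ev_galactic, ev_galactic > ev_certain)
```

The number doing all the work here is the made-up speculative payoff, not any piece of evidence.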
I believe that a world in which this mindset spreads is a world in which more atrocities and more wars happen.
Just consider a world in which there was not just one group like the Machine Intelligence Research Institute (MIRI), but thousands. Thousands of groups that want to save the world from non-evidence-backed speculations about worst possible outcomes by implementing schemes that could have a global influence. If just one of them turns out to be worse than what it tries to fix, e.g. a failed geoengineering project, then billions might die.
But don’t get the wrong impression. I am not against consequentialism. My opinion is indeed based on a consequentialist conclusion. I am against people who try to maximize their influence based on unstable back-of-the-envelope calculations, without appropriately discounting such calculations. This will lead to superficially correct but flawed decisions, such as “let’s steal the poor guy’s organs so that we can save a few better people.”
More specifically, I believe it to be rational to maximize exploration and to try to make your calculations as robust as possible. I believe that arguments based on armchair theorizing should be strongly discounted. I believe that people should seek to make their hypotheses falsifiable, and should put most weight on empirical evidence.
But this is not what LessWrong and MIRI seem to be doing. Instead, they focus on vague subjects that are difficult or impossible to test adequately, subjects such as evolutionary psychology, interpretations of quantum mechanics, or baseless speculations about superintelligences.
Roko’s basilisk is just one great example that highlights how LessWrong’s decision theory and epistemology are broken. It is an example of how they fail to say “Oops”, of how they fail to go back to the drawing board.
I could write a lot more on this, and I already have, but this post is already too long for the purpose of posting a link. Still, I felt it was necessary to provide some explanation, because a lot of people associated with LessWrong don’t understand what’s wrong with LessWrong, and claim that critics are just using Roko’s basilisk to discredit it. No, you got that wrong! Roko’s basilisk is the reductio ad absurdum of everything that LessWrong stands for.
I just want to provide one last example, other than Roko’s basilisk, of what can happen when you take the LessWrong mindset seriously and follow it to its logical conclusion: a talk by Jaan Tallinn (transcript), a major donor to the Machine Intelligence Research Institute:
This talk combines the ideas of intelligence explosion, the multiverse, the anthropic principle, and the simulation argument, into an alternative model of the universe – a model where, from the perspective of a human observer, technological singularity is the norm, not the exception.
A quote from the talk by Jaan Tallinn:
We started by observing that living and playing a role in the 21st century seems to be a mind-boggling privilege, because the coming singularity might be the biggest event in the past and future history of the universe. Then we combined the computable multiverse hypothesis with the simulation argument, to arrive at the conclusion that in order to determine how special our century really is, we need to count both the physical and virtual instantiations of it.
We further talked about the motivations of post-singularity superintelligences, speculating that they might want to use simulations as a way to get in touch with each other. Finally we analyzed a particular simulation scenario in which superintelligences are searching for one another in the so called mind space, and found that, indeed, this search should generate a large number of virtual moments near the singularity, thus reducing our surprise in finding ourselves in one.
In other words, combine a lot of vague, highly conjunctive, and non-evidence-backed speculations into a model of the universe that suits you best.
LessWrong stands for rationality. Which is fine. It stands for consequentialism. Also fine. The problem is that they pervert these fine ideas and instead promote an extreme and naive overcompensation against what they deem irrational. This leads them to completely disregard common sense in favor of approximations of theoretically correct concepts that break human brains. The results are flawed ideas such as Roko’s basilisk, or talks such as the one given by Jaan Tallinn.
It is troubling how ambiguous the signals are that LessWrong sends on some issues.
On the one hand, LessWrong says that you should shut up and multiply, and trust the math even when it feels wrong. On the other hand, Yudkowsky writes that he would sooner question his grasp of “rationality” than give five dollars to a Pascal’s Mugger because he thought it was “rational”.
On the one hand, LessWrong says that whoever knowingly chooses to save one life, when they could have saved two – to say nothing of a thousand lives, or a world – has damned themselves as thoroughly as any murderer. On the other hand, Yudkowsky writes that the ends don’t justify the means for humans.
On the one hand, LessWrong stresses the importance of acknowledging a fundamental problem and saying “Oops”. On the other hand, Yudkowsky tries to patch a framework that is obviously broken.
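To see how sharp the first of these tensions is, consider a rough Pascal’s Mugger calculation. The numbers below are illustrative stand-ins of my own (the original thought experiment uses 3^^^3, which no floating-point type can hold); the point is that “trusting the math” straightforwardly tells you to pay, which is exactly the conclusion Yudkowsky refuses to accept.

```python
# Illustrative Pascal's Mugger arithmetic (stand-in numbers of my own, not from
# the original thought experiment): the mugger demands five dollars and
# threatens an astronomical amount of harm if refused.

p_threat_is_real = 1e-50    # near-total disbelief in the mugger's story
lives_threatened = 1e100    # stand-in for an incomprehensibly large number

expected_lives_lost_by_refusing = p_threat_is_real * lives_threatened  # 1e50
cost_of_paying_in_lives = 1e-6  # generously, five dollars valued at a millionth of a life

# Naive expected-value maximization says to hand over the five dollars,
# no matter how absurd the threat sounds.
print(expected_lives_lost_by_refusing > cost_of_paying_in_lives)  # True
```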
Anyway, I worry that the overall message LessWrong sends is one of naive consequentialism and decision-making based on back-of-the-envelope calculations, rather than the meta-level consequentialism that contains itself when faced with too much uncertainty and focuses on obtaining robust beliefs that are backed by empirical evidence.
Yudkowsky might write that the ends don’t justify the means for humans. But he also believes that such deontological prohibitions do not apply to artificial intelligences. And since Yudkowsky believes that he is a complete strategic altruist, and that it is his moral obligation to build a friendly AI, he still ends up indirectly advocating such actions in order to achieve goals that he deems altruistic. In other words, he himself might not kill one person to save two, but he wants to create an AI that would do so. Which isn’t very reassuring.