Yvain on Roko’s basilisk

Note: I am not sure, but it seems likely (90%) that the following comment was made by the LessWrong top contributor Yvain.

Link: rationalwiki.org/wiki/Talk:LessWrong

I (and most other LWers) don’t find “the basilisk” nearly as interesting as people at RationalWiki seem to. It’s basically a really clever re-imagining of Pascal’s Wager. Pascal’s Wager is kind of weak, but there are stronger versions (see “Pascal’s Mugging”) that are hard to pick apart logically. Nevertheless, most people have enough common sense not to take Pascal’s Wager seriously even if they can’t point to the exact logical flaws.

People who tragically lack common sense and compensate by making decisions based on pure reason (eg some Less Wrongers) are especially vulnerable to Pascal-type arguments. I know of a couple of people linked to the community who have actually converted to Christianity or Islam based on the Wager, and other people who haven’t gone that far but are genuinely bothered by it.

So coming up with a really clever re-imagining of Pascal’s Wager targeted at exactly the community containing the people most vulnerable to being mentally screwed up by Pascalesque arguments is a dick move. It’s especially a dick move if part of the argument is that only people who have read the argument are going to suffer the eternal torture. It’s especially a dick move if you then immediately post it on the vulnerable community so everyone there can see how clever you are. Eliezer understandably got really angry at Roko and deleted the entire thing. Every so often there are vague rumors that someone actually took Roko’s Wager seriously and got panicked by it, but it’s always “a friend of a friend”. Overall I think he’s just angry that someone is deliberately spreading information designed to make people panicked and upset, the same way I might be angry if someone started waving posters of goatse around in a church.

If you do not know what this is all about, see here.

Ignoring common sense is basically what I have been talking about all the time.

Regarding Eliezer Yudkowsky’s decision on how to handle Roko’s Wager: banning any discussion of an idea is known to spread it. But more importantly, as I have already argued, it can give even more credence to an idea whose hazardous effect is in the first place a result of an unjustified stamp of credence.

If Eliezer Yudkowsky were really interested in protecting gullible people from an irrational idea, then he should openly dismiss it as insane and possibly even dissolve the problem once and for all.

It is utterly irresponsible to try to protect people who are scared of ghosts and spirits by banning all discussions of how it is irrational to fear those ideas.

I believe that the real reason for his decision to ban all discussion of Roko’s basilisk is that he simply cannot disavow the idea without either having his whole worldview come crashing down, admitting that the best he can do is to act on intuition rather than pure reason, or going batshit insane and giving in to some sort of Pascal’s mugging.


  • Mitchell Porter

    The basilisk will only be slain within the LW world, once someone proves a theorem of TDT which shows that humans just can’t enter into the sort of acausal trade presupposed by that scenario. I may post about this on my own “basilisk blog”, one day.

  • http://kruel.co/ Alexander Kruel

    What do you reckon would happen if it is instead proven that the whole idea makes sense from a TDT perspective? Would certain people go completely bonkers?

  • http://kruel.co/ Alexander Kruel

    I don’t think it is necessary to rely on any sort of formal proof here. Proof or not, you’ll never e.g. encounter Omega offering you a choice of two boxes. In the same sense you’ll never have to choose between torture and dust specks.

    People have to realize that decision theories are merely squiggles on paper and not binding laws. People have to realize that thought experiments are distinct from reality. People have to realize the importance of saying “oops” when they encounter a reductio ad absurdum.

    As David Gerard wrote,

    TDT seems to suffer the same problem as arithmetical utilitarianism: it keeps leading to absurd results – but instead of going “these results are absurd, does this actually work?” they go “these results are absurd, therefore the absurdity is really important!!!” How long does it take to say “oops”, as EY puts it?

  • Mitchell Porter

    But it just doesn’t make sense, for quite elementary reasons, e.g. the human being who thinks they are being blackmailed by the future AI they are imagining, is not in a position to actually know that the future AI will exist! So anyone who thought that they were actually experiencing acausal blackmail of this sort, would necessarily just be imagining it.

  • http://kruel.co/ Alexander Kruel

    I thought the argument is of a probabilistic nature and derives its importance from the expected utility associated with it. That is, it is possible that some sort of AI might exist, somewhere, sometime, which might run a simulation of you to see if you thought about acausal trade scenarios relevant to its own interests. If it discovers that it is likely that you indeed thought about relevant acausal trade scenarios, it will act accordingly. That is, if the expected utility of being a trustworthy trading partner, a partner which does precommit not to break such deals, is positive.

  • Mitchell Porter

    Well, there are a zillion variations on all these scenarios… But in any case of acausal trade, issues of knowledge are paramount, if the trade is supposed to be rational. The parties to the trade must somehow know about each other, or at least have genuine reason to think each other’s existence is probable. And that is already a huge stumbling block.

    And then there’s the problem of selectively focusing on some possibilities and not others – there’s always a second “possible AI” which will react to your actions in the opposite way to the first AI.

  • dmytryl

    Formal proof? For LW world? Just write a paper with some equations and dubious arguments, that would be more than enough.

    Let’s look at it with a little sanity. There are two classes of possible AIs: sane AIs that won’t torture people who haven’t helped them come around, and vindictive AIs that will. The vindictive AIs, once they come around, are actually losers – they are wasting resources torturing people on whom this whole torture argument is empirically known not to work. The very people who demonstrably failed to implement the “if TDT outputs torture then donate to SI end” algorithm.

    The proof that TDT does this, or that TDT does that, bears only on the question of how ironic the non-world-changing work of a bunch of uneducated crackpots (SI) is. It has very little bearing on the question of which class of AI is more likely, due to the extreme unlikelihood of the uneducated crackpots creating an AI. (Worrying about sufficiently dumb people creating an AI is akin to worrying about a bunch of monkeys in IDEs programming an AI.) It doesn’t change the ratio between vindictive and non-vindictive AIs a whole lot. And even if TDT doesn’t torture anyone, you can come up with a Torture Decision Theory that does.

    Furthermore, from the selfish AI perspective, giving you candy or torturing you are equivalently undesirable. Due to the religious background of these folks, it does seem plausible to them that you could somehow anger the god into torturing you. But to the same people it does not seem plausible that you could somehow force the AI into giving you something you want by doing some math on the paper and then depending on the outcome donating or not donating.

  • Mitchell Porter

    I have a higher estimation of the value of SI’s goals, and its capacity for making real discoveries, than you do. Unfriendly AI is a real possibility; decision theory, “mathematical AI” like AIXI, etc, are not pseudoscience; and the cognitive science of human decision-making is probably a necessary object of study, if we do want to have an AI that is “friendly” to human values. So they have a lot of conceptual infrastructure that ought to be relevant to solving a real problem.

    But they have bedazzled themselves by opting for the spectacular science-fiction possibility in many questions – thus, many-worlds interpretation of quantum mechanics; utilitarian calculations which assume that the whole future light-cone of Earth will be employed in the service of whatever value system wins out in the posthuman world; and, the vision of a multiverse populated by superintelligences who engage in acausal interactions through mutual simulation. In each case, there is some question (correct physical ontology, future history, cognitive demographics of the universe) where – as with the Drake equation – we simply do not possess a reliable way to answer it, and in LW world, in each case a handwaving argument for the spectacular answer is adopted, and this is then regarded as part of their superior rationalist knowledge.

    But this “will to science fiction” is not the only reason that TDT is part of their outlook. It offers an answer to a genuine, widely recognized paradox of decision theory; it’s vaguely concordant with the timeless physical ontology of Julian Barbour; and it seems to promise a rationale for the “superrational” cooperation of Douglas Hofstadter, who via GEB is one of the culture heroes of LW. There are a number of attempts to develop a formalism for reasoning quantitatively about TDT, just as one can reason quantitatively about the implications of ordinary, formal, causal decision theory. So I’d say it’s possible, even likely, that one day the people trying to reason about acausal decision theory will convince themselves, by reasoning within a formalism they devised and which they trust, that most of the spectacular versions of the idea are fallacious.

  • dmytryl

    AIXI and the like are, of course, not pseudoscience.

    Various types of bullshit by ignorant and arrogant people, however, are. The same ignorance underpins this will-to-fiction.

    Let’s pick the many-worlds interpretation as an example. There may be reasons to choose the many-worlds interpretation. The actual argument given, however, is based on a very stupid misunderstanding of Solomonoff induction and Kolmogorov complexity. Not understood is that the output has to begin with the past sense data – ergo, any valid code has to include collapse or some other mechanism for singling out one world as more real than the others. Solomonoff induction was inspired by physics, for god’s sake. It’s because physics is like Solomonoff induction that we have the Copenhagen interpretation.

    With regards to “TDT”, the paper is really really awful, and it represents not so much greater advancement but lower standards.

  • dmytryl

    I wrote an article about their so-called ‘expected utility’ estimations:

    https://dmytry.com/texts/On_Utility_of_Incompetent_Efforts.html

  • Yvain

    I agree with you that Eliezer making such a big deal out of this was stupid for Streisand Effect related reasons. Eliezer also agrees with this and has admitted it was a dumb move. However, it happened, and short of a time machine it can’t be undone.

    So now the question is whether
    a) we agree Eliezer made a mistake and get on with our lives, or
    b) in order to annoy Eliezer we GO ON A HUGE SPREE POSTING THINGS ABOUT ROKO’S BASILISK EVERYWHERE WE CAN because omg it’s so fun to rub Eliezer’s face in it.

    What I don’t understand is the people who feel so sorry for people traumatized by the basilisk when it’s Eliezer’s fault, but see nothing at all wrong with posting ad infinitum about the concept, bringing it to new audiences, and generally acting as its publicity agents.

  • http://kruel.co/ Alexander Kruel

    Yvain, I am not sure how much of what I wrote on this topic you already read. But I outlined a few times how I believe it is correct to talk about this even given the risk that some people might be traumatized. And the main reason is not some “shut up and multiply” attitude where the suffering of a few can be justified by the well-being of the majority.

    Those people are best helped by showing them how not to be traumatized by such ideas and that Eliezer Yudkowsky’s unjustified beliefs do not amount to the kind of extraordinary evidence required to let him shape their actions. Those people need to learn that letting their behavior be influenced by someone claiming that acting in a certain way has huge amounts of negative expected utility is in essence a case of Pascal’s mugging and practically unworkable.

    For more see e.g. the reasons listed here and the discussion here.