math


Link: thebigquestions.com/2012/11/14/accounting-for-numbers/

[LessWrong Thread]

Yudkowsky leaps from “the natural numbers can be precisely specified by second order logic” to “the .. study of numbers is equivalent to the logical study of which conclusions follow inevitably from the number-axioms”. This is wrong, wrong, wrong, because second order logic is not logic.

[...] you’re not allowed to set up an axiom system in which all the true theorems of arithmetic are taken as axioms — there is no mechanical procedure for determining whether a given statement is or is not a true theorem of arithmetic (see Tarski’s theorem on the undefinability of truth) and therefore no mechanical procedure for determining what is or is not an axiom in that system. In second-order Peano arithmetic, we have an analogous problem: The axioms can be identified mechanically, but the rules of inference can’t. A properly programmed computer can examine a first-order proof and tell you if it’s valid or not; that is, it can tell you whether each step does in fact follow logically from some of the previous steps. But no computer can do the same for second-order proofs.

So the study of second-order consequences is not logic at all; to tease out all the second-order consequences of your second-order axioms, you need to confront not just the forms of sentences but their meanings. In other words, you have to understand meanings before you can carry out the operation of inference. But Yudkowsky is trying to derive meaning from the operation of inference, which won’t work because in second-order logic, meaning comes first.

[...] it’s important to recognize that Yudkowsky has “solved” the problem of accounting for numbers only by reducing it to the problem of accounting for sets — except that he hasn’t even done that, because his reduction relies on pretending that second order logic is logic.


The following is based on the book ‘Das Ziegenproblem‘ by Gero von Randow.

Setup:

  • There are 3 doors.
  • 1 door has a car behind it.
  • 2 doors each have a goat behind them.
  • The content behind the doors is randomly chosen.
  • There are 2 candidates, A and B.
  • There is 1 moderator who knows which door has a car behind it.

Actions:

  1. A and B are asked to choose a door and both choose the same door.
  2. The moderator chooses one door which has a goat behind it.
  3. A and B are asked if they would like to switch their choice and pick the remaining door.
  4. A always stays with his choice, the door that has been initially chosen by both A and B.
  5. B always changes her choice to the remaining third door.

Repeat the actions 999 times:

If you repeat the above list of actions 999 times, given the same setup, what will happen?

Candidate A always stays with his initial choice, which means that he will on average win 1/3 of all games. He will win about 1/3*999 = 333 cars.

But who won the remaining 666 cars?

Given the setup of the game, the moderator has to choose a door with a goat behind it. Therefore the moderator wins 0 cars.

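In every game the car is either behind the door that A keeps or behind the one remaining door that B switches to. Candidate B, who always switched her choice after the moderator picked a door with a goat behind it, must therefore have won the remaining 666 cars (2/3*999)!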
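The repeated game can be checked with a short simulation. This is only a sketch: the function name `play_games`, the fixed random seed, and the assumption that the moderator never opens the door the candidates picked are illustrative choices, not something taken from the book.

```python
import random


def play_games(games, seed=0):
    """Play the 3-door game `games` times; A stays, B switches.

    Returns (cars won by A, cars won by B).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    doors = [0, 1, 2]
    a_wins = 0
    b_wins = 0
    for _ in range(games):
        car = rng.choice(doors)         # content is placed at random
        first_pick = rng.choice(doors)  # A and B choose the same door
        # Assumption: the moderator opens a goat door that is not the
        # candidates' door (choosing at random if two such doors exist).
        opened = rng.choice(
            [d for d in doors if d != first_pick and d != car]
        )
        remaining = next(
            d for d in doors if d != first_pick and d != opened
        )
        a_wins += first_pick == car  # A keeps the first pick
        b_wins += remaining == car   # B takes the remaining door
    return a_wins, b_wins


if __name__ == "__main__":
    a, b = play_games(999)
    print("A (stays):", a, "cars; B (switches):", b, "cars")
```

In any single run the counts will only be close to 333 and 666, but they always add up to the number of games, because exactly one of the two candidates wins each car.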

1 candidate and 100 doors:

Alter the above setup of the game in the following way

  • There are 100 doors.
  • 1 door has a car behind it.
  • 99 doors each have a goat behind them.
  • There is 1 candidate, A.

Alter the above actions in the following way

  • The moderator opens 98 doors with goats behind them.

Now let’s say the candidate picks door number 8. By rule of the game the moderator now has to open 98 of the remaining 99 doors behind which there is no car.

Afterwards there is only one door left besides door 8 that the candidate has chosen.

You would probably switch your choice to the remaining door now. If so, the same should be the case with only 3 doors!
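The comparison between 100 doors and 3 doors can also be made exact by going through every possible position of the car. The following is a minimal sketch; the helper name `switch_win_probability` and its parameters are made up for this illustration.

```python
from fractions import Fraction


def switch_win_probability(num_doors, pick):
    """Exact chance of winning by switching, with doors numbered 1..num_doors.

    The moderator opens every door except the candidate's pick and one
    other door, and never reveals the car.
    """
    if not 1 <= pick <= num_doors:
        raise ValueError("pick must be one of the doors")
    wins = 0
    for car in range(1, num_doors + 1):  # each position is equally likely
        if car != pick:
            # The pick hides a goat, so the one door left closed
            # has to be the door with the car.
            remaining = car
        else:
            # The pick hides the car, so some goat door is left closed.
            remaining = next(
                d for d in range(1, num_doors + 1) if d != pick
            )
        wins += remaining == car
    return Fraction(wins, num_doors)


if __name__ == "__main__":
    print(switch_win_probability(100, pick=8))  # 99/100
    print(switch_win_probability(3, pick=1))    # 2/3
```

Switching only loses in the single case where the first pick was already the car, which is 1 case out of 100 with 100 doors and 1 case out of 3 with 3 doors.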

Further explanation:

Your chance of picking the car with your initial choice is 1/3, but your chance of initially choosing a door with a goat behind it is 2/3. Thus, on average, you will pick a goat at first go in 2/3 of the games you play. In each of those games the moderator has to open the door hiding the only remaining goat, because under the rules of the game the moderator knows where the car is and is only allowed to open a door with a goat behind it.

What does that mean?

On average, you pick a goat at first go 2/3 of the time, and hence the moderator is forced to open the door with the remaining goat 2/3 of the time. That means that 2/3 of the time no goat is left and only the car is behind the remaining door. Therefore the remaining door has the car 2/3 of the time, which makes switching the winning strategy.


There are n = 4 sorts of candy to choose from and you want to buy k = 10 candies. How many ways can you do it?

This is a problem of counting combinations (order does not matter) with repetition (you can choose multiple items from each category). Below we will translate this problem into a problem of counting combinations without repetition, which can be solved by using a better understood formula that is known as the “binomial coefficient“.

First let us represent the 10 possible candies by 10 symbols ‘C’ and divide them into 4 categories by placing a partition wall, represented by a ‘+’ sign, between each sort of candy to separate them from each other

CC+CCCC+C+CCC

Note that there are 10 symbols ‘C’ and 3 partition walls, represented by a ‘+’ sign. That is, there are n-1+k = 13, equivalently n+k-1, symbols. Further note that each of the 3 partition walls could be in 1 of 13 positions. In other words, to represent various choices of 10 candies from 4 categories, the positions of the partition walls could be rearranged by choosing n-1 = 3 of n+k-1 = 13 positions

C++CCC+CCCCCC

CCCCCCCCCC+++

We have now translated the original problem into choosing 3 of 13 available positions.

Note that each position can only be chosen once. Further, the order of the positions does not matter, since choosing positions {1, 4, 12} separates the candies in the same way as choosing positions {4, 12, 1}. This means that we are now dealing with combinations without repetition.
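To see how a choice of wall positions corresponds to a choice of candies, the translation can be sketched in a few lines. The function name `counts_from_walls` is hypothetical, and positions are counted from 1 to 13 as in the text.

```python
def counts_from_walls(wall_positions, n=4, k=10):
    """Turn n-1 wall positions (1-based, out of n+k-1) into candies per sort."""
    walls = sorted(wall_positions)  # the order of the positions is irrelevant
    total = n + k - 1               # 13 symbols in the example
    # Add an imaginary wall before the first and after the last symbol.
    bounds = [0] + walls + [total + 1]
    # The candies of each sort are the symbols between neighbouring walls.
    return [bounds[i + 1] - bounds[i] - 1 for i in range(n)]


if __name__ == "__main__":
    print(counts_from_walls([1, 4, 12]))  # [0, 2, 7, 1]
    print(counts_from_walls([4, 12, 1]))  # [0, 2, 7, 1]
    print(counts_from_walls([3, 8, 10]))  # [2, 4, 1, 3], i.e. CC+CCCC+C+CCC
```

Both orderings of the same three positions give the same four counts, and the counts always add up to k = 10.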

Calculating combinations without repetition can be done using the formula that is known as the binomial coefficient

n!/(k!(n-k)!)

As justified above, to calculate combinations with repetition, simply replace n with n+k-1 and k with n-1,

(n+k-1)!/((n-1)!((n+k-1)-(n-1))!)

In our example above this would be (4+10-1)!/((4-1)!((4+10-1)-(4-1))!) = 13!/(3!10!) = 286. This is equivalent to

(n+k-1)!/(k!(n-1)!)

because (4+10-1)!/(10!(4-1)!) = 13!/(10!3!) = 13!/(3!10!), which is the same result that we got above.
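The formula can be cross-checked against a brute-force count using Python's standard library. This is a small sketch; the function name `count_with_repetition` is chosen only for this example.

```python
from itertools import combinations_with_replacement
from math import comb


def count_with_repetition(n, k):
    """Ways to choose k items from n sorts when repeats are allowed."""
    # Choose the n-1 wall positions among the n+k-1 symbols.
    return comb(n + k - 1, n - 1)


if __name__ == "__main__":
    n, k = 4, 10
    formula = count_with_repetition(n, k)
    # Enumerate every unordered selection of k candies from n sorts.
    brute_force = sum(
        1 for _ in combinations_with_replacement(range(n), k)
    )
    print(formula, brute_force)  # 286 286
```

Choosing the 3 wall positions or the 10 candy positions out of 13 gives the same number, which is why the two forms of the formula agree.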


Consider an agent A which assumes itself to make only correct decisions. Here an arbitrary decision is denoted d and correct is denoted C, where Cd is defined to be any decision (respectively set of decisions) maximizing expected utility according to an agent’s utility-function U. Therefore A assumes CA, where CA is the set of all decisions that A is capable of deciding that also belong to the set of all correct decisions Cd (∀d ∈ CA, d ∈ Cd).

Let one possible decision k be defined as ¬Cd (k := ¬Cd (¬Cd is true if decision d does not maximize the expected utility of agent A)).

If A ever decides k then this will falsify its assumption that it only makes correct decisions (CA) and hence prove itself to be incorrect (¬CA). But since A assumes itself to make only correct decisions it believes that it will never decide k. Therefore CA iff ¬k. Substituting ¬Cd for k yields CA iff (¬¬Cd iff Cd) (A is correct if and only if its decisions are correct).

Now assume that A decides k anyway (e.g. a cosmic ray causes a malfunction in its decision module). Since A assumes CA it follows that k must have been a correct decision (k → Ck). Substituting ¬Cd for k yields ¬Cd → C¬Cd, which is a contradiction, and in turn implies ¬CA (A is incorrect).


For a basic introduction I suggest the free online course Intro to Statistics (st101) provided by Udacity, especially Part 2, Units 8-11. Units 8 and 9 teach you the basics of probability, while unit 10 introduces you to Bayes Rule and unit 11 challenges you to program it using the Python programming language. All units make you solve lots of problems on your own. You can submit your solutions to be verified.

A similar introduction is available via Khan Academy. The videos Probability (part 7) and Probability (part 8) in particular touch on Bayes’ Theorem.

Udacity ST101 and Probability @ Khan Academy followed or accompanied by Visualizing Bayes’ theorem and Eliezer Yudkowsky’s An Intuitive (and Short) Explanation of Bayes’ Theorem should get you started.

For more see the links listed in this post by Richard Carrier or see below:

Links

The term “Mind Projection Fallacy” was coined by the late great Bayesian Master, E. T. Jaynes, as part of his long and hard-fought battle against the accursed frequentists.  Jaynes was of the opinion that probabilities were in the mind, not in the environment – that probabilities express ignorance, states of partial information; and if I am ignorant of a phenomenon, that is a fact about my state of mind, not a fact about the phenomenon.

I remember (dimly, as human memories go) the first time I self-identified as a “Bayesian”. Someone had just asked a malformed version of an old probability puzzle…

You’ve probably seen the word ‘Bayesian’ used a lot on this site, but may be a bit uncertain of what exactly we mean by that.

Bayes’ theorem was the subject of a detailed article. The essay is good, but over 15,000 words long — here’s the condensed version for Bayesian newcomers like myself.

Bayes’ Theorem for the curious and bewildered; an excruciatingly gentle introduction.

This post is elementary: it introduces a simple method of visualizing Bayesian calculations. In my defense, we’ve had other elementary posts before, and they’ve been found useful; plus, I’d really like this to be online somewhere, and it might as well be here.

Everyday use of a mathematical concept.

I recently came up with what I think is an intuitive way to explain Bayes’ Theorem…

Bayes' theorem

A law of probability that describes the proper way to incorporate new evidence into prior probabilities to form an updated probability estimate. Bayesian rationality takes its name from this theorem, as it is regarded as the foundation of consistent rational reasoning under uncertainty. A.k.a. “Bayes’s Theorem” or “Bayes’s Rule”.

Eliezer Yudkowsky is on bloggingheads.tv with the statistician Andrew Gelman.

Several different points of fascination about Bayes…

When looking further, there is however a whole crowd on the blogs that seems to see more in Bayes’s theorem than a mere probability inversion…

Bayesian statistics is a system for describing epistemological uncertainty using the mathematical language of probability.
Bayesian probability is one of the most popular interpretations of the concept of probability.

Edwin T. Jaynes was one of the first people to realize that probability theory, as originated by Laplace, is a generalization of Aristotelian logic that reduces to deductive logic in the special case that our hypotheses are either true or false. This web site has been established to help promote this interpretation of probability theory by distributing articles, books and related material. As Ed Jaynes originated this interpretation of probability theory we have a large selection of his articles, as well as articles by a number of other people who use probability theory in this way…

Bayesian statistics is so closely linked with induction that one often hears it called “Bayesian induction.” What could be more inductive than taking a prior, gathering data, updating the prior with Bayes Law, and limiting to the true distribution of some parameter?

Gelman (of the popular statistics blog) and Shalizi point out that, in practice, Bayesian statistics should actually be seen as Popper-style hypothesis-based deduction. The problem is intricately linked to the “taking a prior” above.

Or, how to recognize Bayes’ theorem when you meet one making small talk at a cocktail party.

Still, I’m sure Blogger won’t mind me using their resources instead. The basic idea is that there’s a distinction between true values x and measured values y. You start off with a prior probability distribution over the true values. You then have a likelihood function, which gives you the probability P(y|x) of measuring any value y given a hypothetical true value x.

In other words, What is so special about starting with a human-generated hypothesis? Bayesian methods suggest what I think is the right answer: To get from probabilistic evidence to the probability of something requires combining the evidence with a prior expectation, a “prior probability”, and human hypothesis generation enables this requirement to be ignored with considerable practical success.

Andrew Gelman recently responded to a commenter on the Yudkowsky/Gelman diavlog; the commenter complained that Bayesian statistics were too subjective and lacked rigor.  I shall explain why this is unbelievably ironic…

Maybe this kind of Bayesian method for “proving the null” could be used to achieve a better balance.

Bayesian brain is a term that is used to refer to the ability of the nervous system to operate in situations of uncertainty in a fashion that is close to the optimal prescribed by Bayesian statistics.


—————————————

P.S.

Expect this link collection to be continually updated.

Please post a comment if you have something to add.
