Friday, 23 December 2022

The Kolmogorov probability axioms and objective chance

Philosophy of Probability and Statistical Modelling by Mauricio Suárez [1] provides a historical overview of the philosophy of probability and argues for the significance of this philosophy for well-founded statistical modelling. Within this wider scope, the book discusses the themes of objective probability, propensities, and measurements, so it has much in common with the themes of this blog. Suarez defends a model of objective probability that disentangles propensity from single-case chance and from the observed sequences of outputs that result in relative frequencies of outcomes. This is close to what I argued for in Potentiality and probability, but there are differences that I will return to in a future post.

Suarez's main theme is objective probability, but he expends a lot of effort on examining subjective probabilities. I have no doubt that subjective probability has its place alongside objective probability, but an ad hoc mixture of the two is to be avoided. Subjective probability has been zealously defended by de Finetti and Jaynes, to the point of attempting to eliminate objective probability altogether. I believe, however, that the posts in this blog make a case for objective probability as an aspect of ontology. In this post I will focus on the Kolmogorov axioms of probability. Curiously, it is only in examining subjective probability that Suarez discusses the axioms of probability.

The axioms that Suarez states are as follows.

Let \(\{E_1 , E_2, …, E_n\}\) be the set of events over which an agent's degrees of belief range; and let \(\Omega\) be an event which occurs necessarily. The axioms of probability may be expressed as follows:

Axiom 1: \(0 \le P (E) \le 1\), for any \(P (E)\): In other words, all probabilities lie in the real unit number interval.

Axiom 2: \(P (\Omega) =1\): The tautologous proposition, or the necessary event has probability one.

Axiom 3: If \(\{E_1, E_2, …, E_n\}\) are exhaustive and exclusive events, then \(P (E_1) + P(E_2) + … + P(E_n) = P (\Omega) = 1\): This is known as the addition law and is sometimes expressed equivalently as follows: If \(\{E_1, E_2, …, E_n \}\) is a set of exclusive (but not necessarily exhaustive) events then: \(P (E_1 \vee E_2 \vee … E_n) = P (E_1) + P(E_2) + … + P(E_n)\).

Axiom 4: \(P (E_1 \& E_2) = P (E_1 | E_2) P (E_2)\). This is sometimes known as the multiplication axiom, the axiom of conditional probability, or the ratio analysis of conditional probability since it expresses the conditional probability of \(E_1\) given \(E_2\).

According to Suarez, the Kolmogorov axioms are essentially equivalent to those above. The axioms that Kolmogorov published in 1933 have become the standard formulation. The axioms themselves form a short passage near the start of the book.

Nathan Morrison has translated the Kolmogorov axioms [2] as:

Let \(E\) be a collection of elements ξ, η, ζ, ..., which we shall call elementary events, and \(\mathfrak{F}\) a set of subsets of \(E\); the elements of the set \(\mathfrak{F}\) will be called random events

I. \(\mathfrak{F}\) is a field of sets. 

II. \(\mathfrak{F}\) contains the set \(E\).

III. To each set \(A\) in \(\mathfrak{F}\) is assigned a non-negative real number \(P(A)\). This number \(P(A)\) is called the probability of the event \(A\).

IV. \(P(E)\) equals \(1\).  

V. If \(A\) and \(B\) have no element in common, then 

\[P(A+B) = P(A)+P(B)\]

A system of sets \(\mathfrak{F}\), together with a definite assignment of numbers \(P(A)\), satisfying Axioms I-V, is called a field of probability

Terminology has moved on, and it would now be usual to identify the triple \((E, \mathfrak{F}, P)\) as the probability space, with the term field reserved for the Borel field of sets \(\mathfrak{F}\). In addition, it is preferable not to use the same symbol for the addition of numbers and the union of sets. It is also important to note that writing ξ, η, ζ, ... for the elements of \(E\) does not constrain the set of elementary events to be countable. This matters for applications to statistical and quantum physics where, for example, particle position and momentum can take a continuum of values.

It is a shorthand when Kolmogorov writes in Axiom III that the number \(P(A)\) is the probability of the event \(A\). The full interpretation is that \(P(A)\) is the probability that the outcome will be an element of \(A\). This does seem to be the standard, if often implicit, understanding of the situation. In the same Axiom III, the use of the term "assigned" is also deceptive. The probabilities are more properly assigned to the singleton sets, each with one element of \(E\) as its sole member, and from these the probability of each set in \(\mathfrak{F}\) is constructed.

The salient differences between the axioms provided by Suarez and those of Kolmogorov are:
  • Suarez provides axioms only for a discrete finite set of events
    • What he calls events are, within the restriction in the bullet above, Kolmogorov's elementary events.
    • There is therefore no need to introduce the field of sets
  • Apart from event, Suarez uses the language and symbols of propositional logic  
    • As he is dealing with probabilities as credences (degrees of belief) in this section of his book it would have been better to consistently employ the language and symbols of propositional logic.
    • This would give a mapping:
        Logical                                   Set theoretical
        elementary propositions                   elementary events
        logical 'and', \(\land\) or \(\&\)        set intersection, \(\cap\)
        logical 'or', \(\vee\)                    set union, \(\cup\)
        tautology, \(\Omega\)                     set of elementary events
  • Axiom 1 should read: \(0 \le P (E) \le 1\), for any \(E\). This axiom is not one of Kolmogorov's but can be derived from them.
    • Caution: in the Suarez version \(E\) is an arbitrary 'event' whereas in Kolmogorov this symbol is used for the set of elementary events.
  • Kolmogorov has no equivalent to Axiom 4 in his list but introduces the equivalent formula for conditional probability later, as a definition. I think it is better to list it as one of the axioms.
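
The derivation of the bounds in Axiom 1 from Kolmogorov's axioms can be sketched as follows, in Kolmogorov's notation where \(E\) is the set of elementary events. It relies on the fact that a field of sets is closed under complements, so that for any \(A\) in \(\mathfrak{F}\) the complement \(E \setminus A\) is also in \(\mathfrak{F}\). Since \(A\) and \(E \setminus A\) have no element in common and their union is \(E\), Axioms IV and V give

\[P(A) + P(E \setminus A) = P(E) = 1.\]

By Axiom III both terms on the left are non-negative, so \(P(A) = 1 - P(E \setminus A) \le 1\), and therefore \(0 \le P(A) \le 1\).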
As a formal structure, it would have been even better if Kolmogorov had used rigorously the language of sets and not used a term like 'event'. A set theoretic formulation with potential possibilities and numerical probability assignment would then be:

Let \(E\) be a collection of elements ξ, η, ζ, ..., which we shall call elementary possibilities and \(\mathfrak{F}\) a set of subsets of \(E\). 

I. \(\mathfrak{F}\) is a field of sets. 

II. \(\mathfrak{F}\) contains the set \(E\).

III. To each set \(A\) in \(\mathfrak{F}\) is assigned a non-negative real number \(P(A)\). This number \(P(A)\) is called the probability of \(A\), an element of \(\mathfrak{F}\).

IV. \(P(E)\) equals \(1\).  

V. If \(A\) and \(B\) in \(\mathfrak{F}\) have no element in common, then 

\[P(A \cup B) = P(A)+P(B)\]

VI. For any \(A\) and \(B\) in \(\mathfrak{F}\) with \(P(B) > 0\),

\[P (A | B) = \frac{P (A \cap B)}{P (B)}\]

is the conditional probability of \(A\) given \(B\).

Such a system, consisting of \(E\) and the sets \(\mathfrak{F}\) together with a definite assignment of numbers \(P(A)\) for all \(A \in \mathfrak{F}\), satisfying Axioms I-VI, is called a probability space.

I propose that this is a neutral set of axioms for a probability theory. One advantage is that it can be applied to eventless situations, such as the quantum description of a free particle. The formulation is based on the mathematics of measure theory plus a numerical probability assignment and the identification of a set of possibilities. I have added, as is often done, the definition of conditional probability as an axiom. It is possible to map this formulation to one in terms of propositions and logical operators. This move would not restrict application and interpretation to subjective probability. That would be governed by the meaning given to the numerical probability assignment. The axiom system considered here takes the numerical probability assignment as fundamental, and the conditional probability is then added. It is also possible to take conditional probability as fundamental, as done by Rényi. I will discuss this formulation in a future post.
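
As a minimal sketch of this set-theoretic formulation, assuming a finite set of possibilities and taking \(\mathfrak{F}\) to be the full power set, the following Python fragment builds a probability space and checks Axioms I to V directly. The six-element set, the equal weights, and the names `power_set`, `prob` and `cond_prob` are illustrative assumptions, not part of Kolmogorov's or Suarez's text.

```python
from fractions import Fraction
from itertools import chain, combinations

# Elementary possibilities E: six labelled possibilities (illustrative choice).
E = frozenset(range(1, 7))

# Assumed numerical assignment on the elementary possibilities (equal weights).
weight = {x: Fraction(1, 6) for x in E}

def power_set(base):
    """Return every subset of `base` as a frozenset; this plays the role of F."""
    items = sorted(base)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return {frozenset(s) for s in subsets}

F = power_set(E)

def prob(A):
    """P(A): the sum of the weights of the elementary possibilities in A."""
    return sum((weight[x] for x in A), Fraction(0))

def cond_prob(A, B):
    """Axiom VI: P(A | B) = P(A ∩ B) / P(B), defined only when P(B) > 0."""
    if prob(B) == 0:
        raise ValueError("P(B) must be positive")
    return prob(A & B) / prob(B)

# Axiom I (field of sets): closed under union, intersection and complement.
assert all(A | B in F and A & B in F and E - A in F for A in F for B in F)
# Axiom II: F contains E.
assert E in F
# Axiom III: every P(A) is a non-negative number.
assert all(prob(A) >= 0 for A in F)
# Axiom IV: P(E) = 1.
assert prob(E) == 1
# Axiom V: additivity for sets with no element in common.
assert all(prob(A | B) == prob(A) + prob(B) for A in F for B in F if not (A & B))

odd = frozenset({1, 3, 5})
print(cond_prob(frozenset({5}), odd))  # prints 1/3
```

Exact fractions are used so that the additivity check in Axiom V is not disturbed by floating-point rounding. Nothing in the sketch refers to events or to belief; the interpretation enters only through the meaning given to `weight`.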

In adopting these axioms for objective chance, the collection of elements can again be called elementary events or possible events, and \(P\) is a numerical assignment furnished by a theory or estimated through experiment. The elements of \(\mathfrak{F}\) are not, in general, outcomes in the way that the elements of \(E\) are. An element of \(\mathfrak{F}\) is a set of some elements of \(E\). That this can be useful can be made clear through two examples.

A simple physical example is provided by a die with six sides. In normal game-playing circumstances this provides six possibilities for which face will face upwards when the die is thrown. These six possibilities make up the set of elementary events \(E\) in the probability space for a dice-throwing game with one die. I will also call these elementary events outcomes. These outcomes are not in \(\mathfrak{F}\), the set of subsets of \(E\) that Kolmogorov calls events. For example, the outcome "face 5 faces upwards" is not in the set of random events, but the subset {"face 5 faces upwards"} is. On one throw of the die we can only obtain an element of \(E\), so what are the random events \(\mathfrak{F}\)? Consider \(P(E)\), which is equal to one. It is usual to interpret this as saying that \(E\) is the sure event. But in one throw of the die we will never get \(E\), only one element of it. What is therefore sure is that any outcome will be in \(E\).

So far, I have said little about \(P\) itself. In the die example, \(P(\){"face 5 faces upwards"}\()\) can be estimated as the proportion of times it occurs in a long run of repetitions. For anyone familiar with the relative frequency interpretation of probability, I emphasise that this is not such an interpretation. Here the relative frequencies are an estimate of the numerical probability assignment. \(P(\){"face 5 faces upwards"}\()\) itself is the relative strength of the tendency for "face 5 faces upwards" to occur, otherwise known as the single-case chance for that event. If we consider an element of \(\mathfrak{F}\) such as \(A = \{\)"face 1 faces upwards", "face 3 faces upwards", "face 5 faces upwards"\(\}\), then \(A\) can be interpreted as "an odd valued face faces upwards". This illustrates that, even in the application to objective chance, it is difficult to avoid the use of propositions to give meaning to useful elements of \(\mathfrak{F}\).
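
The distinction between the numerical probability assignment and its relative-frequency estimate can be sketched in a short simulation. It assumes, purely for illustration, a die whose single-case chances are all 1/6; the names `CHANCE`, `throw` and `relative_frequency`, the seed and the number of throws are hypothetical choices.

```python
import random

# Assumed single-case chances for each face (a fair die, for illustration only).
CHANCE = {face: 1 / 6 for face in range(1, 7)}

def throw(rng):
    """Return one outcome, an element of E, drawn according to CHANCE."""
    faces = list(CHANCE)
    return rng.choices(faces, weights=[CHANCE[f] for f in faces], k=1)[0]

def relative_frequency(event, n_throws, seed=0):
    """Estimate P(event) as the proportion of throws whose outcome lies in `event`."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_throws) if throw(rng) in event)
    return hits / n_throws

five = {5}        # the subset {"face 5 faces upwards"}
odd = {1, 3, 5}   # "an odd valued face faces upwards"

est_five = relative_frequency(five, 60_000)
est_odd = relative_frequency(odd, 60_000)
print(f"estimate of P(five) = {est_five:.3f}, assigned value = {sum(CHANCE[f] for f in five):.3f}")
print(f"estimate of P(odd)  = {est_odd:.3f}, assigned value = {sum(CHANCE[f] for f in odd):.3f}")
```

In this sketch `CHANCE` stands for the assignment itself and is fixed before any throw is made, while `relative_frequency` only approximates it and varies with the seed and the length of the run.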

A strength of the Kolmogorov axioms and my reformulation is the application to continuous infinite sets. An example is the case of an observation of an electron governed by the Schrödinger equation. According to quantum mechanics the probability of it being observed at any pre-designated spot is zero. In this example, all the elements of the set of elementary events have numerical probability assignment zero. This is where the field of events \(\mathfrak{F}\) is useful in providing sets with non-zero probability in which the position of the electron may be observed.
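
A minimal numerical sketch of this point, assuming for illustration that the position probability density is a Gaussian with centre `MU` and width `SIGMA` (hypothetical values, not derived from any particular solution of the Schrödinger equation):

```python
import math

MU = 0.0     # assumed centre of the position density (arbitrary units)
SIGMA = 1.0  # assumed width of the position density (arbitrary units)

def cdf(x):
    """Cumulative distribution function of the assumed Gaussian position density."""
    return 0.5 * (1.0 + math.erf((x - MU) / (SIGMA * math.sqrt(2.0))))

def prob_interval(a, b):
    """P([a, b]): probability that the observed position lies between a and b."""
    return cdf(b) - cdf(a)

# A single pre-designated spot is the degenerate interval [x, x]: probability zero.
print(prob_interval(0.5, 0.5))  # prints 0.0
# An interval around the same spot is an element of F with non-zero probability.
print(prob_interval(0.4, 0.6))
# The whole line plays the role of E and carries probability one.
print(prob_interval(-math.inf, math.inf))  # prints 1.0
```

Every individual position has probability zero here, yet intervals of positions carry the non-zero probabilities that can be compared with observation.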

As discussed above, it can be convenient to use propositions to give meaning to relevant elements of \(\mathfrak{F}\) in a specific application. However, this need not introduce any subjectivity. The subjectivity enters through interpreting \(P\) as credence or degree of belief. In applications with objective probability, \(P(A)\) is a numerical assignment of the strength of the single-case tendency for an elementary event to appear in set \(A\). In objective probability \(P\) is ontological, but in subjective probability it is epistemic.
  1. Mauricio Suárez, Philosophy of Probability and Statistical Modelling, Elements in the Philosophy of Science, Cambridge University Press, 2021
  2. A. N. Kolmogorov, Foundations of the Theory of Probability, Dover Publications, 2018. Available at: https://www.perlego.com/book/823610/foundations-of-the-theory-of-probability-second-english-edition-pdf (Accessed: 18 December 2022).

The heart of the matter

The ontological framework for this blog is from Nicolai Hartmann's  new ontology  programme that was developed in a number of very subst...