Wednesday, 11 January 2023

The Quantum Mechanics of Events, Histories and Trees

It is time to return to quantum mechanics. The approach I have been developing is a generalised probability theory in which the quantum state sits on a complex of probability spaces. In my review post of October 2022 I referred to the work of Fröhlich and colleagues and their search for a fundamental theory of quantum mechanics. They call it ETH (Events, Trees, and Histories). Theirs is also an approach that proposes that quantum mechanics is fundamentally probabilistic and that it describes events and not just measurements. So, I will, over several posts, work through their theory to learn how some of the gaps in my own approach may be addressed. The picture below gives an early indication of how the concept of possibilities fits into the ETH scheme. An event is identified with the realisation of a possibility.


Illustration of ETH - Events, trees, and histories

It has been a theme of my posts to try and clarify the philosophical fundamentals, especially ontology, associated with a physical theory. So, I will start the review of ETH with the introduction to a paper in which Fröhlich sets out his "credo" for his endeavour [1]. His credo is:

  1. Talking of the “interpretation” of a physical theory presupposes implicitly that the theory has reached its final form, but that it is not completely clear, yet, what it tells us about natural phenomena. Otherwise, we had better speak of the “foundations” of the theory. Quantum Mechanics has apparently not reached its final form, yet. Thus, it is not really just a matter of interpreting it, but of completing its foundations.
  2. The only form of “interpretation” of a physical theory that I find legitimate and useful is to delineate approximately the ensemble of natural phenomena the theory is supposed to describe and to construct something resembling a “structure-preserving map” from a subset of mathematical symbols used in the theory that are supposed to represent physical quantities to concrete physical objects and phenomena (or events) to be described by the theory. Once these items are clarified the theory is supposed to provide its own “interpretation”. (A good example is Maxwell’s electrodynamics, augmented by the special theory of relativity.)
  3. The ontology a physical theory is supposed to capture lies in sequences of events, sometimes called “histories”, which form the objects of series of observations extending over possibly long stretches of time and which the theory is supposed to describe.
  4. In discussing a physical theory and mathematical challenges it raises it is useful to introduce clear concepts and basic principles to start from and then use precise and – if necessary – quite sophisticated mathematical tools to formulate the theory and to cope with those challenges.
  5. To emphasize this last point very explicitly, I am against denigrating mathematical precision and ignoring or neglecting precise mathematical tools in the search for physical theories and in attempts to understand them, derive consequences from them and apply them to solve concrete problems.
 where I have added the numbering for easy reference. Let's take them one by one.

  1. I agree completely with this comment although I may have lapsed occasionally into using the term "interpretation" loosely. So, a possibility to be investigated is that in addition to the standard formulation of quantum mechanics there may be an additional stochastic process that describes event histories.
  2. The use of "interpretation" in the second paragraph has to do with the scope and meaning of the theory. We have the physical quantities that are to be described or explained, their mathematical representation and then the various theoretical structures that can make use of these quantities in their mathematical representation. In this way it should be clear from the outset what the intended theory is about. It is about things and their mathematical representation.
  3. This statement poses more of a problem. For me, the ontology has more to do with the concerns in the previous paragraph. While I agree that there are events, there must be physical quantities that participate in these events. These quantities must also form part of the ontology. For example, atoms may be made up of more fundamental particles. The atoms and the more fundamental particles are part of the ontology, and it is part of the structure of the ontology that atoms are made up of electrons, protons, and neutrons. Neutrons and protons are made up of still more fundamental particles. I would also include fields and possible states of the physical objects in the ontology.
  4. Again, I agree. Feynman is known to have said that doing science is to stop us fooling ourselves. He was thinking primarily of comparing predictions with the outcome of experiments. However, mathematics also plays this role. By formulating the mathematics of a theory rigorously and following its consequences strictly we can avoid introducing implicit assumptions that make things work out "all right" when they should not. When we get disagreement with experiment then we can be sure that it is the initial assumptions about objects or their mathematical representation that are at fault.
  5. Fröhlich's reputation is that of an especially rigorous mathematical physicist, and not only philosophers but many physicists take such a rigorous approach to the mathematics to be rigour for rigour's sake. While I do not claim his skills, I am more than happy to try and learn from an approach that emphasises precise mathematics.

Within this "credo" Fröhlich and collaborators address:

  1. Why it is fundamentally impossible to use a physical theory to predict the future.
  2. Why quantum mechanics is probabilistic.
  3. The clarification of "locality" and "causality" in quantum mechanics.
  4. The nature of events.
  5. The evolution of states in space-time.
  6. The nature of space-time in quantum mechanics.
We will work our way through these topics in upcoming posts.

Reference
  1. Fröhlich, J. (2021). Relativistic Quantum Theory. In: Allori, V., Bassi, A., Dürr, D., Zanghi, N. (eds) Do Wave Functions Jump? Fundamental Theories of Physics, vol 198. Springer, Cham. https://doi.org/10.1007/978-3-030-46777-7_19

Monday, 9 January 2023

Conditional probability: Renyi versus Kolmogorov

Four years ago, I wrote about Renyi's axiomisation of probability that, in contrast to that of Kolmogorov, takes conditional probability as the fundamental concept. It is timely to revisit the topic given my last post on Kolmogorov's axioms.  In addition, Suarez (whose latest book was also discussed in my last post) appears to endorse the Renyi axioms over those of Kolmogorov although only in a footnote. Stephen Mumford, Rani Lill Anjum and Johan Arnt Myrstad in the book What Tends to Be, Chapter 6 also follow their analysis of conditional probability in the Kolmogorov axiomisation by taking the view that conditional probabilities should not be reducible to absolute probabilities.  

In his Foundations of Probability (1969) Renyi provided an alternative axiomisation to that of Kolmogorov that takes conditional probability as the fundamental notion; otherwise he stays as close as possible to Kolmogorov. As with the Kolmogorov axioms, I shall replace reference to events with possibilities.

Renyi's conditional probability space \((\Omega, \mathfrak{F}, \mathfrak{G}, P(F | G))\) is defined as follows. 

The set \(\Omega\) is the space of elementary possibilities and \(\mathfrak{F}\) is a \(\sigma\)-field of subsets of \(\Omega\) and \(\mathfrak{G}\), a subset of \(\mathfrak{F}\) (called the set of admissible conditions) having the properties:
(a) \( G_1, G_2 \in \mathfrak{G} \Rightarrow G_1 \cup G_2 \in \mathfrak{G}\),
(b) \(\exists \{G_n\}\),  a sequence in \(\mathfrak{G}\), such that \(\cup_{n=1}^{\infty} G_n = \Omega,\)
 (c) \(\emptyset \notin \mathfrak{G}\),
and \(P\) is the conditional probability function satisfying the following four axioms:
R0. \( P : \mathfrak{F} \times \mathfrak{G} \rightarrow [0, 1]\),
R1. \( (\forall G \in \mathfrak{G} ) , P(G | G) = 1.\)
R2. \((\forall G \in \mathfrak{G}) , P(\centerdot | G)\) , is a countably additive measure on \(\mathfrak{F}\).
R3. If \(G_1, G_2 \in \mathfrak{G}\), \(G_2 \subseteq G_1\) and \( P(G_2 | G_1) > 0\), then
$$(\forall F \in \mathfrak{F}) P(F|G_2 ) = { \frac{P(F \cap G_2 | G_1)}{P(G_2 | G_1)}}.$$
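To make this concrete, below is a minimal sketch of a finite conditional probability space on which R1 and R3 can be checked numerically. The six-element space and its weights are illustrative assumptions of mine, not part of Renyi's formulation; here \(\mathfrak{G}\) is simply every non-empty set, though it could be restricted further, as discussed later, for causal modelling.

```python
from itertools import chain, combinations
from fractions import Fraction

# Illustrative finite space: six elementary possibilities with unnormalised weights.
omega = {"a": 1, "b": 2, "c": 3, "d": 1, "e": 2, "f": 1}

def weight(s):
    return sum(omega[x] for x in s)

def P(F, G):
    """Renyi conditional probability P(F | G) for an admissible condition G."""
    return Fraction(weight(set(F) & set(G)), weight(G))

# F_field is the full power set of Omega; the admissible conditions G exclude the empty set.
F_field = [set(s) for s in chain.from_iterable(
    combinations(omega, r) for r in range(len(omega) + 1))]
G_conditions = [G for G in F_field if G]

# R1: P(G | G) = 1 for every admissible condition G.
for G in G_conditions:
    assert P(G, G) == 1

# R3: if G2 is a subset of G1 and P(G2 | G1) > 0, the ratio formula holds for every F.
for G1 in G_conditions:
    for G2 in G_conditions:
        if G2 <= G1 and P(G2, G1) > 0:
            for F in F_field:
                assert P(F, G2) == P(F & G2, G1) / P(G2, G1)

print("R1 and R3 hold for this assignment")  # R2 (additivity) holds by construction
```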
Several problems have been examined by Stephen Mumford, Rani Lill Anjum and Johan Arnt Myrstad in What Tends to Be, Chapter 6, as part of a critique of the applicability of Kolmogorov's definition of conditional probability to the ontology of dispositions that tend to cause or hinder events. These have been analysed by them using Kolmogorov's absolute probabilities, but without a careful construction of the probability space appropriate for the application. These same examples will be analysed here using both Kolmogorov's and Renyi's formulation. 

The first example that indicates a problem with absolute probability is the following (absolute probability will be denoted by \(\mu\) below to avoid confusion with Renyi's function \(P\); the \(\sigma\)-field \(\mathfrak{F}\) is the same for both).

P1. For \(A, B \in \mathfrak{F}\), let \(\mu(A) = 1\); then \(\mu(A | B) =1\), where \(\mu\) is Kolmogorov's absolute probability.

Strictly this holds only for sets \(B\) with \(\mu(B) \gt 0\). We can calculate this result from Kolmogorov's conditional probability postulate as follows: since 
$$\mu(A \cap B) = \mu(B),$$ 
$$\mu(A|B) = \mu(A \cap B)/\mu(B) = \mu(B)/\mu(B)=1.$$ 
This is not problematic within the mathematics, but Mumford et al consider it problematic if \(\mu(A|B)\) is to be understood as a degree of implication. They claim that there must exist a condition under which the probability of \(A\) decreases. They justify this through an example:
Say we strike a match and it lights. The match is lit and thus the (unconditional) probability of it being lit is one. Still, this does not entail that the match lit given that there was a strong wind. A number of conditions could counteract the match lighting, thus lowering the probability of this outcome. The match might be struck in water, the flammable tip could have been knocked off, the match could have been sucked into a black hole, and so on. 
Let us analyse this more closely. Let \(A =\) "the match lights". Then, "The match is lit and thus the (unconditional) probability of it being lit is one." is equivalent to \(\mu(A|A) = 1\). This is not controversial. They go on to bring other considerations into play and, intuitively, it seems evident that whether a match is lit or not will depend on the existing situation. For example, on whether it is wet or dry, windy or not, and whether the match is struck effectively. But this enlarges the space of elementary possibilities. In this enlarged probability space, the set \(A\) labelled "the match lights" is

$$ A=\{(\textsf{"the match is lit", "it is windy", "it is dry", "match is struck"})\} \cup \\ \{(\textsf{"the match is lit", "it is not windy", "it is dry", "match is struck"})\} \cup  \\ \{(\textsf{"the match is lit", "it is windy", "it is not dry", "match is struck"})\} \ldots$$ 
where \(......\) indicates all the other subsets (elementary possibilities) that make up \(A\).

Each elementary possibility is now a 4-tuple and \(A\) is the union of all sets consisting of a single 4-tuple in which the first item is "the match is lit". Similarly, a set can be constructed to represent \(B=\) "match struck". The probability function over the probability space is constructed from the probabilities assigned to each elementary possibility. An assignment can be made such that
$$ \mu(A|B) =1 \textsf{ or } \mu(A| B^C) =0 $$
where \(B^C\) ("match not struck") is the complement of \(B\). It would not be a physically or causally feasible allocation of probabilities to have \( \mu(A| B^C) =1 \), whereas \( \mu(A| B^C) =0 \) is. Indeed, a physically valid allocation of probabilities should give \(\mu(A \cap B^C) = 0\). All probability assignments to elementary possibilities whose 4-tuples contain both "the match is lit" and "match not struck" should be zero. The Kolmogorov ratio formula for the conditional probability would apply in the case when all the conditions are accommodated in the set of elementary possibilities. Therefore, P1 is not a problem for the Kolmogorov axioms if the probability space is appropriately modelled.
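A minimal sketch of this enlarged space is given below. The flat weights are a purely illustrative assumption of mine; the only physical constraint imposed is that every 4-tuple combining "lit" with "not struck" gets weight zero. The result is \(\mu(A \cap B^C) = 0\) and a well-defined Kolmogorov ratio for \(\mu(A|B)\).

```python
from itertools import product
from fractions import Fraction

# Enlarged space of elementary possibilities: 4-tuples (lit?, windy?, dry?, struck?).
tuples = list(product(["lit", "not lit"], ["windy", "not windy"],
                      ["dry", "not dry"], ["struck", "not struck"]))

def weight(t):
    lit, _windy, _dry, struck = t
    if lit == "lit" and struck == "not struck":
        return Fraction(0)      # physically infeasible combination gets zero weight
    return Fraction(1)          # otherwise a flat, purely illustrative weight

total = sum(weight(t) for t in tuples)

def mu(S):                      # Kolmogorov absolute probability of a set of 4-tuples
    return sum(weight(t) for t in S) / total

A  = {t for t in tuples if t[0] == "lit"}        # "the match lights"
B  = {t for t in tuples if t[3] == "struck"}     # "the match is struck"
Bc = set(tuples) - B                             # "the match is not struck"

print(mu(A & Bc))               # 0: "lit but not struck" carries no probability
print(mu(A & B) / mu(B))        # the Kolmogorov ratio mu(A | B); 1/2 for these weights
```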

Appropriate modelling is just as relevant when using the Renyi axioms. In addition, as we are working in the context of conditions influencing outcomes, we will not allow sets that are outcomes, rather than influences, to be in the set of admissible conditions \(\mathfrak{G}\). This has no effect on the analysis of P1 but, as will be discussed below, is important for modelling causal conditions.

A further problematic consequence of Kolmogorov's conditional probability, according to Mumford et al, is when \(A\) and \(B\) are probabilistically independent
P2. \(\mu (A\cap B)=\mu(A )\mu(B)\) implies \(\mu(A|B)=\mu(A).\)
This is indeed a consequence of Kolmogorov's definition. Renyi's formulation does not allow this analysis to be carried out, unless \(\mathfrak{G} = \mathfrak{F}\). Mumford et al illustrate their concern through an example:
The dispositional properties of solubility and sweetness are not generally thought to be probabilistically dependent.
Whatever is generally thought, the mathematical analysis will depend on the probability model. If two properties are probabilistically independent, then that should be captured in the model. However, the objections of Mumford et al are combined with a criticism of Adams' Thesis:
Assert(if B then A) = \(P\)(if B then A) = \(P\)(A given B) = \(P(A|B)\)

where \(P(A|B)\) is given by the Kolmogorov ratio formula. However, it should be remembered that the Kolmogorov ratio formula may simply be showing correlation, not that \(B\) causes \(A\) or that \(B\) implies \(A\) to any degree. I do not want to get into defending or challenging this thesis here, but within the Renyi axiomisation the Kolmogorov conditional probability formula only holds under the special conditions set out in R3. Independence, in Renyi's scheme, is only defined with reference to some conditioning set, \(C\) say, in which case probabilistic independence is described by the condition

$$ P(A \cap B |C) = P(A|C)P(B|C)$$
and as a consequence, it is only if \(B \subseteq C\) (with \(P(B|C) > 0\)) that
$$P(A|B) =\frac{P(A \cap B | C)}{P(B | C)} = P(A|C)$$
This means that if a set \(D\) in \(\mathfrak{G}\) that is not a subset of \(C\) is used to condition \(A \cap B\) then, in general,
$$P(A \cap B |D) \neq P(A|D)P(B|D)$$
even if
$$P(A \cap B |C) = P(A|C)P(B|C).$$
This shows that in the Renyi axiomisation statistical independence is not an absolute property of two sets. 
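A small numerical sketch, with sets and uniform weights chosen only to make the point, shows two sets that are independent relative to one conditioning set \(C\) but not relative to another condition \(D\).

```python
from fractions import Fraction

# Eight elementary possibilities with equal illustrative weights.
w = {i: Fraction(1, 8) for i in range(8)}

def P(F, G):                    # Renyi-style conditional probability P(F | G)
    return sum(w[x] for x in F & G) / sum(w[x] for x in G)

A = {0, 1, 4, 5}
B = {0, 2, 4, 6}
C = set(range(8))               # condition on the whole space
D = {0, 1, 2, 4}                # a different admissible condition

print(P(A & B, C) == P(A, C) * P(B, C))   # True:  A and B are independent given C
print(P(A & B, D) == P(A, D) * P(B, D))   # False: they are not independent given D
```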

The third objection is that, regardless of the probabilistic relation between \(A\) and \(B\), whenever the probability of \(A\) and \(B\) together is high, \(\mu(A|B)\) is high and so is \(\mu(B|A)\):
P3. \((\mu(A \cap B) \sim 1) \Rightarrow ((\mu(A|B) \sim 1) \land (\mu(B|A) \sim 1)).\)
If \(\mu(A \cap B) \sim 1\) then \(A\) and \(B\) differ at most on a set of small measure, so \(\mu(A) \sim 1\) and \(\mu(B) \sim 1\), which implies the statement in P3. Mumford et al object
The probability of the conjunction of ‘the sun will rise tomorrow’ and ‘snow is white’ is high. But this doesn’t necessarily imply that the sun rising tomorrow is conditional upon snow being white, or vice versa.
That may be the case, but the correlation between situations where both are the case is high. Once again, the problem is the identification of conditional probability with a degree of implication in Adams' Thesis. But it is well known that conditional probability may simply capture correlation. If we want to separate conditioning sets from other sets that are consequences in the \(\sigma\)-algebra generated by all elementary possibilities, then Renyi's axioms allow this. 

The Renyi equivalent of P3 is
$$ P(A \cap B|C) \sim 1 \Rightarrow (P(A|B) \sim 1, B \subseteq C) \land (P(B|A) \sim 1, A \subseteq C)$$
 
It does hold when both \(A\) and \(B\) are subsets of \(C\), but that is then a reasonable conclusion for the case of both \(A\) and \(B\) included in \(C\). However, if one of the sets is not a subset of \(C\) then it will not hold in general.  

When \(\mathfrak{G}\) is a smaller set than \(\mathfrak{F}\) it becomes useful for causal conditioning. We can exclude sets from \(\mathfrak{G}\) that are outcomes and include sets that are causes. If we are interested in the causes of snow being white, we will condition on facts of crystallography and on local conditions that may turn snow yellow, as pointed out by Frank Zappa. 

For the earlier example above, the set \(A\), "the match lights", would not be included in \(\mathfrak{G}\). So for \(C \in \mathfrak{G}\), \(P(A|C)\) is a defined probability but \(P(C|A)\) is not.

The Kolmogorov axioms are good for modelling situations where measurable sets represent events of the same status. If there are reasons to believe that some sets have the status of causal conditions for other sets then they should be modelled with Renyi's axiomisation (or some similar axiomisation) as subsets of the set of admissible conditions.

The next question is whether adopting, and modelling fully, with the Renyi scheme allows a counter to objections such as those of Humphreys (Humphreys, P. (1985) ‘Why Propensities Cannot Be Probabilities’, Philosophical Review, 94: 557–70.) to using conditional probabilities to represent dispositional probabilities. 

Friday, 23 December 2022

The Kolmogorov probability axioms and objective chance

Philosophy of Probability and Statistical Modelling by Mauricio Suárez [1] provides a historical overview of the philosophy of probability and argues for the significance of this philosophy for well-founded statistical modelling. Within this wider scope, the book discusses the themes of objective probability, propensities, and measurements. So, it has much in common with the themes of this blog. Suarez defends a model of objective probability that disentangles propensity from single-case chance and the observed sequences of outputs that result in relative frequencies of outcomes. This is close to what I argued for in Potentiality and probability, but there are differences that I will return to in a future post.

Suarez's main theme is objective probability, but he expends a lot of effort on examining subjective probabilities. I have no doubt that subjective probability has its place alongside objective probability, but an ad hoc mixture of the two is to be avoided. Subjective probability has been zealously defended by de Finetti and Jaynes, to the point of them attempting to eliminate objective probability altogether. I believe, however, that the posts in this blog make a case for objective probability as an aspect of ontology. In this post I will focus on the Kolmogorov axioms of probability. Curiously, it is only in examining subjective probability that Suarez discusses the axioms of probability.

The axioms that Suarez states are as follows.

Let \(\{E_1 , E_2, …, E_n\}\) be the set of events over which an agent's degrees of belief range; and let \(\Omega\) be an event which occurs necessarily. The axioms of probability may be expressed as follows:

Axiom 1: \(0 \le P (E) \le 1\), for any \(P (E)\): In other words, all probabilities lie in the real unit number interval.

Axiom 2: \(P (\Omega) =1\): The tautologous proposition, or the necessary event has probability one.

Axiom 3: If \(\{E_1, E_2, …, E_n\}\) are exhaustive and exclusive events, then \(P (E_1) + P(E_2) + … + P(E_n) = P (\Omega) = 1\): This is known as the addition law and is sometimes expressed equivalently as follows: If \(\{E_1, E_2, …, E_n \}\) is a set of exclusive (but not necessarily exhaustive) events then: \(P (E_1 \vee E_2 \vee … E_n) = P (E_1) + P(E_2) + … + P(E_n)\).

Axiom 4: \(P (E_1 \& E_2) = P (E_1 | E_2) P (E_2)\). This is sometimes known as the multiplication axiom, the axiom of conditional probability, or the ratio analysis of conditional probability since it expresses the conditional probability of \(E_1\) given \(E_2\).

According to Suarez, the Kolmogorov axioms are essentially equivalent to those above. The axioms that Kolmogorov published in 1933 have become the standard formulation. The axioms themselves form a short passage near the start of the book. 

 Nathan Morrison has translated the Kolmogorov axioms [2] as:

Let \(E\) be a collection of elements ξ, η, ζ, ..., which we shall call elementary events, and \(\mathfrak{F}\) a set of subsets of \(E\); the elements of the set \(\mathfrak{F}\) will be called random events

I. \(\mathfrak{F}\) is a field of sets. 

II. \(\mathfrak{F}\) contains the set \(E\).

III. To each set \(A\) in \(\mathfrak{F}\) is assigned a non-negative real number \(P(A)\). This number \(P(A)\) is called the probability of the event \(A\).

IV. \(P(E)\) equals \(1\).  

V. If \(A\) and \(B\) have no element in common, then 

$$P(A+B) = P(A)+P(B)$$

A system of sets \(\mathfrak{F}\), together with a definite assignment of numbers \(P(A)\), satisfying Axioms I-V, is called a field of probability

Terminology has moved on and it would now be usual to identify the triple \((E, \mathfrak{F}, P)\) as the probability space, with the term field reserved for the Borel field of sets \(\mathfrak{F}\). In addition, it is preferable not to use the same symbol for the addition of numbers and the union of sets. It is also important to note that ξ, η, ζ, ..., indicating the elements of \(E\), does not constrain the set of elementary events to be a countable set. This is important for applications to statistical and quantum physics where, for example, particle position and momentum can take a continuum of values. 

It is a shorthand when Kolmogorov writes in Axiom III that the number \(P(A)\) is the probability of the event \(A\). The full interpretation is that \(P(A)\) is the probability that the outcome will be an element in \(A\). This does seem to be the standard, if often implicit, understanding of the situation. In the same Axiom III, the use of the term "assigned" is also deceptive. The probabilities are more properly assigned to the singleton set with each element of \(E\) as the sole member, and from that the probability for each set in \(\mathfrak{F}\) is constructed.

The salient differences between the axioms provided by Suarez and those of Kolmogorov are:
  • Suarez provides axioms only for a discrete finite set of events
    • What he calls events are, within the restriction in the bullet above, Kolmogorov's elementary events.
    • There is therefore no need to introduce the field of sets
  • Apart from event, Suarez uses the language and symbols of propositional logic  
    • As he is dealing with probabilities as credences (degrees of belief) in this section of his book it would have been better to consistently employ the language and symbols of propositional logic.
    • This would give a mapping
                                Logical                                    Set theoretical
                    elementary propositions                       elementary events
                    Logical 'and', \(\land\) or \(\&\)          Set intersection \(\cap\)
                      Logical 'or' \(\vee\)                           Set union \(\cup\) 
                    Tautology \(\Omega\)                        Set of elementary events
  • Axiom 1 should read:  \(0 \le P (E) \le 1\), for any \(E\). This axiom is not one of Kolmogorov's but can be derived from them.
    • Caution: in the Suarez version \(E\) is an arbitrary 'event' whereas in Kolmogorov this symbol is used for the set of elementary events.
  • Kolmogorov has no equivalent to Axiom 4 in his list but introduces the equivalent formula for conditional probability later, as a definition. I think it is better to list it as one of the axioms.
As a formal structure, it would have been even better if Kolmogorov had used rigorously the language of sets and not used a term like 'event'. A set theoretic formulation with potential possibilities and numerical probability assignment would then be:

Let \(E\) be a collection of elements ξ, η, ζ, ..., which we shall call elementary possibilities and \(\mathfrak{F}\) a set of subsets of \(E\). 

I. \(\mathfrak{F}\) is a field of sets. 

II. \(\mathfrak{F}\) contains the set \(E\).

III. To each set \(A\) in \(\mathfrak{F}\) is assigned a non-negative real number \(P(A)\). This number \(P(A)\) is called the probability of \(A\), an element of \(\mathfrak{F}\).

IV. \(P(E)\) equals \(1\).  

V. If \(A\) and \(B\) in \(\mathfrak{F}\) have no element in common, then 

$$P(A \cup B) = P(A)+P(B)$$

VI. For any \(A\) and \(B\) in \(\mathfrak{F}\), with \(P(B) > 0\),

$$P (A | B) = \frac{P (A \cap B)}{P (B)}$$

is the conditional probability of \(A\) given \(B\).

Such a system of sets \(\mathfrak{F}\), \(E\), together with a definite assignment of numbers \(P(A)\) for all \(A \in \mathfrak{F}\), satisfying Axioms I-VI, is called a probability space.

I propose that this is a neutral set of axioms for a probability theory. It has the advantage that it can be applied to eventless situations such as the quantum description of a free particle. The formulation is based on the mathematics of measure theory plus a numerical probability assignment and the identification of a set of possibilities. I have added, as is often done, the definition of conditional probability as an axiom. It is possible to map this formulation to one in terms of propositions and logical operators. This move would not restrict application and interpretation to subjective probability. That would be governed by the meaning given to the numerical probability assignment. The axiom system considered here takes the numerical probability assignment as fundamental and the conditional probability is then added. It is also possible to take conditional probability as fundamental, as done by Renyi. I will discuss this formulation in a future post.

In adopting these axioms for objective chance, the collection of elements can again be called elementary events or possible events, and \(P\) is a numerical assignment furnished by a theory or estimated through experiment. The elements of \(\mathfrak{F}\) are not, in general, outcomes in the way that the elements of \(E\) are. An element of \(\mathfrak{F}\) is a set of some elements of \(E\). That this can be useful can be made clear through two examples.

A simple physical example is provided by a die with six sides; in normal game playing circumstances this provides six possibilities for which face will face upwards when thrown. These six possibilities are the elementary events \(E\) in the probability space for a dice throwing game with one die. I will also call these elementary events outcomes. These outcomes are not in \(\mathfrak{F}\), the set of subsets of \(E\) that Kolmogorov calls events. For example, the outcome "face 5 faces upwards" is not in the set of random events but the subset {"face 5 faces upwards"} is. On one throw of the die we can only obtain an element of \(E\), so what are the random events \(\mathfrak{F}\)? Consider \(P(E)\), which is equal to one. It is usual to interpret this as saying that \(E\) is the sure event. But in one throw of the die we will never get \(E\) but only one element of it. What is therefore sure is that any outcome will be in \(E\).  

So far, I have said little about \(P\) itself. In the die example \(P(\){"face 5 faces upwards"}\()\) can be estimated as the proportion of times it occurs in a long run of repetitions. For anyone familiar with the relative frequency interpretation of probability, I emphasise that this is not such an interpretation. Here the relative frequencies are an estimate of the numerical probability assignment. \(P(\){"face 5 faces upwards"}\()\) itself is the relative strength of the tendency for "face 5 faces upwards" to occur, otherwise known as the single-case chance for that event. If we consider an element of \(\mathfrak{F}\) such that \(A = \{\)"face 1 faces upwards", "face 3 faces upwards", "face 5 faces upwards"\(\}\) then \(A\) can be interpreted as "an odd valued face faces upwards". This illustrates that even in the application to objective chance it is difficult to avoid the use of propositions to give meaning to useful elements of \(\mathfrak{F}\). 
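A minimal sketch of this construction is given below, with \(P\) assigned uniformly to the singletons (an assumption justified here by the symmetry of a fair die). The probability of the element \(A\) of \(\mathfrak{F}\) is built from the singleton chances, and a long run of simulated throws only estimates it.

```python
import random
from fractions import Fraction

# Elementary possibilities (outcomes) for one throw of a fair die.
E = [1, 2, 3, 4, 5, 6]
chance = {face: Fraction(1, 6) for face in E}    # assignment to the singletons {face}

def P(A):                                        # probability of a set A in F
    return sum(chance[face] for face in A)

A_odd = {1, 3, 5}                                # "an odd valued face faces upwards"
print(P(A_odd))                                  # 1/2, built from the singleton chances

# A relative frequency over many throws merely estimates P(A_odd); it is not P itself.
throws = [random.choice(E) for _ in range(100_000)]
print(sum(t in A_odd for t in throws) / len(throws))
```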

A strength of the Kolmogorov axioms and my reformulation is the application to continuous infinite sets. An example is the case of an observation of an electron governed by the Schrödinger equation. According to quantum mechanics the probability of it being observed at any pre-designated spot is zero. In this example, all the elements of the set of elementary events have numerical probability assignment zero. This is where the field of events \(\mathfrak{F}\) is useful in providing sets with non-zero probability in which the position of the electron may be observed.
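A sketch of the point, assuming for illustration a Gaussian position probability density rather than one computed from the Schrödinger equation for a specific situation: any pre-designated point receives probability zero, while an interval, a set in \(\mathfrak{F}\), receives a non-zero probability from integrating \(|\psi(x)|^2\).

```python
import numpy as np

# Illustrative Gaussian |psi(x)|^2; any normalised probability density would do.
sigma = 1.0
def prob_density(x):
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

# Probability of observing the electron in the interval [0.5, 1.5] is non-zero.
x = np.linspace(0.5, 1.5, 10_001)
print(np.trapz(prob_density(x), x))              # roughly 0.24 for this density

# The probability of any single pre-designated point is zero: the integral over
# the degenerate interval [1.0, 1.0] vanishes.
pt = np.array([1.0, 1.0])
print(np.trapz(prob_density(pt), pt))            # 0.0
```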

As discussed above, it can be convenient to use propositions to give meaning to relevant elements of \(\mathfrak{F}\) in a specific application. However, this need not introduce any subjectivity. The subjectivity enters through interpreting \(P\) as credence or degree of belief. In applications with objective probability, \(P(A)\) is a numerical assignment of the strength of the single-case chance tendency for an elementary event to appear in set \(A\). In objective probability \(P\) is ontological but in subjective probability it is epistemic.
  1. Mauricio Suárez, Philosophy of Probability and Statistical Modelling, Elements in the Philosophy of Science, Cambridge University Press, 2021
  2. Kolmogorov, AN. (2018) Foundations of the Theory of Probability. [edition unavailable]. Dover Publications. Available at: https://www.perlego.com/book/823610/foundations-of-the-theory-of-probability-second-english-edition-pdf (Accessed: 18 December 2022).

Friday, 25 November 2022

Causation and chance in quantum mechanics

The competing and complementary concepts of causality within the dispositional or powers approach are already quite intricate, as indicated by a previous post. There are also concepts of causality that do not follow a dispositional approach and are in fact better known.

For a concise and balanced overview of the status see Anjum and Mumford, Causality, Oxford 2014. My previous post provided a commentary on Chapter 3 of "What Tends to Be: The Philosophy of Dispositional Modality", which examined a dispositional ontology for objective probability. In Chapter 4 of the same book Anjum, Mumford and Andersen examine how a dispositional theory of causality stands up against the ontological challenges of quantum mechanics. 

The understanding of causality that is still the most influential can be traced back to David Hume's sceptical analysis and Immanuel Kant's response. In A Treatise of Human Nature (1739, Book I, Part III, Section VI) Hume argued that all that can be observed in nature is a series of events. One thing happens and then another, and then another, and so on. Whether any of those events are causally connected is not itself part of the experience of events. For example, a match is struck and then almost immediately that same match lights, but what is not observed is that the striking of the match caused it to light. If there is cause and effect, then it is not known through direct observation of events. But what do our physical theories tell us? In Newtonian physics one thing happens followed by another but that following is determined by the laws of physics. The state of the world at one time determines the state of the world later. That this seems not to be the case in a world described by quantum mechanics leads to claims that causality no longer holds. In standard quantum mechanics a later state does follow deterministically from an earlier state but measurement breaks the causal link.

Heisenberg and Bohr (Walter Heitler and Léon Rosenfeld in the background), in the mid 1930s 

Now let's look at those features of classical causation that were explicitly discussed by Bohr and Heisenberg: necessitation, determinism, predictability, and separability.

Classical causation

Hume was sceptical about a necessary connection between cause and effect because none can be known from experience. However, the intuition that a cause necessitates an effect is strong and led Kant to formulate causal necessity as a precondition for science.  It may seem strange that the radical empiricist, Hume, is seen to be undermining the scientific enterprise but his analysis, according to Kant, denies the possibility of scientific explanation. Kant, in the Critique of Pure Reason, states:

[T]he very concept of a cause so obviously contains the concept of a necessity of connection with an effect and a strict universality of rule that it would be entirely lost if one sought, as Hume did, to derive it from a frequent association of that which happens with that which precedes and a habit (thus a merely subjective necessity) of connecting representations arising from that association.
Kant's influence was strong, especially in Germany, and Heisenberg felt compelled to respond given the empirical success of quantum mechanics. In the lectures "Physics and Philosophy" he uses the radioactive decay of the radium atom as an example in which there is no predictability of the timing of a decay event.  There is no decay event that follows through necessity from the prior state of the atom. There is no quantum law that necessarily connects the state of the radium atom with the decay event. However, quantum theory does predict the probability of a decay event per unit time. There is no determinacy. If necessity and determinism are characteristic of a cause, then causality does not hold in quantum mechanics. 

Heisenberg, in agreement with Bohr, argues that causality is needed but only as a classical concept for interpreting experiments. This is because we do not directly observe the decay but use a detector, and it is taken as a rule that the decay causes a chain of events that results in what is detected or measured. 

The question remains whether necessitation and determinism are characteristic of all theories of causality. A further question is whether it is coherent to have classical causal laws holding in interpreting measurements but not in atomic physics.

Without determinism and necessitation at the fundamental level, the same event would not always follow from two identical sets of initial conditions or states. Predictions in the quantum realm, prior to detection, are probabilistic in general and this provides a reason for rejecting a classical causal interpretation of quantum mechanics.

The connection between determinism and prediction is that the former provides the metaphysical ground for the latter.

In addition, the classical theory of causation has the cause and the effect as two distinct things. How distinct must these two things be? The predictions of quantum mechanics, in addition to being indeterministic, do not, according to Bohr (Causality and Complementarity, Philosophy of Science, Vol. 4, No. 3, 1937), allow the separation of the microscopic quantum system and the measuring apparatus. This gives rise to the notion that the act of measurement is in part responsible for the measured result. Cause and effect cannot be separated and so there is no clear demarcation between them. Classical causality breaks down again.

In the same paper Bohr insists that "the concept of causality underlies the very interpretation of each result of experiment, and that even in the coordination of experience one can never, in the nature of things, have to do with well-defined breaks in the causal chain."

So, for Bohr there is a theory of atomic and sub-atomic physics, quantum mechanics, but there is no separable cause and effect in that theory. After an event is detected by the measuring apparatus then classical cause and effect come into play.

Potentiality

Heisenberg was not content to leave atomic physics inexplicable and so proposed an ontology at the atomic level with substance in isolation understood as pure potentiality:
All the elementary particles are made of the same substance, which we may call energy or universal matter; they are just different forms in which matter can appear. If we compare this situation with the Aristotelian concepts of matter and form, we can say that the matter of Aristotle, which is mere ‘potentia’, should be compared to our concept of energy, which gets into ‘actuality’ by means of the form, when the elementary particle is created. (Heisenberg 1959)
Every time a potency gets actualised, causation happens and that is due to isolated substance encountering actualised matter. In mentioning particle creation Heisenberg has moved from quantum mechanics to field theory, and his claim that 'potentia' are energy-like can only be sustained in an Aristotelian sense that has little to do with the concept in physics. As will be explained below, potentiality can be introduced within quantum mechanics using the theory outlined in the previous post. The idea is that objects behave the way they do, not because of some external laws that determine what happens to them, but because of their own intrinsic dispositions and their interactions. Let's follow Anjum and Mumford and call such a theory of causality neo-Aristotelian.

Neo-Aristotelian causality does not invoke necessitation. Instead, there are irreducible tendencies. It adopts dispositional modality rather than conditional necessity. A tendency is less than necessity, so the effect is not guaranteed by its cause.

As described in the previous post, a typical example of a probabilistic disposition is the 50:50 propensity of a fair coin to land heads or tails if tossed, while a non-probabilistic disposition could be the propensity of a vase to break if dropped onto a firm surface.

Neo-Aristotelian causation is not a relation between two separate events or objects, but is a continuous, unified process that typically takes time to unfold. One way to understand an event as stochastic is that there is some objective probabilistic element involved. In other words, a chance event. This is an ontological interpretation of probability, which contrasts with the purely epistemic notion of credence or subjective probability. An individual event could therefore still be caused, in the neo-Aristotelian sense, even if it is random to some degree and not predictable. Typically, this type of causation takes place when the possibilities exist, when the effect is enabled by the right stimulus and under the right conditions. Neo-Aristotelian causation happens once the disposition manifests itself. 

An object with the potentiality to manifest as possible events will find itself in a situation where there are enabling dispositions and interfering dispositions. For example, firewood is disposed to ignite but the process leading up to it burning requires the presence of manifestation partners: a suitable site, proper ignition, enough oxygen, and so on, and that inhibitors such as dampness are not too strong. The firewood, its enablers and its inhibitors are all active in the cause of the effect. No ontological distinction is drawn between properties belonging to the object undergoing change and the contextual properties in this process. They are all, in a general sense, causes of the specific outcome. Depending on the balance the firewood may
  • Burn brightly and sustainably
  • Burn but go out quickly
  • Smolder and smoke
  • Fail to light at all.

In summary, neo-Aristotelian causation
  • Involves irreducible tendencies
  • Is not deterministic, although some process can be close to deterministic
  • Supports predictions of what tends to happen, not what will happen with certainty
  • Is a unified process, not a relationship between two distinct events.

A sketch of quantum causation

Take the electron as a typical quantum object. We need to identify what possibilities it can potentially manifest and what is needed to enable that manifestation.

The possible manifestations of the electron are values of position, momentum, spin, charge, and mass. At this level of description charge and mass are classical properties. The others are potential properties, and quantum theory provides structure to and constraints on how they can appear. This is captured in the formulation of quantum mechanics favoured in this blog that uses the generalisation of probability to a \(\sigma\)-complex.

As indicated, at this stage I only propose a sketch of how this may work. Developing the detail and confirming whether the proposed process is correct will require much more work.

For the potentiality to manifest, the electron cannot be isolated. It therefore interacts with other objects. It is proposed that the interaction selects one \(\sigma\)-algebra from the complex. That is, the context will be for certain spin values or position or momentum to manifest. This provides a preliminary selection of a \(\sigma\)-algebra from the complex and therefore a standard probability description of the tendency of the electron properties to manifest. This manifestation then takes place via a Monte-Carlo selection process from the probability distribution associated with the selected \(\sigma\)-algebra. This provides a model of the causal chain from potentiality to actuality in the quantum domain.
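As a toy sketch of this two-step picture (the contexts, observables and distributions below are my illustrative assumptions, not something derived from the \(\sigma\)-complex formalism itself): the interaction context first selects one \(\sigma\)-algebra, here "spin" with a discrete distribution or "position" with a continuous one, and Monte-Carlo selection from the associated distribution then actualises one possibility.

```python
import numpy as np

rng = np.random.default_rng(1)

# Step 1: the interaction context selects one sigma-algebra and with it a distribution.
def select_context(context):
    if context == "spin":
        return lambda: rng.choice(["up", "down"], p=[0.5, 0.5])   # discrete possibilities
    if context == "position":
        return lambda: rng.normal(loc=0.0, scale=1.0)             # continuum of possibilities
    raise ValueError("unknown context")

# Step 2: Monte-Carlo selection actualises one possibility from that distribution.
print(select_context("spin")())        # e.g. 'up'
print(select_context("position")())    # e.g. 0.345...
```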

To make this more concrete consider the analysis of the double slit experiment.  In the initial version discussed previously the slits split the quantum state to produce an interference effect. The context is already selecting position and therefore one probability distribution from those possible. At the detecting screen the position of the electron is made actual through Monte-Carlo selection from the probability distribution.

Now consider the addition of a pointer immediately after the double slit. The pointer tends to point towards the electron. The pointer description is purely quantum and is not a measurement apparatus. As my earlier analysis shows, the presence of the pointer eliminates the interference effect and selects a probability distribution for the electron position that is a normalised sum of the distributions that would obtain if the electron only passed through one slit. Again, the context has selected the probability distribution over position and at the detecting screen the position of the electron is made actual with a Monte-Carlo selection from the probability distribution.
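The sketch below illustrates the two cases numerically; the Gaussian single-slit amplitudes and all parameter values are illustrative assumptions rather than solutions of the actual scattering problem. Without the pointer the selected distribution is the normalised \(|\psi_1 + \psi_2|^2\), which shows interference; with the pointer it is the normalised sum \(|\psi_1|^2 + |\psi_2|^2\); in both cases the detected position is obtained by Monte-Carlo selection from the selected distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-10, 10, 2001)

# Illustrative single-slit amplitudes at the screen: Gaussian envelopes with a
# relative phase; the real amplitudes come from the actual experimental geometry.
def amp(x, centre, k):
    return np.exp(-(x - centre)**2 / 8.0) * np.exp(1j * k * x)

psi1, psi2 = amp(x, -1.5, 2.0), amp(x, 1.5, -2.0)

def normalise(p):
    return p / np.trapz(p, x)

p_interference = normalise(np.abs(psi1 + psi2)**2)                 # no pointer: fringes
p_no_interference = normalise(np.abs(psi1)**2 + np.abs(psi2)**2)   # pointer present

# Monte-Carlo selection of actual detection positions from a chosen distribution.
def detect(p, n=5):
    return rng.choice(x, size=n, p=p / p.sum())

print(detect(p_interference))
print(detect(p_no_interference))
```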

Wednesday, 16 November 2022

Potentiality and probability

As outlined in the previous post, Barbara Vetter (Potentiality from Dispositions to Modality Oxford University Press, 2015) developed the concept of potentiality in her theory of dispositional powers. In that theory potentials are dispositions that are responsible for the manifestation of possibilities. The possibilities then tend to become actual events or states of affairs. The concept of 'potential' in philosophy, in a sense close to that discussed here, goes back at least to Aristotle in Metaphysics Book \(\Theta\).

In contrast probabilities are weightings summing to one that describe in what proportion the possibilities tend to appear. I propose that the potential underpins the actual appearance of the possibilities while probability shapes it. This will be discussed further in this post. 

Barbara Vetter proposed a formal definition of possibility in terms of potentiality:

POSSIBILITY:  It is possible that \(p =_{def}\) Something has an iterated potentiality for it to be the case that \(p\).

 So, it is further proposed that the probabilities are the weights that can be measured through this iteration using the frequency of appearances of each possibility. Note that this indicates how probabilities can be measured but it is not a definition of probability.

In the field of disposition research there is an unfortunate proliferation of terms meaning roughly the same thing. The concept of 'power' brings out a disposition's causal role but so does 'potential'. As technical terms in the field both are dispositions. Now 'tendency' will also be introduced, and it is often used as yet another flavour of disposition.

 Tendencies

Barbara Vetter mentions tendencies in passing in her 2015 book on potentiality and, although she discusses graded dispositions, tendencies are not a major topic in that work. In "What Tends to Be: The Philosophy of Dispositional Modality" Rani Lill Anjum and Stephen Mumford (2018) provide an examination of the relationship between dispositions and probabilities while developing a substantial theory of dispositional tendency. In their treatment powers are understood as disposing towards their manifestations, rather than necessitating them. This is consistent with Vetter's potentials. Tendencies are powers that do not necessitate manifestations but nonetheless the power will iteratively cause the possibility to be the case.

In common usage a contingency is something that might possibly happen in the future. That is, it is a possibility. A more technical but still common view is that contingency is something that could be either true or false. This captures an aspect of possibility, but not completely, because there is no role for potentiality: something responsible for the possibilities. There is also logical possibility, in which anything that does not imply a contradiction is logically possible. This concept may be fine for logic, but in this discussion it is possibilities that can appear in the world that are under consideration. Here an actual possibility needs a potentiality to tend to produce it.

Example (adapted from Anjum and Mumford)

Struck matches tend to light. Although disposed to light when struck, we all know that there is no guarantee that they will light, as there are many times a struck match fails to light. But there is an iterated potentiality for it to be the case that the match lights. The lighting of a struck match is more than a mere possibility or a logical possibility. There are many mere possibilities towards which the struck match has no disposition - there is, for instance, no potential in the match for it to melt when struck.

Iterated potentiality provides the tendency for possible outcomes to show some patterns in their manifestation. In very controlled cases the number of cases of success in iterations of match striking could provide a measure of the strength of the power that is this potentiality. This would require a collection of matches that are essentially the same.

Initial discussion of probability

Anjum and Mumford introduce their discussion of probability through a simple example that builds on a familiar understanding of dispositional tendencies associated with fragility.

"The fragility of a wine glass, for instance, might be understood to be a strong disposition towards breakage with as much as 0.8 probability, whereas the fragility of a car windscreen probabilities its breaking to the lesser degree 0.3. Furthermore, it is open to a holder of such a theory to state that the probability of breakage can increase or decrease in the circumstances and, indeed, that the manifestation of the tendency occurs when and only when its probability reaches one." 

This example is merely an introduction and needs further development but already the claim that "the manifestation of the tendency occurs when and only when its probability reaches one" shows that it is not a model for objective probability. What is needed is a theory of dispositions that explains stable probability distributions. Of course, if the glass is broken then the probability of it being broken is \(1\). However, this has nothing to do with the dispositional tendency to break. What is needed is a systemic understanding of the relationship between the strength of a dispositional tendency and the values or, in the continuous case, shape of a probability distribution.

In the quoted example above each power is to be understood in terms of a probability of the occurrence of a certain effect, which is given a specific value. The fragility of a wine glass, for instance, might be understood to be a strong disposition towards breakage with as much as 0.8 probability, whereas the fragility of a car windscreen is less, and the probability of its breaking is a lesser degree 0.3. But given a wine glass or windscreen produced to certain norms and standards it would be expected that the disposition towards breakage would be quite stable. A glass with a different disposition would be a different glass.

Anjum and Mumford claim that in some understandings the manifestation of a possibility occurs when and only when its probability reaches one (see Popper, "A World of Propensities", 1990: 13, 20). This is a misunderstanding of how probability works. Popper distinguished clearly between the mathematical probability measure and what he called the physical propensity, which is more like a force, but Popper does limit a propensity to have a strength of at most \(1\). As I will attempt to show below, Popper, in proposing the propensity interpretation of objective probabilities, oversimplifies the relationship between dispositions and probabilities. This confusion led Humphreys to draft a paper (The Philosophical Review, Vol. XCIV, No. 4 (October 1985)) to show that propensities cannot be probabilities. As indeed they are not. They are dispositions. That would leave open the proposition that probabilities are dispositional tendencies, but that also will turn out to be untenable.

The proposal by Anjum and Mumford that powers can over dispose does seem to be sound. Over disposing is where there is a stronger magnitude than what is minimally needed to bring about a particular possibility. This indicates that there is a difference between the notion of having a power to some degree and the probability of the power’s manifestation occurring. Among other conclusions, this also shows that the dispositional tendency does not reduce to probability, preserving its status as a potential. 

Anjum and Mumford continue the discussion using 'propensity' as having a tendency to some degree, where degree is non-probabilistically defined. They use the notions of 'power', 'disposition' and 'tendency' more or less interchangeably, whereas an object may have a power to a degree and there are powers that are simply properties. In what follows I will try to eliminate the use of 'propensity', except where commenting on the usage of others, and use 'tendency' to qualify either 'power', 'potential' or 'disposition' rather than let it stand on its own.

A probability always takes a value within a bounded inclusive range between zero and one. If a probability is \(1\) then probability theory stipulates that it is almost certain (it occurs except for a set of cases of measure zero). In contrast to what Anjum and Mumford claim, it is not natural to interpret this as necessity, because exceptions of measure zero remain possible. For cases where there is only a finite set of possibilities, probability \(1\) does mean that there are no exceptions. But as this is a special case in applied probability theory there is no justification in equating it with logical or metaphysical necessity.
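A simple worked case (my own illustration) makes the point. Take the set of elementary possibilities to be the interval \([0, 1]\) with the uniform probability assignment and let \(A = [0, 1] \setminus \{1/2\}\). Then
$$P(A) = P([0,1]) - P(\{1/2\}) = 1 - 0 = 1,$$
yet the excluded possibility \(1/2\) can still occur; probability \(1\) here means almost certain, not necessary.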

A power must be strong enough to reach the threshold to make the possibilities actual.  Once the power is strong enough then the probability distribution over the possibilities may be stable or affected by other aspects of the situation. So, instead of understanding powers and their degrees of strength as probabilistic, powers and their tendencies towards certain manifestations are the underpinning grounds of probabilities.  Consider the example of tossing a coin.

A coin when tossed has the potential to fall either heads or tails. This tendency to fall either way can be made symmetric and then the coin is 'fair'. From this, probability weightings of \(1/2\) for each outcome (taking account of the tossing mechanism) can be assumed and then confirmed by measuring the proportion of outcomes on iteration. The reason why the head and the tail events are equally probable statistically, when a fair coin is tossed, is that the coin is equally disposed towards those two outcomes due to its physical constitution. The probability weightings derive, in this example, from a symmetry in the potentiality, which in turn derives from the physical composition and detailed geometry of the coin.


https://en.m.wikipedia.org/wiki/Rai_stones

Consider a society that uses very large stone discs as currency. On examination of the disc, it would be possible to conjecture that if it were tossed then there would be two possible outcomes and that those outcomes are equally likely. But this disposition is not realised because of the effort required to construct the tossing mechanism, as such a stone may weigh several metric tons. The enabling disposition that would give rise to the iteration of possibilities would have been this missing tossing mechanism. It is not a property of the disc. The manifestation of the dispositional tendency of the disc to come to lie in one of two states needs an external mechanism that is disposed through design to toss the coin in a certain way. If the mechanism is constructed it may be too weak. It may tend to only flip the coin once, giving a sequence such as 

... T H T H T H H T H T H T H T H T H T H ...

that would give a frequency of T close to \(1/2\) but the sequence does not exhibit the potential for random outcomes to which the coin is disposed. 
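A short simulation (my own illustration) makes the contrast explicit: a well-designed tossing mechanism and the weak, merely alternating one both give a frequency of T near \(1/2\), but only the former shows independence between successive tosses. The lag-one repeat rate used below is just one simple check of that independence.

```python
import random

def freq_T(seq):
    return seq.count("T") / len(seq)

def lag1_repeat_rate(seq):
    """Fraction of consecutive pairs with the same face; about 0.5 for independent tosses."""
    return sum(a == b for a, b in zip(seq, seq[1:])) / (len(seq) - 1)

n = 100_000
random_tosses = [random.choice("HT") for _ in range(n)]        # well-designed mechanism
alternating = ["H" if i % 2 == 0 else "T" for i in range(n)]   # weak mechanism: ...THTH...

print(freq_T(random_tosses), lag1_repeat_rate(random_tosses))  # ~0.5 and ~0.5
print(freq_T(alternating), lag1_repeat_rate(alternating))      # 0.5 but 0.0: no randomness
```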

 Probabilities and chance

From the above: potential and possibility are more fundamental than (or prior to) probability. Both are needed to construct and explain objective probability. The alternative, subjective probability, is based on beliefs about possibilities but that is not the same thing as what is actually possible and how things will appear independently of anyone's beliefs or judgements.

In this blog I have already referred to a dispositional tendency to begin to explain objective probabilities in quantum mechanics. The term propensity has been used to describe these probabilities. I now think that was wrong. Propensity should be reserved for the dispositional tendency that is responsible for the probabilities, to avoid this term merging the underpinning dispositional elements and the probability structure. Anjum and Mumford claim that they have made a key contribution to clarifying the relationship between dispositional tendencies and probability through their analysis of over disposing.

Anjum and Mumford claim "information is lost in the putative conversion of propensities to probabilities", but only if the dispositional grounding of probabilities is forgotten. Their discussion is strongly influenced by their interest in applications to medical evidence where a major goal is reduction of uncertainty. Anjum and Mumford propose two rules on how dispositions and probability relate.


  1. The more something disposes towards an effect \(e\), the more probable is \(e\), ceteris paribus; and the more something over disposes \(e\), the closer we approach probability \(P(e) =1\).
  2. There is a nonlinear ‘diminishing return’ in over disposing. E.g., if over disposing \(e\) by a magnitude \(x2\) produces a probability \(P(e) =0.98\), over disposing \(x3\) might increase that probability ‘only’ to \(P(e) =0.99\), and over disposing \(x4\) ‘only’ to \(P(e) =0.995\), and so on.

While these rules are fine as propositions, they miss the mark in explaining the relationship between dispositions and probability. In the coin tossing example strengthening the mechanism is not about strengthening one outcome. Over disposing does provide support for the distinction between the strength of the disposition and value of the probability but the relationship between underpinning potentials, dispositional mechanisms, and the iterated outcomes needs to be made clear.

Anjum and Mumford also discuss coin tossing and make substantially the same points as I made above. However, having clarified the distinction between propensity and probability, they revert to using the term propensity in a way that risks confusing the concepts of dispositional tendency and probabilities with random outcomes. They say "50% propensity" rather than "50% probability". They then introduce the term "chance" that they relate to outcomes in some specified situations. Propensity is then reserved by them for potential probability while chance is the probability of an outcome in a situation. This is more confusing than helpful.

Anjum and Mumford go on to a discussion of radioactive decay, which is known to be described by quantum theory. They make no mention of quantum theory (this will be corrected by them in Chapter 4) and strangely claim that radioactive decay is not probabilistic. The probability distributions derived from quantum mechanics unambiguously give the probability of decay per unit time. There are, per unit time, two possibilities: "decay" or "no decay". Their error is to claim "only one manifestation type" (decay) and to conclude from this that there is only one possibility. Ignoring quantum mechanics, they write:

 "The reason it is tempting to think of radioactive decay as probabilistic is that there is certainly a distinct tendency to decay that varies in strength for different kinds of particles, where that strength is specified in terms of a half-life (the time at which there is a 50/50 chance of decay having occurred)."

But no, the reason to think that radioactive decay is probabilistic is that our best theory of nuclear phenomena explains it in terms of probabilities. This misunderstanding leads them to introduce the concept of indeterministic propensities. However they have arrived at the concept, it is left open whether there are non-probabilistic indeterminate powers, but radioactive decay is not an example.
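A minimal sketch of the quantum mechanical description (the half-life value is illustrative, close to that of radium-226, and the discretisation into unit time intervals is my own simplification): the decay constant fixes the probability of decay in each interval, and per interval there are exactly the two possibilities, "decay" or "no decay".

```python
import math
import random

half_life = 1600.0                                 # illustrative value, in years
lam = math.log(2) / half_life                      # decay constant fixed by the half-life

dt = 1.0
p_decay = 1 - math.exp(-lam * dt)                  # probability of "decay" in one interval
print(p_decay)                                     # ~0.00043; the other possibility is "no decay"

# Sampling the decay time of a single nucleus from this probabilistic law:
t = 0.0
while random.random() > p_decay:                   # "no decay" in this interval
    t += dt
print(t)                                           # a single, unpredictable decay time
```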

The examples of chance provided by Anjum and Mumford can be derived from a correct application of probability theory. Chance is often used as a term for 'objective probability', and I have done so in previous posts. I will continue to follow that usage and exploit the clarification obtained from the analysis above, which shows that objective probability depends on the possibilities that are properties of an object. The manifestation of these possibilities may require an enabling mechanism. The statistical regularities displayed by these manifestations on iteration are due primarily to the physical properties of the object, unless the enabling mechanism is badly designed.

The term 'propensity' has given rise to much confusion in the literature. Now that we are reaching an explanation of objective probability the term 'propensity' might better be avoided. 

I propose that the model of objective probability is that:

OBJECTIVE PROBABILITY An object has probabilistic properties if it is physically constituted so that it has a potential to manifest possibilities that show statistical regularities.

It is possible to describe statistical regularities without invoking the term 'probability'.

Although some criticism of Anjum and Mumford is implied here, I recognise that their contribution has done much to disentangle considerations about the strength of dispositions that describe tendencies from a direct interpretation as probabilities. However, the value of the three distinctions they have identified is mixed:

  • Chance and probability are not fundamentally distinct and just require a correct application of probability theory
  • Probabilistic dispositional tendencies are distinct from non-probabilistic dispositional tendencies: this is a real and fruitful distinction
  • Deterministic and indeterministic dispositional tendencies also provide a useful distinction but it remains to be seen whether there are fundamental non-probabilistic indeterministic dispositions.

The next post will continue this theme with a discussion of dispositional tendencies in causality and quantum mechanics, engaging once more with the 2018 book by Anjum and Mumford.

 
