Cox's theorem, named after the physicist
Richard Threlkeld Cox, is essentially a derivation of the rules of
Bayesian probability theory from a given set of postulates. (Some construe "Bayesian probability" as "subjective probability", but the term is also used for "logical probability" and other degree-of-belief or epistemic interpretations of probability). Cox postulated:
- Divisibility and comparability - The plausibility of a statement is a real number and is dependent on information we have related to the statement.
- Common sense - Plausibilities should vary sensibly with the plausibilities assigned in the model.
- Consistency - If the plausibility of a statement can be derived in two ways, the two results must be equal.
The postulates as stated here are taken from Arnborg and Sjödin (1999).
"Common sense" includes consistency with Aristotelian logic when
statements are completely plausible or implausible.
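In the standard presentation of the proof, the postulates can be sketched as functional equations (the notation w, F, and S here is illustrative, not taken from the article): write w(A|C) for the plausibility of A given information C. Divisibility and comparability make w real-valued, and common sense forces the plausibilities of a conjunction and a negation to depend only on the plausibilities of their parts:

```latex
% Plausibility of a conjunction, for some fixed function F:
w(A \wedge B \mid C) = F\bigl( w(A \mid C),\; w(B \mid A \wedge C) \bigr)
% Plausibility of a negation, for some fixed function S:
w(\neg A \mid C) = S\bigl( w(A \mid C) \bigr)
```

The consistency postulate then constrains the possible forms of F and S.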
The theorem implies that any plausibility model that meets the
postulates is equivalent to the subjective probability model, i.e.,
can be converted to the probability model by rescaling.
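Concretely, the rescaling can be sketched as follows (using the illustrative notation w for plausibility and g for the rescaling function, neither taken from the article): the theorem yields a monotonic function g such that the rescaled quantity p = g ∘ w obeys the familiar rules of probability:

```latex
% Product rule after rescaling:
p(A \wedge B \mid C) = p(A \mid C) \, p(B \mid A \wedge C)
% Sum rule after rescaling:
p(A \mid C) + p(\neg A \mid C) = 1
```

Any plausibility model satisfying the postulates therefore becomes a probability model under this rescaling.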
The postulates as originally stated by Cox were not mathematically
rigorous (although better than the informal description above), as
noted, e.g., by Halpern (1999a, 1999b). However, it appears to be
possible to augment them with various mathematical assumptions made
either implicitly or explicitly by Cox to produce a valid proof.
Cox's theorem has come to be used as one of the justifications for the
use of Bayesian probability theory. For example, Jaynes (1996)
discusses it in detail in chapters 1 and 2, where it is a cornerstone
for the rest of the book. Bayesianism is interpreted as a formal system of
logic, the natural extension of Aristotelian logic (in which every
statement is either true or false) into the realm of reasoning in the
presence of uncertainty.
It has been debated to what degree the theorem excludes alternative
models for reasoning about uncertainty. For example, if certain
"unintuitive" mathematical assumptions are dropped, alternatives can
be devised, such as the example provided by Halpern (1999a).
However, Arnborg and Sjödin (1999, 2000a, 2000b) suggest additional
"common sense" postulates that would allow the assumptions to be
relaxed in some cases while still ruling out the Halpern example.
Jaynes (1996) cites Abel (1826) as the first known instance of the associativity functional equation, which is used in the proof of the theorem.
The original formulation is in Cox (1946), which is extended with additional results and more discussion in Cox (1961).
Aczél (1966) refers to the "associativity equation" and lists 98 references to works that discuss it or use it, and gives a proof that doesn't require differentiability (pages 256-267).
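The way the associativity equation arises can be sketched as follows (F denotes an illustrative conjunction function, not notation from the article): consistency requires that the plausibility of A ∧ B ∧ C be the same whether it is grouped as (A ∧ B) ∧ C or as A ∧ (B ∧ C), which forces

```latex
% Abel's associativity functional equation:
F\bigl( F(x, y),\, z \bigr) = F\bigl( x,\, F(y, z) \bigr)
```

Under suitable monotonicity assumptions, the solutions have the form F(x, y) = g⁻¹(g(x) g(y)) for some invertible function g, and composing with g is exactly the rescaling that turns the conjunction rule into the product rule of probability.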
References and external links
- Niels Henrik Abel "Untersuchung der Functionen zweier unabhängig veränderlichen Gröszen x und y, wie f(x, y), welche die Eigenschaft haben, dasz f[z, f(x,y)] eine symmetrische Function von z, x und y ist.", Jour. Reine u. angew. Math. (Crelle's Jour.), 1, 11-15, (1826).
- R. T. Cox, "Probability, Frequency, and Reasonable Expectation," Am. Jour. Phys., 14, 1-13, (1946).
- R. T. Cox, The Algebra of Probable Inference, Johns Hopkins University Press, Baltimore, MD, (1961).
- Janos Aczél, Lectures on Functional Equations and their Applications, Academic Press, New York, (1966).
- Terrence L. Fine, Theories of Probability: An Examination of Foundations, Academic Press, New York, (1973).
- Edwin Thompson Jaynes, Probability Theory: The Logic of Science, Preprint: Washington University, (1996). -- http://omega.albany.edu:8008/JaynesBook.html and http://bayes.wustl.edu/etj/prob/book.pdf
- Joseph Y. Halpern, "A counterexample to theorems of Cox and Fine," Journal of AI Research, 10, 67-85, (1999) -- http://www.cs.washington.edu/research/jair/abstracts/halpern99a.html
- Joseph Y. Halpern, "Technical Addendum, Cox's Theorem Revisited," Journal of AI Research, 11, 429-435, (1999) -- http://www.cs.washington.edu/research/jair/abstracts/halpern99b.html
- Stefan Arnborg and Gunnar Sjödin, On the foundations of Bayesianism, Preprint: Nada, KTH (1999) -- ftp://ftp.nada.kth.se/pub/documents/Theory/Stefan-Arnborg/06arnborg.ps -- ftp://ftp.nada.kth.se/pub/documents/Theory/Stefan-Arnborg/06arnborg.pdf
- Stefan Arnborg and Gunnar Sjödin, A note on the foundations of Bayesianism, Preprint: Nada, KTH (2000a) -- ftp://ftp.nada.kth.se/pub/documents/Theory/Stefan-Arnborg/fobshle.ps -- ftp://ftp.nada.kth.se/pub/documents/Theory/Stefan-Arnborg/fobshle.pdf
- Stefan Arnborg and Gunnar Sjödin, "Bayes rules in finite models," in European Conference on Artificial Intelligence, Berlin, (2000b) -- ftp://ftp.nada.kth.se/pub/documents/Theory/Stefan-Arnborg/fobc1.ps -- ftp://ftp.nada.kth.se/pub/documents/Theory/Stefan-Arnborg/fobc1.pdf
- Michael Hardy, in Advances in Applied Mathematics (http://www.sciencedirect.com/science/journal/01968858), August 2002, 243-292, (or preprint: http://arxiv.org/abs/math.PR/0203249). Hardy argues that Cox's assumptions are too strong and proposes replacements for them.