Probability theory

Probability theory makes a number of pragmatic abuses of notation. Many of its symbols, such as the magic function \( P \) (sometimes written \(\mathbb{P}\) or \(\mathrm{Pr}\)), appear to accept almost any expression as an argument!

Unlike in other areas of mathematics, what matters is the names of the variables rather than the order in which they appear.

Random Variables
Variable names matter!
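
One way to see this, sketched here with the common shorthand of lowercase letters for the values of \(X\) and \(Y\) (a convention assumed for illustration, not taken from the text above): the two marginals of a joint distribution are both written with the same symbol \(P\),

\[ P(x) = \sum_{y} P(x, y), \qquad P(y) = \sum_{x} P(x, y), \]

even though \(P(x)\) and \(P(y)\) are in general different functions. Only the name of the argument tells us which one is meant, so \(P(x)\) evaluated at 3 and \(P(y)\) evaluated at 3 need not agree.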

Probability Distributions
The magic function \( P \) (sometimes written \(\mathbb{P}\) or \(\mathrm{Pr}\)) can accept almost any expression as an argument! Consider:


 * When \(X\) is a random variable, \( P(X) \) is its probability distribution.
 * When \(X\) is an event, \( P(X) \) is the probability of that event with respect to some implicit underlying distribution.
 * When \(X\) and \(Y\) are random variables, \( P(X|Y) \) is the conditional probability distribution of \(X\) given \(Y\), while \( P(Y|X) \) is the conditional probability distribution of \(Y\) given \(X\).
 * \(P(A,B,C|X,Y,Z)\) may or may not be equivalent to \(P(B,A,C|Y,Z,X)\), depending on who you ask.
 * Sometimes we use \( P(X_1, X_2, X_3) \) to mean the probability of observing a sequence of events \(X_1\), \(X_2\), and \(X_3\) in that particular order, in contrast to \( P(X_2, X_1, X_3) \).
 * In Bayesian statistics, \( P(X|Y;\theta) \) or \( P_\theta(X|Y) \) means "the probability of X, given Y, parameterized by \(\theta\)".
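
The name-based reading of \( P \) can be made concrete with a small sketch. It is only an illustration under stated assumptions: the joint table `JOINT` and the helpers `prob` and `cond` are hypothetical names invented here, not part of any library. Events are dictionaries keyed by variable name, so the order in which variables are listed has no effect, while swapping the two sides of the conditioning bar does.

```python
from fractions import Fraction

# Hypothetical joint distribution over two named variables, X and Y.
# Each entry pairs an outcome (keyed by variable name) with its probability.
JOINT = [
    ({"X": 0, "Y": 0}, Fraction(1, 2)),
    ({"X": 0, "Y": 1}, Fraction(1, 4)),
    ({"X": 1, "Y": 0}, Fraction(1, 8)),
    ({"X": 1, "Y": 1}, Fraction(1, 8)),
]

def prob(event):
    """Probability that every variable named in `event` takes the given value."""
    return sum(
        p for outcome, p in JOINT
        if all(outcome[name] == value for name, value in event.items())
    )

def cond(event, given):
    """Conditional probability P(event | given); both arguments are dicts."""
    return prob({**event, **given}) / prob(given)

# Listing the variables in a different order names the same event.
same_joint = prob({"X": 0, "Y": 1}) == prob({"Y": 1, "X": 0})

# Swapping the sides of the conditioning bar gives a different quantity.
x_given_y = cond({"X": 0}, {"Y": 1})
y_given_x = cond({"Y": 1}, {"X": 0})
```

With this table, `same_joint` is `True`, while `x_given_y` and `y_given_x` differ. This corresponds to the convention in which \( P(A,B) \) and \( P(B,A) \) are interchangeable; the sequence-of-events reading, where order matters, would need a different representation.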

Expected Value
See Carpenter 2020, "Abuse of Expectation Notation".