3.1.1. Key Ideas

Sample space : The sample space is the set of all possible outcomes of an experiment. For example, if we toss a coin, the sample space is {heads, tails}.

Event : An event is a subset of the sample space. For example, if we toss a coin, the event “getting heads” is the subset {heads}.
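Sample spaces and events map naturally onto sets. The sketch below is an illustration only: it assumes a single roll of a six-sided die (an example not taken from the text above), and the variable names are made up for this purpose.

```python
# Sample space for one roll of a six-sided die (an assumed example).
sample_space = {1, 2, 3, 4, 5, 6}

# An event is any subset of the sample space.
even = {2, 4, 6}          # "the roll is even"
at_least_five = {5, 6}    # "the roll is at least 5"

# Both events are subsets of the sample space.
are_events = even <= sample_space and at_least_five <= sample_space

# Set operations build new events from existing ones.
both = even & at_least_five       # even AND at least 5
either = even | at_least_five     # even OR at least 5
not_even = sample_space - even    # complement of "even"

print(are_events, both, either, not_even)
```

Intersection, union, and complement correspond to "and", "or", and "not" for events.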

Probability measure : A probability measure assigns each event a value between 0 and 1 that quantifies how likely the event is to occur. For a finite or countable sample space, the probabilities of the individual outcomes sum to 1.

Probability axioms : A probability measure must satisfy three axioms: non-negativity, additivity, and normalization. Non-negativity states that the probability of any event is non-negative. Additivity states that the probability of the union of two disjoint events is the sum of their probabilities; in the general formulation this extends to any countable collection of pairwise disjoint events. Normalization states that the probability of the sample space is 1.
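The axioms can be checked numerically on a small example. The following is a minimal sketch, assuming a fair six-sided die with a uniform measure; the helper `prob` is a hypothetical name introduced here, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
sample_space = frozenset({1, 2, 3, 4, 5, 6})

def prob(event):
    """Uniform probability measure: P(E) = |E| / |sample space|."""
    event = frozenset(event)
    if not event <= sample_space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

a = {1, 2}
b = {5}   # disjoint from a

non_negative = prob(a) >= 0                      # non-negativity
additive = prob(a | b) == prob(a) + prob(b)      # additivity for disjoint events
normalized = prob(sample_space) == 1             # normalization

print(non_negative, additive, normalized)
```

Checking a few events does not prove the axioms hold in general; for the uniform measure on a finite set they follow from counting.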

Conditional probability : The conditional probability of an event A given an event B is the probability that A occurs, given that B has occurred. It is denoted by P(A|B) and, provided P(B) > 0, is defined as:

P(A|B) = P(A and B) / P(B)
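As a worked illustration of this definition, the sketch below assumes two fair six-sided dice and computes the probability that the sum is 8 given that the first die is even. The helpers `prob` and `cond_prob` are hypothetical names, not part of any library.

```python
from fractions import Fraction
from itertools import product

# Assumed example: all 36 equally likely outcomes of rolling two dice.
sample_space = set(product(range(1, 7), repeat=2))

def prob(event):
    """Uniform probability of an event (a set of outcomes)."""
    return Fraction(len(event), len(sample_space))

def cond_prob(a, b):
    """P(A|B) = P(A and B) / P(B), defined only when P(B) > 0."""
    if prob(b) == 0:
        raise ValueError("P(B) must be positive")
    return prob(a & b) / prob(b)

sum_is_8 = {o for o in sample_space if sum(o) == 8}
first_is_even = {o for o in sample_space if o[0] % 2 == 0}

print(prob(sum_is_8))                     # unconditional: 5/36
print(cond_prob(sum_is_8, first_is_even)) # conditional:   1/6
```

Conditioning on B restricts attention to the outcomes in B and rescales their probabilities so that they sum to 1.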

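Bayes’ theorem : Bayes’ theorem expresses the conditional probability P(A|B) in terms of the reverse conditional probability P(B|A) and the probabilities of A and B: P(A|B) = P(B|A) P(A) / P(B), provided P(B) > 0. It describes how to update the probability of an event in light of evidence related to it, and it is widely used in statistical inference and machine learning.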
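A common way to see Bayes’ theorem at work is a diagnostic-test calculation. The numbers below (1% prevalence, 95% sensitivity, 5% false-positive rate) are made up purely for illustration, and the variable names are hypothetical.

```python
from fractions import Fraction

# Illustrative, made-up inputs.
p_disease = Fraction(1, 100)             # prior P(D)
p_pos_given_disease = Fraction(95, 100)  # sensitivity P(+|D)
p_pos_given_healthy = Fraction(5, 100)   # false-positive rate P(+|not D)

# Law of total probability: P(+) = P(+|D) P(D) + P(+|not D) P(not D).
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' theorem: P(D|+) = P(+|D) P(D) / P(+).
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(p_disease_given_pos)         # 19/118
print(float(p_disease_given_pos))  # roughly 0.16
```

Under these assumed numbers, a positive result raises the probability of disease from 1% to about 16%, because false positives among the many healthy people outnumber true positives.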

Random variables : A random variable is a function that assigns a numerical value to each outcome of a random experiment. For example, if we toss a coin, the random variable X could be defined as follows:

X = 1 if heads
X = 0 if tails
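In code, this random variable is simply a function from outcomes to numbers. The sketch below assumes a fair coin simulated with Python's standard `random` module; the seed and the number of tosses are arbitrary choices.

```python
import random

def X(outcome):
    """Random variable from the coin example: 1 for heads, 0 for tails."""
    if outcome not in ("heads", "tails"):
        raise ValueError("outcome must be 'heads' or 'tails'")
    return 1 if outcome == "heads" else 0

# Simulate ten tosses of a fair coin (seeded so the run is repeatable).
rng = random.Random(0)
tosses = [rng.choice(["heads", "tails"]) for _ in range(10)]
values = [X(t) for t in tosses]

print(tosses)
print(values)
```

The randomness lives in the outcome of the toss; X itself is a fixed, deterministic mapping.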

Probability distribution : For a discrete random variable, the probability distribution is a function that assigns a probability to each possible value of the variable; these probabilities sum to 1. A distribution is commonly summarized by its mean, variance, and other statistical properties.
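To make this concrete, the sketch below tabulates the distribution of the sum of two fair dice. The dice example and the name `pmf` are assumptions introduced for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed example: the random variable is the sum of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes
counts = Counter(a + b for a, b in outcomes)

# Probability of each value = (number of outcomes giving it) / 36.
pmf = {value: Fraction(n, len(outcomes)) for value, n in sorted(counts.items())}

for value, p in pmf.items():
    print(value, p)

print(sum(pmf.values()))   # the probabilities sum to 1
```

The table lists every possible value from 2 to 12 with its probability, with 7 being the most likely sum.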

Expected value : The expected value of a discrete random variable is the average of its possible values, weighted by their probabilities. It is denoted by E(X) and is defined as:

E(X) = Σ x · P(X = x)

where the sum is taken over all possible values x of X.
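The formula translates directly into a short function. This is a minimal sketch, assuming the distribution is given as a dictionary that maps each value to its probability; `expected_value` is a hypothetical helper name.

```python
from fractions import Fraction

def expected_value(pmf):
    """E(X) = sum of x * P(X = x) over all possible values x."""
    return sum(x * p for x, p in pmf.items())

# Coin indicator from the example above: X = 1 for heads, 0 for tails.
coin = {1: Fraction(1, 2), 0: Fraction(1, 2)}

# Assumed extra example: one roll of a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expected_value(coin))  # 1/2
print(expected_value(die))   # 7/2
```

Neither 1/2 nor 7/2 is a value the variable can actually take: the expected value is a long-run average, not a most likely outcome.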