{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Reasoning Under Uncertainty: Basics\n",
"\n",
"The real world is full of *uncertainties*. The sources of uncertainty include *true uncertainty* (e.g., the outcome of flipping a coin is unknown in advance), *theoretical ignorance* (e.g., lack of knowledge in medical diagnosis), *laziness* (e.g., not wanting to collect useful information, such as surveying customers to understand their opinions about a product), *practical ignorance* (e.g., information missing due to security/privacy), etc.\n",
"\n",
"This lecture will introduce the basics of reasoning under uncertainty.\n",
"\n",
"# Table of Contents\n",
"1. [Random Event, Proposition, Variable, Domain](#epvd)\n",
"2. [Probability](#prob)\n",
"3. [Probability Rules](#probrules)\n",
"4. [Independence](#independence)\n",
"5. [Case Study on Weather Forecast](#case-study)\n",
"\n",
"------------"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Random Event, Proposition, Variable, Domain \n",
"\n",
"We can describe uncertainty in terms of **random events/propositions**. The following are some examples of propositions:\n",
"\n",
"1. Tomorrow will be rainy (The forecast is not always correct, and is only a guess).\n",
"2. The US stock market return will be at least 10% in 2022 (Again, this prediction cannot be guaranteed).\n",
"\n",
"A *basic* proposition can be represented by a **random variable**, which can take values from its **domain**. In example 1 above, the random variable is `weather`, whose domain is `{rainy, sunny, cloudy}`. In example 2, the random variable is `US_stock_return`, whose domain is all the real values $\\mathbb{R}$.\n",
"\n",
"For the sake of simplicity, here we only consider *finite discrete* domains. We convert continuous domains to finite discrete ones by *discretisation*. For example, in the above example 2, we can discretise the continuous domain $\\mathbb{R}$ into `{very poor, poor, ok, good, very good}`, where\n",
"\n",
"- very poor $ = (-\\infty, -20\\%]$,\n",
"- poor $ = (-20\\%, -10\\%]$,\n",
"- ok $ = (-10\\%, 10\\%)$,\n",
"- good $ = [10\\%, 20\\%)$,\n",
"- very good $ = [20\\%, \\infty)$.\n",
"\n",
"In other words, a basic proposition $A$ can be represented as \n",
"\n",
"$$A: X = x,\n",
"$$\n",
"\n",
"where $X$ is a random variable in $A$, and $x$ is a value from its domain. Examples include `weather = rainy` and `US_stock_return = good`.\n",
"\n",
"A *composite* event/proposition combines one or more *basic* events/propositions with logic operators. Given two propositions $A$ and $B$,\n",
"\n",
"- $A\\ \\mathsf{AND}\\ B$ (denoted as $A \\wedge B$): both $A$ and $B$ are true.\n",
"- $A\\ \\mathsf{OR}\\ B$ (denoted as $A \\vee B$): $A$ is true or $B$ is true.\n",
"- $\\mathsf{NOT}\\ A$ (denoted as $\\neg A$): $A$ is not true ($A$ is false).\n",
"\n",
"In the above example 2, the proposition \"$A$: The US stock market return will be at least 10% in 2022\" is a composite proposition that can be written as \n",
"- $A$: `US_stock_return = good` $\\mathsf{OR}$ `US_stock_return = very good`.\n",
"\n",
"------------"
]
},
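{
"cell_type": "markdown",
"metadata": {},
"source": [
"The discretisation above can be written as a small Python function. This is a minimal sketch: the function name `discretise_return` and the convention of writing returns as fractions (e.g., `0.12` for 12%) are our own choices.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Map a continuous stock return r (a fraction, e.g. 0.12 = 12%)\n",
"# to one of the five discrete categories defined above\n",
"def discretise_return(r):\n",
"    if r <= -0.20:\n",
"        return \"very poor\"\n",
"    elif r <= -0.10:\n",
"        return \"poor\"\n",
"    elif r < 0.10:\n",
"        return \"ok\"\n",
"    elif r < 0.20:\n",
"        return \"good\"\n",
"    else:\n",
"        return \"very good\"\n",
"\n",
"print(discretise_return(0.12))   # good\n",
"print(discretise_return(-0.25))  # very poor\n"
]
},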
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Probability \n",
"\n",
"**Probability** measures how likely an event occurs, or how likely a proposition is true. Given an event/proposition $A$, the probability that $A$ is true is denoted as $P(A)$. Probability satisfies the following axioms.\n",
"\n",
"- $0 \\leq P(A) \\leq 1$,\n",
"- $P(\\neg A) = 1-P(A)$,\n",
"- If $A$ is always true, then $P(A) = 1$; if $A$ is always false, then $P(A) = 0$,\n",
"- $\\sum_{x \\in \\Omega(X)}P(X = x) = 1$, where $\\Omega(X)$ is the domain of the random variable $X$.\n",
"\n",
"If we roll a *fair* die, where *all six numbers occur with the same probability*, the probability of each number is as follows:\n",
"\n",
"| Die | 1 | 2 | 3 | 4 | 5 | 6 |\n",
"|------|-----|-----|-----|-----|-----|-----|\n",
"| Prob | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 |\n",
"\n",
"> **QUESTION**: If we roll two fair dice, what is the probability that the total of the two dice is 11?\n",
"\n",
"---------\n",
"\n",
"<details>\n",
"<summary>Click to see the answer and explanation!</summary>\n",
"\n",
"**ANSWER**: $1/18$.\n",
"\n",
"**Explanation**: Let random variables $X_1$ and $X_2$ denote the outcomes of the two dice, both with domain $\\{1,2,3,4,5,6\\}$. Then there are $6 \\times 6 = 36$ equally likely outcomes $X_1 = x_1$, $X_2 = x_2$, where $x_1, x_2 \\in \\{1,2,3,4,5,6\\}$.\n",
"\n",
"Among the 36 outcomes, exactly two have a total of 11: (1) $X_1 = 5$, $X_2 = 6$ and (2) $X_1 = 6$, $X_2 = 5$. Since all outcomes have the same probability, we have $P(total = 11) = 2/36 = 1/18$.\n",
"</details>\n",
"\n",
"---------\n",
"\n",
"### 2.1 Joint Probability\n",
"\n",
"For a *composite* proposition with the $\\mathsf{AND}$ logic, e.g., $A\\ \\mathsf{AND}\\ B$, the probability is also called the **joint probability** $P(A,B)$.\n",
"\n",
"$$\n",
"P(A,B) = P(A\\ \\mathsf{AND}\\ B).\n",
"$$\n",
"\n",
"In other words, the joint probability $P(A,B)$ means the probability that $A$ and $B$ are both true.\n",
"\n",
"### 2.2 Unconditional/Conditional Probabilities\n",
"\n",
"The **unconditional/prior** probability reflects the belief in propositions *in the absence of any other information/evidence*. Examples include:\n",
"- The prior probability that \"**a person gets lung cancer**\" is 0.01%. Let $L$ be the proposition that a person gets lung cancer, the prior probability can be written as $P(L) = 0.01\\%$.\n",
"- The prior probability that \"**tomorrow will rain**\" is 30%. Let $R$ be the proposition that tomorrow will rain, the prior probability can be written as $P(R) = 30\\%$.\n",
"\n",
"The **conditional** probability is the updated belief in propositions *given some additional information/evidence*. Below are some examples:\n",
"- The conditional probability that \"**a person gets lung cancer**\" given that \"**the person is a regular smoker**\" is 20%. Let $L$ be the proposition that a person gets lung cancer, and $S$ be the proposition that a person is a regular smoker, the conditional probability can be written as $P(L | S) = 20\\%$.\n",
"- The conditional probability that \"**tomorrow will rain**\" given that \"**today is cloudy**\" is 60%. Let $R$ be the proposition that tomorrow will rain, and $C$ be the proposition that tomorrow is cloudy, then the conditional probability can be written as $P(R | C) = 60\\%$.\n",
"\n",
"We can see that the conditional probability of a proposition can be dramatically different from its prior probability. In other words, conditions/evidence can greatly change our belief in propositions."
]
},
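{
"cell_type": "markdown",
"metadata": {},
"source": [
"The dice answer above can also be verified by brute force, since all 36 outcomes are equally likely. A minimal sketch (the names `outcomes` and `favourable` are our own):\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Enumerate all 36 equally likely outcomes of rolling two fair dice\n",
"outcomes = [(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7)]\n",
"\n",
"# Keep the outcomes whose total is 11\n",
"favourable = [o for o in outcomes if sum(o) == 11]\n",
"\n",
"# Probability = number of favourable outcomes / number of all outcomes\n",
"print(len(favourable) / len(outcomes))  # 2/36 = 1/18\n"
]
},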
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Picking Objects from a Bag\n",
"\n",
"\n",
"\n",
"In this example, there are 18 objects in a bag with different **colours** and **shapes**. The number of objects with each colour and shape is given below.\n",
"\n",
" | | Circle | Square | Triangle | Total |\n",
" | ---| ---| ---| --- | --- |\n",
" | Blue | 4 | 2 | 3 | 9 |\n",
" | White | 3 | 3 | 3 | 9 |\n",
" | Total | 7 | 5 | 6 | 18 |\n",
"\n",
"If we randomly pick an object from the bag, the `colour` and `shape` of the picked object are random variables.\n",
"\n",
"We can calculate the probabilities by counting the numbers. For example,\n",
"- The probability of picking a `blue` object is 9/18 (there are 9 blue objects out of the 18 objects in total), i.e., `P(blue) = 9/18`.\n",
"- The probability of picking a `white` and `square` object is 3/18 (there are 3 objects with white colour and square shape), i.e., `P(white, square) = 3/18`.\n",
"- The (conditional) probability that the picked object has a `circle` shape, given that we have seen its colour is `blue`, is 4/9 (there are 4 objects with circle shape among the 9 objects with blue colour), i.e., `P(circle | blue) = 4/9`.\n",
"\n",
"------------"
]
},
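{
"cell_type": "markdown",
"metadata": {},
"source": [
"The counting arguments above can be reproduced in Python. This is a minimal sketch: the dictionary `counts` simply encodes the table of numbers.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Number of objects for each (colour, shape), taken from the table above\n",
"counts = {\n",
"    (\"blue\", \"circle\"): 4, (\"blue\", \"square\"): 2, (\"blue\", \"triangle\"): 3,\n",
"    (\"white\", \"circle\"): 3, (\"white\", \"square\"): 3, (\"white\", \"triangle\"): 3,\n",
"}\n",
"total = sum(counts.values())  # 18 objects in the bag\n",
"\n",
"# P(blue): marginal probability, summing over all shapes\n",
"n_blue = sum(n for (c, _), n in counts.items() if c == \"blue\")\n",
"p_blue = n_blue / total\n",
"\n",
"# P(white, square): joint probability\n",
"p_white_square = counts[(\"white\", \"square\")] / total\n",
"\n",
"# P(circle | blue): conditional probability, counting within the blue objects\n",
"p_circle_given_blue = counts[(\"blue\", \"circle\")] / n_blue\n",
"\n",
"print(p_blue, p_white_square, p_circle_given_blue)  # 9/18, 3/18, 4/9\n"
]
},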
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Probability Rules \n",
"\n",
"Below are the three commonly used probability rules.\n",
"\n",
"### Product Rule\n",
"\n",
"Given two propositions $A$ and $B$, we have\n",
"\n",
"$$\n",
"P(A,B) = P(A) * P(B | A).\n",
"$$\n",
"\n",
"The product rule can be explained by establishing the beliefs of $A$ and $B$ in two different ways: **parallel** and **sequential**.\n",
"\n",
"- **Parallel**: The probability that both $A$ and $B$ are true is directly denoted as $P(A,B)$ by definition.\n",
"- **Sequential**: First, we establish our belief for $A$ alone. The probability that $A$ is true is denoted as $P(A)$. Then, we establish our belief for $B$ given that $A$ is true. This conditional probability is $P(B|A)$. Combining these two beliefs together, the probability that both $A$ and $B$ are true is equivalent to \"$A$ is true\" $\\mathsf{AND}$ \"$B$ is true given $A$ is true\", which is $P(A) * P(B|A)$. \n",
"\n",
"Since we will reach the same belief through either parallel or sequential way, we have $P(A,B) = P(A) * P(B | A)$.\n",
"\n",
"In the sequential way, we can also establish our belief for $B$ first, and then $A$ given $B$. Then we can obtain $P(A,B) = P(B) * P(A | B)$. In other words, **the product rule works for any order of random variables**.\n",
"\n",
"We can further extend the product rule from two propositions to any number of propositions. Given a list of propositions $A_1, A_2, \\dots, A_n$, we have\n",
"\n",
"$$\n",
"P(A_1, A_2, \\dots, A_n) = P(A_1) * P(A_2 | A_1) * \\dots * P(A_n | A_1, A_2, \\dots, A_{n-1}).\n",
"$$\n",
"\n",
"This is the famous **[chain rule](https://en.wikipedia.org/wiki/Chain_rule_(probability))** for probability.\n",
"\n",
"### Sum Rule\n",
"\n",
"Given two discrete random variables $X$ and $Y$, with their domains as $\\Omega(X)$ and $\\Omega(Y)$, then for each $x \\in \\Omega(X)$, we have\n",
"\n",
"$$\n",
"P(X = x) = \\sum_{y \\in \\Omega(Y)} P(X = x, Y = y).\n",
"$$\n",
"\n",
"In other words, if we jointly consider a random variable $X$ with another random variable $Y$, then the *marginal* probability $P(X = x)$ is equivalent to the sum of its joint probabilities over all the possible $Y$ values.\n",
"\n",
"For example, if we consider tomorrow's weather $X_{t+1}$ jointly with today's weather $X_t$, where the weather takes three possible values $\\{sunny, rainy, cloudy\\}$, then we have\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"P(X_{t+1} = sunny) & = P(X_{t+1} = sunny, X_t = sunny) \\\\\n",
"& + P(X_{t+1} = sunny, X_t = rainy) \\\\\n",
"& + P(X_{t+1} = sunny, X_t = cloudy)\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"P(X_{t+1} = rainy) & = P(X_{t+1} = rainy, X_t = sunny) \\\\\n",
"& + P(X_{t+1} = rainy, X_t = rainy) \\\\\n",
"& + P(X_{t+1} = rainy, X_t = cloudy)\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"P(X_{t+1} = cloudy) & = P(X_{t+1} = cloudy, X_t = sunny) \\\\\n",
"& + P(X_{t+1} = cloudy, X_t = rainy) \\\\\n",
"& + P(X_{t+1} = cloudy, X_t = cloudy)\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"We can further extend the sum rule from two random variables to any number of variables. Given random variables $X_1, X_2, \\dots, X_m$ and $Y_1, Y_2, \\dots, Y_n$, then for each $x_1 \\in \\Omega(X_1), x_2 \\in \\Omega(X_2), \\dots, x_m \\in \\Omega(X_m)$, we have\n",
"\n",
"$$\n",
"P(X_1 = x_1, \\dots, X_m = x_m) = \\sum_{y_i \\in \\Omega(Y_i), i = 1, \\dots, n} P(X_1 = x_1, \\dots, X_m = x_m, Y_1 = y_1, \\dots, Y_n = y_n).\n",
"$$\n",
"\n",
"### Normalisation Rule\n",
"\n",
"Given a discrete random variable $X$ with its domain $\\Omega(X)$, we have\n",
"\n",
"$$\n",
"\\sum_{x \\in \\Omega(X)} P(X = x) = 1.\n",
"$$\n",
"\n",
"It is obvious that the random variable must take one of the values in its domain. Therefore, the probabilities of all the possible values should add up to 1 (100%).\n",
"\n",
"The normalisation rule also holds under any condition. Given any proposition $A$ as the condition, we have\n",
"\n",
"$$\n",
"\\sum_{x \\in \\Omega(X)} P(X = x | A) = 1.\n",
"$$\n",
"\n",
"In the above weather forecast example, the normalisation rule without condition can be written as:\n",
"\n",
"$$\n",
"P(X_{t+1} = sunny) + P(X_{t+1} = rainy) + P(X_{t+1} = cloudy) = 1,\n",
"$$\n",
"\n",
"Similarly, the normalisation rule under the condition $X_t = sunny$ can be written as:\n",
"\n",
"$$\n",
"P(X_{t+1} = sunny | X_t = sunny) + P(X_{t+1} = rainy | X_t = sunny) + P(X_{t+1} = cloudy | X_t = sunny) = 1,\n",
"$$\n",
"\n",
"---------"
]
},
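{
"cell_type": "markdown",
"metadata": {},
"source": [
"All three rules can be checked numerically on a small example. This is a minimal sketch: the joint distribution `joint` over two binary weather variables is made up purely for illustration.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# An arbitrary joint distribution P(X, Y) for illustration (sums to 1)\n",
"joint = {\n",
"    (\"sunny\", \"sunny\"): 0.12, (\"sunny\", \"rainy\"): 0.08,\n",
"    (\"rainy\", \"sunny\"): 0.24, (\"rainy\", \"rainy\"): 0.56,\n",
"}\n",
"\n",
"# Sum rule: P(X = x) = sum over y of P(X = x, Y = y)\n",
"def marginal_x(x):\n",
"    return sum(p for (xv, _), p in joint.items() if xv == x)\n",
"\n",
"# Conditional probability from the joint: P(Y = y | X = x) = P(x, y) / P(x)\n",
"def cond_y_given_x(y, x):\n",
"    return joint[(x, y)] / marginal_x(x)\n",
"\n",
"# Product rule: P(X = x, Y = y) = P(X = x) * P(Y = y | X = x)\n",
"lhs = joint[(\"sunny\", \"rainy\")]\n",
"rhs = marginal_x(\"sunny\") * cond_y_given_x(\"rainy\", \"sunny\")\n",
"assert abs(lhs - rhs) < 1e-12\n",
"\n",
"# Normalisation rule: the joint probabilities sum to 1\n",
"assert abs(sum(joint.values()) - 1.0) < 1e-12\n",
"\n",
"print(marginal_x(\"sunny\"))  # the marginal P(X = sunny), i.e., 0.12 + 0.08\n"
]
},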
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Independence \n",
"\n",
"Given two random variables $X$ and $Y$, if for any possible value $x \\in \\Omega(X)$, the probability $P(X = x)$ does not depend on the value of $Y$, then we say that $X$ and $Y$ are **independent** of each other, denoted as $X \\perp Y$.\n",
"\n",
"If two random variables $X$ and $Y$ are independent, then for each $x \\in \\Omega(X)$ and $y \\in \\Omega(Y)$, we have\n",
"\n",
"$$\n",
"P(X = x | Y = y) = P(X = x).\n",
"$$\n",
"\n",
"This is obvious by definition, since our belief on $P(X = x)$ does not change at all no matter whether we have information on $Y$ or not.\n",
"\n",
"Similarly, we have \n",
"\n",
"$$\n",
"P(Y = y | X = x) = P(Y = y).\n",
"$$\n",
"\n",
"Further, combined with the product rule, we have\n",
"\n",
"$$\n",
"P(X = x, Y = y) = P(X = x) * P(Y = y).\n",
"$$ \n",
"\n",
"For example, if we flip a fair coin twice, where the two outcomes `head` and `tail` have the same probability of 0.5, then the outcomes of the two flips are independent of each other. We have\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"P(flip_2 = head | flip_1 = head) & = P(flip_2 = head) \\\\\n",
"& = 0.5,\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"P(flip_1 = head, flip_2 = head) & = P(flip_1 = head) * P(flip_2 = head) \\\\\n",
"& = 0.5 * 0.5 = 0.25.\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"For more variables, we say that $X_1, \\dots, X_n$ are independent of each other, if\n",
"\n",
"$$\n",
"P(X_1, \\dots, X_n) = P(X_1) * \\dots * P(X_n).\n",
"$$\n",
"\n",
"### Conditional Independence\n",
"\n",
"The independence can also be conditional. Given random variables $X$, $Y$ and $Z$, we say that $X$ and $Y$ are **conditionally independent** given $Z$, if for each $x \\in \\Omega(X)$, $y \\in \\Omega(Y)$ and $z \\in \\Omega(Z)$, we have\n",
"\n",
"$$\n",
"P(X = x, Y = y | Z = z) = P(X = x | Z = z) * P(Y = y | Z = z).\n",
"$$\n",
"\n",
"> We can see that the formula of conditional independence is obtained by adding the condition (e.g., $\\mid Z = z$) to each probability in the unconditional formula.\n",
"\n",
"For more variables, we say that $X_1, \\dots, X_n$ are independent given $Y_1, \\dots, Y_m$, if \n",
"\n",
"$$\n",
"P(X_1, \\dots, X_n | Y_1, \\dots, Y_m) = P(X_1 | Y_1, \\dots, Y_m) * \\dots * P(X_n | Y_1, \\dots, Y_m).\n",
"$$\n",
"\n",
"--------------"
]
},
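{
"cell_type": "markdown",
"metadata": {},
"source": [
"The coin-flip example above can be checked by enumerating the four equally likely outcomes. A minimal sketch (the variable names are our own):\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from itertools import product\n",
"\n",
"# All 4 equally likely outcomes of flipping a fair coin twice\n",
"outcomes = list(product([\"head\", \"tail\"], repeat=2))\n",
"p = 1 / len(outcomes)  # each outcome has probability 0.25\n",
"\n",
"# Joint probability P(flip_1 = head, flip_2 = head)\n",
"p_joint = sum(p for o in outcomes if o == (\"head\", \"head\"))\n",
"\n",
"# Marginals P(flip_1 = head) and P(flip_2 = head)\n",
"p1 = sum(p for o in outcomes if o[0] == \"head\")\n",
"p2 = sum(p for o in outcomes if o[1] == \"head\")\n",
"\n",
"# Independence: the joint equals the product of the marginals\n",
"assert abs(p_joint - p1 * p2) < 1e-12\n",
"print(p_joint, p1 * p2)  # 0.25 0.25\n"
]
},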
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Case Study on Weather Forecast \n",
"\n",
"In this case study, we aim to forecast the weather in the future. The weather is denoted as the random variable $X$, and its domain (the possible weather states) is $\\{sunny, rainy\\}$.\n",
"\n",
"For the weather tomorrow, denoted as $X_t$, the forecast says that it will be sunny with a probability of 20%, and rainy with a probability of 80% (obviously, they sum up to 100% due to the normalisation rule). That is,\n",
"\n",
"$$\n",
"P(X_t = sunny) = 0.2,\\\\\n",
"P(X_t = rainy) = 0.8.\n",
"$$\n",
"\n",
"We also know that the weather tends to change smoothly over time. Thus, the weather for the day after tomorrow, denoted as $X_{t+1}$, depends on the weather tomorrow as follows.\n",
"\n",
"- If tomorrow is sunny, then the day after tomorrow will be sunny with probability of 60%, and rainy with probability of 40%.\n",
"- If tomorrow is rainy, then the day after tomorrow will be sunny with probability of 30%, and rainy with probability of 70%.\n",
"\n",
"In other words, we have\n",
"\n",
"$$\n",
"P(X_{t+1} = sunny | X_t = sunny) = 0.6, \\\\\n",
"P(X_{t+1} = rainy | X_t = sunny) = 0.4,\n",
"$$\n",
"\n",
"$$\n",
"P(X_{t+1} = sunny | X_t = rainy) = 0.3, \\\\\n",
"P(X_{t+1} = rainy | X_t = rainy) = 0.7.\n",
"$$\n",
"\n",
"### QUESTION\n",
"\n",
"What is the probability that the day after tomorrow will be sunny?\n",
"\n",
"### Answer\n",
"\n",
"The probability that the day after tomorrow will be sunny can be written as $P(X_{t+1} = sunny)$.\n",
"\n",
"First, we can use the **sum rule** to take the weather tomorrow into consideration as follows:\n",
"\n",
"$$\n",
"P(X_{t+1} = sunny) = P(X_{t+1} = sunny, X_t = sunny) + P(X_{t+1} = sunny, X_t = rainy).\n",
"$$\n",
"\n",
"Then, for each term on the right hand side, we can use the **product rule** that considers $X_t$ first, and then $X_{t+1}$ given $X_t$. Specifically, we have\n",
"\n",
"$$\n",
"P(X_{t+1} = sunny, X_t = sunny) = P(X_t = sunny) * P(X_{t+1} = sunny | X_t = sunny), \\\\\n",
"P(X_{t+1} = sunny, X_t = rainy) = P(X_t = rainy) * P(X_{t+1} = sunny | X_t = rainy).\n",
"$$\n",
"\n",
"All the above probabilities on the right hand side are known from the forecast. Therefore, we have\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"P(X_{t+1} = sunny, X_t = sunny) & = P(X_t = sunny) * P(X_{t+1} = sunny | X_t = sunny) \\\\\n",
"& = 0.2 \\times 0.6 = 0.12,\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"P(X_{t+1} = sunny, X_t = rainy) & = P(X_t = rainy) * P(X_{t+1} = sunny | X_t = rainy) \\\\\n",
"& = 0.8 \\times 0.3 = 0.24.\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"Therefore, we have \n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"P(X_{t+1} = sunny) & = P(X_{t+1} = sunny, X_t = sunny) + P(X_{t+1} = sunny, X_t = rainy) \\\\\n",
"& = 0.12 + 0.24 = 0.36.\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"--------"
]
},
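{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two-step calculation above takes only a few lines of Python. This is a minimal sketch: the dictionaries `p_t` and `p_next_given_t` are our own encoding of the forecast.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Forecast for tomorrow: P(X_t)\n",
"p_t = {\"sunny\": 0.2, \"rainy\": 0.8}\n",
"\n",
"# Transition probabilities: P(X_{t+1} | X_t)\n",
"p_next_given_t = {\n",
"    \"sunny\": {\"sunny\": 0.6, \"rainy\": 0.4},\n",
"    \"rainy\": {\"sunny\": 0.3, \"rainy\": 0.7},\n",
"}\n",
"\n",
"# Sum rule + product rule:\n",
"# P(X_{t+1} = sunny) = sum over x of P(X_t = x) * P(X_{t+1} = sunny | X_t = x)\n",
"p_sunny = sum(p_t[x] * p_next_given_t[x][\"sunny\"] for x in p_t)\n",
"print(p_sunny)  # 0.2 * 0.6 + 0.8 * 0.3 = 0.36 (up to floating-point rounding)\n"
]
},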
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The original Jupyter Notebook can be downloaded [here](https://homepages.ecs.vuw.ac.nz/~yimei/tutorials/reasoning-under-uncertainty-basics.ipynb).\n",
"\n",
"More tutorials can be found [here](https://meiyi1986.github.io/tutorials/).\n",
"\n",
"[Yi Mei's homepage](https://meiyi1986.github.io/)"
]
}
],
"metadata": {
"colab": {
"authorship_tag": "ABX9TyOLYiX4MNVN+YKTRqYwBPrZ",
"collapsed_sections": [],
"name": "probability_product_rule.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
}
},
"nbformat": 4,
"nbformat_minor": 1
}