Conditional Expectation
5.4.1 Definition
When dealing with two dependent random variables, at times we would like to determine the expectation of a random variable when the second random variable takes a particular state. The conditional expectation is a formal way of doing so.
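The conditional expectation of \(X\) given \(Y = y\) is
\[
\E[X \,|\, Y = y] = \sum_{x} x \, p_{X|Y}(x|y)
\]
for discrete random variables, and
\[
\E[X \,|\, Y = y] = \int_{-\infty}^{\infty} x \, f_{X|Y}(x|y) \, dx
\]
for continuous random variables.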
There are two points to note here. First, the expectation \(\E[X \,|\, Y = y]\) is taken with respect to the conditional distribution \(f_{X|Y}(x|y)\). We assume that the random variable \(Y\) is already fixed at the state \(Y = y\), so the only source of randomness is \(X\). Second, since the expectation \(\E[X \,|\, Y = y]\) has eliminated the randomness of \(X\), the result is a function of \(y\).
What is conditional expectation?
- \(\E[X|Y=y]\) is the expectation using \(f_{X|Y}(x|y)\).
- The integration is taken w.r.t. \(x\), because \(Y = y\) is given and fixed.
5.4.2 The law of total expectation
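The law of total expectation states that
\[
\E[X] = \sum_{y} \E[X \,|\, Y = y] \, p_Y(y), \qquad \text{or} \qquad \E[X] = \int_{-\infty}^{\infty} \E[X \,|\, Y = y] \, f_Y(y) \, dy.
\]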
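Proof. We will prove the discrete case only, as the continuous case can be proved by replacing summation with integration.
\[
\begin{aligned}
\E[X] &= \sum_{x} x \, p_X(x) = \sum_{x} x \left( \sum_{y} p_{X,Y}(x,y) \right) \\
&= \sum_{x} \sum_{y} x \, p_{X|Y}(x|y) \, p_Y(y) \\
&= \sum_{y} \left( \sum_{x} x \, p_{X|Y}(x|y) \right) p_Y(y) \\
&= \sum_{y} \E[X \,|\, Y = y] \, p_Y(y).
\end{aligned}
\]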
Figure 5.10 illustrates the idea behind the proof. Essentially, we decompose the expectation \(\E[X]\) into “subexpectations” \(\E[X|Y = y]\). Each subexpectation is weighted by the probability \(p_Y(y)\) that \(Y\) takes the state \(y\). Summing the subexpectations, each multiplied by its weight \(p_Y(y)\), gives the overall expectation.

What is the law of total expectation?
- The law of total expectation is a decomposition rule.
- It decomposes \(\E[X]\) into smaller/easier conditional expectations.
This law can also be written in a more compact form.
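Let \(X\) and \(Y\) be two random variables. Then
\[
\E[X] = \E\big[\E[X \,|\, Y]\big],
\]
where the inner expectation is taken over \(X\) given \(Y\), and the outer expectation is taken over \(Y\).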
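Proof. The previous theorem states that \(\E[X] = \sum_y \E[X|Y = y] p_Y(y)\). If we treat \(\E[X|Y = y]\) as a function of \(y\), for instance \(h(y)\), then
\[
\E[X] = \sum_{y} \E[X \,|\, Y = y] \, p_Y(y) = \sum_{y} h(y) \, p_Y(y) = \E[h(Y)] = \E\big[\E[X \,|\, Y]\big].
\]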
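The identity \(\E[X] = \E[h(Y)]\) can be checked on a small discrete model. The sketch below uses a hypothetical toy setup that is not part of the text: \(Y\) is a fair six-sided die and, given \(Y = y\), \(X\) counts the heads in \(y\) fair coin flips. The names `p_Y`, `p_X_given_Y`, and `h` are illustrative only.

```python
from fractions import Fraction
from math import comb

# Hypothetical toy model (an assumption for illustration):
# Y is a fair six-sided die, and X | Y = y ~ Binomial(y, 1/2).
p_Y = {y: Fraction(1, 6) for y in range(1, 7)}

def p_X_given_Y(x, y):
    """Conditional PMF p_{X|Y}(x|y) of Binomial(y, 1/2)."""
    return Fraction(comb(y, x), 2 ** y)

def h(y):
    """h(y) = E[X | Y = y], i.e. the sum over x of x * p_{X|Y}(x|y)."""
    return sum(x * p_X_given_Y(x, y) for x in range(y + 1))

# E[X] computed as E[h(Y)] = sum_y h(y) p_Y(y).
total_expectation = sum(h(y) * p_Y[y] for y in p_Y)

# E[X] computed directly from the joint PMF p_{X,Y}(x,y) = p_{X|Y}(x|y) p_Y(y).
direct_expectation = sum(
    x * p_X_given_Y(x, y) * p_Y[y] for y in p_Y for x in range(y + 1)
)

print(total_expectation, direct_expectation)
```

Both routes use exact rational arithmetic, so the two values can be compared with `==` rather than with a tolerance.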
Suppose there are two classes of cars. Let \(X\) be the speed of a car and \(C\) be the class. When \(C = 1\), we know that \(X \sim \text{Gaussian}(\mu_1,\sigma_1^2)\). We know that \(\Pb[C = 1] = p\). When \(C = 2\), \(X \sim \text{Gaussian}(\mu_2,\sigma_2^2)\). Also, \(\Pb[C = 2] = 1-p\). If you see a car on the freeway, what is its average speed?
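The problem has given us everything we need. In particular, we know that the conditional PDFs are:
\[
\begin{aligned}
f_{X|C}(x \,|\, 1) &= \frac{1}{\sqrt{2\pi\sigma_1^2}} \exp\left\{ -\frac{(x-\mu_1)^2}{2\sigma_1^2} \right\}, \\
f_{X|C}(x \,|\, 2) &= \frac{1}{\sqrt{2\pi\sigma_2^2}} \exp\left\{ -\frac{(x-\mu_2)^2}{2\sigma_2^2} \right\}.
\end{aligned}
\]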
Therefore, conditioned on \(C\), we have two expectations:
\[
\E[X \,|\, C = 1] = \mu_1, \qquad \E[X \,|\, C = 2] = \mu_2.
\]
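The overall expectation \(\E[X]\) is
\[
\E[X] = \E[X \,|\, C = 1] \, \Pb[C = 1] + \E[X \,|\, C = 2] \, \Pb[C = 2] = p\mu_1 + (1-p)\mu_2.
\]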
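A quick simulation can sanity-check this mixture average. The sketch below is only an illustration: the function names and the numerical values (\(p = 0.3\), \(\mu_1 = 100\), \(\sigma_1 = 10\), \(\mu_2 = 60\), \(\sigma_2 = 5\)) are assumptions chosen for the demonstration, not values from the example.

```python
import random

def simulate_mean_speed(p, mu1, sigma1, mu2, sigma2, n=200_000, seed=0):
    """Monte Carlo estimate of E[X]: draw the class C first, then X given C."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        if rng.random() < p:               # C = 1 with probability p
            total += rng.gauss(mu1, sigma1)
        else:                              # C = 2 with probability 1 - p
            total += rng.gauss(mu2, sigma2)
    return total / n

def exact_mean_speed(p, mu1, mu2):
    """Law of total expectation: E[X] = p * mu1 + (1 - p) * mu2."""
    return p * mu1 + (1 - p) * mu2

# Hypothetical parameter values, chosen only for illustration.
estimate = simulate_mean_speed(0.3, 100.0, 10.0, 60.0, 5.0)
exact = exact_mean_speed(0.3, 100.0, 60.0)
print(f"simulated: {estimate:.3f}, exact: {exact:.3f}")
```

Note that the variances \(\sigma_1^2\) and \(\sigma_2^2\) affect the spread of the simulated speeds but not the overall mean.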
Consider a joint PMF given by the following table. Find \(\E[X|Y = 10^2]\) and \(\E[X|Y = 10^4]\).
| | \(X = 0.01\) | \(X = 0.1\) | \(X = 1\) | \(X = 10\) | \(X = 100\) |
|---|---|---|---|---|---|
| \(Y = 10^4\) | 0 | 0 | \(\frac{1}{12}\) | \(\frac{1}{18}\) | \(\frac{1}{36}\) |
| \(Y = 10^2\) | \(\frac{5}{12}\) | \(\frac{5}{18}\) | \(\frac{5}{36}\) | 0 | 0 |
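To find the conditional expectation, we first need to know the conditional PMF. The marginal PMF of \(Y\) is obtained by summing each row of the table: \(p_Y(10^2) = \frac{5}{6}\) and \(p_Y(10^4) = \frac{1}{6}\). Dividing the joint PMF by these marginals gives
\[
p_{X|Y}(x \,|\, 10^2) =
\begin{cases}
\frac{1}{2}, & x = 0.01, \\
\frac{1}{3}, & x = 0.1, \\
\frac{1}{6}, & x = 1,
\end{cases}
\qquad
p_{X|Y}(x \,|\, 10^4) =
\begin{cases}
\frac{1}{2}, & x = 1, \\
\frac{1}{3}, & x = 10, \\
\frac{1}{6}, & x = 100.
\end{cases}
\]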
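Therefore, the conditional expectations are
\[
\begin{aligned}
\E[X \,|\, Y = 10^2] &= (0.01)\left(\tfrac{1}{2}\right) + (0.1)\left(\tfrac{1}{3}\right) + (1)\left(\tfrac{1}{6}\right) = \frac{123}{600}, \\
\E[X \,|\, Y = 10^4] &= (1)\left(\tfrac{1}{2}\right) + (10)\left(\tfrac{1}{3}\right) + (100)\left(\tfrac{1}{6}\right) = \frac{123}{6}.
\end{aligned}
\]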
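From the conditional expectations we can also find \(\E[X]\):
\[
\begin{aligned}
\E[X] &= \E[X \,|\, Y = 10^2] \, p_Y(10^2) + \E[X \,|\, Y = 10^4] \, p_Y(10^4) \\
&= \left(\frac{123}{600}\right)\left(\frac{5}{6}\right) + \left(\frac{123}{6}\right)\left(\frac{1}{6}\right) = 3.5875.
\end{aligned}
\]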
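The arithmetic in this example can be verified with exact rational numbers. The following is a minimal sketch that encodes the joint PMF table directly; the names `x_values`, `joint`, `marginal_Y`, `cond_expectation`, and `total_expectation` are illustrative choices, not notation from the text.

```python
from fractions import Fraction

# Support of X, in the same column order as the joint PMF table.
x_values = [Fraction(1, 100), Fraction(1, 10), Fraction(1), Fraction(10), Fraction(100)]

# Joint PMF p_{X,Y}(x, y), one row per value of Y.
joint = {
    10**4: [Fraction(0), Fraction(0), Fraction(1, 12), Fraction(1, 18), Fraction(1, 36)],
    10**2: [Fraction(5, 12), Fraction(5, 18), Fraction(5, 36), Fraction(0), Fraction(0)],
}

def marginal_Y(y):
    """p_Y(y): sum the joint PMF over all x."""
    return sum(joint[y])

def cond_expectation(y):
    """E[X | Y = y] = sum_x x * p_{X,Y}(x, y) / p_Y(y)."""
    p_y = marginal_Y(y)
    return sum(x * p / p_y for x, p in zip(x_values, joint[y]))

def total_expectation():
    """E[X] = sum_y E[X | Y = y] * p_Y(y)."""
    return sum(cond_expectation(y) * marginal_Y(y) for y in joint)

print(cond_expectation(10**2), cond_expectation(10**4), float(total_expectation()))
```

`Fraction` reduces results to lowest terms, so \(\frac{123}{600}\) appears as `41/200` and \(\frac{123}{6}\) as `41/2`.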
Consider two random variables \(X\) and \(Y\). The random variable \(X\) is Gaussian-distributed with \(X \sim \text{Gaussian}(\mu,\sigma^2)\). The random variable \(Y\) has a conditional distribution \(Y|X \sim \text{Gaussian}(X,X^2)\). Find \(\E[Y]\).
The notation \(Y|X \sim \text{Gaussian}(X,X^2)\) means that given the variable \(X\), the other variable \(Y\) has a conditional distribution \(\text{Gaussian}(X,X^2)\). That is, the variable \(Y\) is a Gaussian with mean \(X\) and variance \(X^2\). How can the mean be a random variable \(X\) and the variance be another random variable \(X^2\)? Because \(X\) is the conditioning variable. \(Y|X\) means that you have already chosen one state of \(X\). Given that particular state \(X = x\), the distribution of \(Y\) follows \(f_{Y|X}\). Therefore, for this problem, we know the PDFs:
\[
\begin{aligned}
f_X(x) &= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\}, \\
f_{Y|X}(y|x) &= \frac{1}{\sqrt{2\pi x^2}} \exp\left\{ -\frac{(y-x)^2}{2x^2} \right\}.
\end{aligned}
\]
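The conditional expectation of \(Y\) given \(X\) is
\[
\E[Y \,|\, X = x] = \int_{-\infty}^{\infty} y \, f_{Y|X}(y|x) \, dy = \int_{-\infty}^{\infty} y \, \frac{1}{\sqrt{2\pi x^2}} \exp\left\{ -\frac{(y-x)^2}{2x^2} \right\} dy = x.
\]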
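The last equality holds because we are computing the expectation of a Gaussian random variable with mean \(x\). Finally, applying the law of total expectation, we can show that
\[
\E[Y] = \int_{-\infty}^{\infty} \E[Y \,|\, X = x] \, f_X(x) \, dx = \int_{-\infty}^{\infty} x \, \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\} dx = \mu,
\]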
where the last equality is based on the fact that it is the mean of a Gaussian.
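The result \(\E[Y] = \mu\) can be checked numerically by sampling in two stages, first \(X\) and then \(Y\) given \(X\). This is a rough Monte Carlo sketch; the function name `simulate_mean_Y` and the values \(\mu = 2\), \(\sigma = 1\) are assumptions made for the demonstration.

```python
import random

def simulate_mean_Y(mu, sigma, n=200_000, seed=0):
    """Estimate E[Y] where X ~ Gaussian(mu, sigma^2) and Y | X ~ Gaussian(X, X^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(mu, sigma)
        # The conditional variance is x^2, so the standard deviation is |x|.
        y = rng.gauss(x, abs(x))
        total += y
    return total / n

# Hypothetical parameter values, chosen only for illustration.
estimate = simulate_mean_Y(mu=2.0, sigma=1.0)
print(f"simulated E[Y]: {estimate:.3f} (law of total expectation gives mu = 2.0)")
```

The two-stage sampling mirrors the structure of the derivation: the inner draw corresponds to \(\E[Y \,|\, X = x]\) and the outer draw to the average over \(f_X(x)\).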
Find \(\E[\sin(X+Y)]\) if \(X \sim \text{Gaussian}(0,1)\) and \(Y \,|\, X = x \sim \text{Uniform}[x-\pi,x+\pi]\).
We know that the conditional density is
\[
f_{Y|X}(y|x) = \frac{1}{2\pi}, \qquad x - \pi \le y \le x + \pi.
\]
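Therefore, we can compute the conditional expectation
\[
\E[\sin(X+Y) \,|\, X = x] = \int_{-\infty}^{\infty} \sin(x+y) \, f_{Y|X}(y|x) \, dy = \frac{1}{2\pi} \int_{x-\pi}^{x+\pi} \sin(x+y) \, dy = 0,
\]
because the integral covers exactly one full period of the sine function.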
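Hence, the overall expectation is
\[
\E[\sin(X+Y)] = \int_{-\infty}^{\infty} \E[\sin(X+Y) \,|\, X = x] \, f_X(x) \, dx = \int_{-\infty}^{\infty} 0 \cdot f_X(x) \, dx = 0.
\]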
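A short simulation offers a numerical check that this expectation vanishes. This is a sketch under the stated model only; the function name `simulate_sin_mean` and the sample size are illustrative assumptions.

```python
import math
import random

def simulate_sin_mean(n=200_000, seed=0):
    """Estimate E[sin(X + Y)] with X ~ Gaussian(0, 1), Y | X = x ~ Uniform[x - pi, x + pi]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rng.uniform(x - math.pi, x + math.pi)
        total += math.sin(x + y)
    return total / n

estimate = simulate_sin_mean()
print(f"simulated E[sin(X+Y)]: {estimate:.4f}")
```

The estimate should be close to zero up to Monte Carlo error, which shrinks at the rate \(1/\sqrt{n}\).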