Probability for Data Science
Section 5.5

Sum of Two Random Variables

One typical problem we encounter in engineering is to determine the PDF of the sum of two random variables \(X\) and \(Y\), i.e., \(X+Y\). Such a problem arises naturally when we want to evaluate the average of many random variables, e.g., the sample mean of a collection of data points. This section will discuss a general principle for determining the PDF of a sum of two random variables.

5.5.1 Intuition through convolution

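First, consider two independent random variables, \(X\) and \(Y\), both discrete uniform random variables taking values in \(\{0,1,2,3\}\). That is, \(p_X(x) = p_Y(y) = [1/4, 1/4, 1/4, 1/4]\). Since this is such a simple problem, we can enumerate all the possible cases of the sum \(Z = X+Y\). Each pair \((X,Y)\) has probability \(1/4 \cdot 1/4 = 1/16\), so the probability of each value of \(Z\) is the number of pairs that produce it divided by 16. The resulting probabilities are shown in the following table.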

| \(Z = X+Y\) | Cases, written in terms of \((X, Y)\) | Probability |
|---|---|---|
| 0 | (0,0) | 1/16 |
| 1 | (0,1), (1,0) | 2/16 |
| 2 | (1,1), (2,0), (0,2) | 3/16 |
| 3 | (3,0), (2,1), (1,2), (0,3) | 4/16 |
| 4 | (3,1), (2,2), (1,3) | 3/16 |
| 5 | (3,2), (2,3) | 2/16 |
| 6 | (3,3) | 1/16 |

Clearly, the PMF of \(Z\) is not \(p_Z(z) = p_X(x) + p_Y(y)\). (Caution! Do not write this.) The PMF of \(Z\) looks like a triangle. How can we get to this triangular PMF from two uniform PMFs? The key is the idea of convolution. Let us start with the PMF of \(X\), which is \(p_X(x)\). Let us also flip \(p_Y(y)\) about the vertical axis. As we shift the flipped \(p_Y\), we multiply the overlapping PMF values and add the products, as shown in Figure 5.11. When the flipped \(p_Y\) overlaps \(p_X\) only at \(x = 0\), this gives us

$$\begin{aligned} p_Z(0) &= \Pb[X+Y = 0] \\ &= \Pb[(X,Y) = (0,0)] \\ &= p_X(0)p_Y(0) \\ &= \frac{1}{16}. \end{aligned}$$

Now, if we shift towards the right by 1, we have

$$\begin{aligned} p_Z(1) &= \Pb[X+Y = 1] \\ &= \Pb[(X,Y) \in \{ (0,1), (1,0)\}] \\ &= p_X(0)p_Y(1) + p_X(1)p_Y(0) = \frac{2}{16}. \end{aligned}$$

By continuing our argument, you can see that we will obtain the same PMF as the one shown in the table.

Figure 5.11. When summing two independent random variables \(X\) and \(Y\), we are effectively taking the convolution of the two respective PMFs / PDFs.
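The flip-shift-multiply-add procedure can be checked with a few lines of Python. The following is a minimal sketch, not code from the book: the helper name `convolve_pmf` is hypothetical, and it assumes both PMFs are given as lists indexed by the values \(0, 1, 2, \ldots\). Exact fractions are used so that the result can be compared directly with the table (fractions print in lowest terms, so 2/16 appears as 1/8).

```python
from fractions import Fraction

def convolve_pmf(p, q):
    """PMF of X + Y for independent X ~ p and Y ~ q (hypothetical helper).

    p[i] is P[X = i] and q[j] is P[Y = j], both supported on 0, 1, 2, ...
    """
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            # The pair (X, Y) = (i, j) contributes to the event Z = i + j.
            out[i + j] += pi * qj
    return out

p_X = [Fraction(1, 4)] * 4   # uniform on {0, 1, 2, 3}
p_Y = [Fraction(1, 4)] * 4
p_Z = convolve_pmf(p_X, p_Y)

for z, prob in enumerate(p_Z):
    print(z, prob)
```

The double loop is the enumeration of cases from the table: every pair \((i, j)\) is visited once and its probability is added to the entry for \(z = i + j\).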

5.5.2 Main result

We can show that for any two independent random variables \(X\) and \(Y\), the sum \(Z = X+Y\) has a PDF that is the convolution of the two individual PDFs.

Theorem 5.10

Let \(X\) and \(Y\) be two independent random variables with PDFs \(f_X(x)\) and \(f_Y(y)\) respectively. Let \(Z = X+Y\). The PDF of \(Z\) is given by

$$f_Z(z) = (f_X \ast f_Y)(z) = \int_{-\infty}^{\infty} f_X(z-y) f_Y(y) \;dy,$$

where “\(\ast\)” denotes the convolution.

Proof. We begin by analyzing the CDF of \(Z\). The CDF of \(Z\) is

$$\begin{aligned} F_Z(z) = \Pb[Z \le z] = \Pb[X+Y \le z]. \end{aligned}$$

We now draw a picture to illustrate the region over which we want to integrate. As shown in Figure 5.12, the inequality \(X+Y\le z\) defines a half-plane in the \(xy\) plane whose boundary is the straight line \(X + Y = z\). You can think of the boundary as \(Y = -X + z\), so that the slope is \(-1\) and the \(y\)-intercept is \(z\).

Now, shall we take the region above the line or below it? Since the inequality is \(Y \le -X + z\), a value of \(Y\) has to be no greater than that of the line. Another easy way to check is to assume \(z > 0\) so that we have a positive \(y\)-intercept. Then we check where the origin \((0,0)\) belongs. In this case, if \(z > 0\), the origin \((0,0)\) satisfies the inequality \(Y \le -X + z\), and so it must be included. Thus, we conclude that the region is below the line.

Figure 5.12. The shaded region highlights the set \(X + Y \le z\). To integrate the PDF over this region, we first take the inner integration over \(dx\) and then take the outer integration over \(dy\).

Once we have determined the area to be integrated, we can write down the integration:

$$\begin{aligned} \Pb[X+Y \le z] &= \int_{-\infty}^{\infty} \int_{-\infty}^{z-y} f_{X,Y}(x,y) \;dx\;dy\\ &= \int_{-\infty}^{\infty} \int_{-\infty}^{z-y} f_X(x) f_Y(y) \;dx\;dy, \quad \text{(independence)} \end{aligned}$$

where the integration limits are just a rewrite of \(X+Y \le z\) (in this case, since we are integrating \(x\) first, we have \(X \le -Y + z\)). Then, by the fundamental theorem of calculus, we can show that

$$\begin{aligned} f_Z(z) = \frac{d}{dz} F_Z(z) &= \frac{d}{dz} \int_{-\infty}^{\infty} \int_{-\infty}^{z-y} f_X(x) f_Y(y) \;dx\;dy\\ &= \int_{-\infty}^{\infty} \left(\frac{d}{dz} \int_{-\infty}^{z-y} f_X(x) f_Y(y) \;dx \right) \;dy \\ &= \int_{-\infty}^{\infty} f_X(z-y) f_Y(y) \;dy = (f_X \ast f_Y)(z), \end{aligned}$$

where “\(\ast\)” denotes the convolution.

How is convolution related to random variables?
  • If you sum two independent random variables \(X\) and \(Y\), the resulting PDF is the convolution of \(f_X\) and \(f_Y\).
  • E.g., convolving two uniform PDFs gives you a triangle PDF.
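The convolution integral in Theorem 5.10 can also be approximated numerically. The sketch below is an illustration under stated assumptions rather than the book's own code: it takes \(X\) and \(Y\) to be independent Uniform(0, 1) variables, approximates the integral with a midpoint rule, and compares the result with the well-known triangle PDF of their sum. The function names and the grid parameters `lo`, `hi`, and `n` are hypothetical choices.

```python
def uniform_pdf(x):
    """PDF of Uniform(0, 1)."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def convolve_at(f_X, f_Y, z, lo=-1.0, hi=2.0, n=3000):
    """Midpoint-rule approximation of the integral of f_X(z - y) f_Y(y) dy.

    The interval [lo, hi] is assumed to cover the support of f_Y.
    """
    dy = (hi - lo) / n
    total = 0.0
    for k in range(n):
        y = lo + (k + 0.5) * dy
        total += f_X(z - y) * f_Y(y)
    return total * dy

def triangle_pdf(z):
    """Closed-form PDF of the sum of two independent Uniform(0, 1) variables."""
    if 0.0 <= z <= 1.0:
        return z
    if 1.0 < z <= 2.0:
        return 2.0 - z
    return 0.0

for z in [0.25, 0.5, 1.0, 1.5, 1.75]:
    print(z, convolve_at(uniform_pdf, uniform_pdf, z), triangle_pdf(z))
```

The two printed columns should agree up to the discretization error of the grid, which is on the order of the step size `dy` because the uniform PDF is discontinuous at its endpoints.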

5.5.3 Sum of common distributions

Theorem 5.11 (Sum of two Poissons)

Let \(X_1 \sim \text{Poisson}(\lambda_1)\) and \(X_2 \sim \text{Poisson}(\lambda_2)\) be independent. Then

$$X_1 + X_2 \sim \text{Poisson}(\lambda_1 + \lambda_2).$$

Proof. Let \(Z = X_1 + X_2\) and apply the convolution principle.

$$\begin{aligned} p_Z(k) &= \Pb[X_1 + X_2 = k] \\ &= \sum_{\ell=0}^{k} \Pb[X_1 = \ell \;\cap\; X_2 = k - \ell]\\ &= \sum_{\ell=0}^{k} \frac{\lambda_1^{\ell} e^{-\lambda_1}}{\ell!} \cdot \frac{\lambda_2^{k-\ell} e^{-\lambda_2}}{(k-\ell)!} \quad \text{(independence)}\\ &= e^{-(\lambda_1 + \lambda_2)} \sum_{\ell=0}^{k} \frac{\lambda_1^{\ell}}{\ell!} \cdot \frac{\lambda_2^{k-\ell}}{(k-\ell)!}\\ &= e^{-(\lambda_1 + \lambda_2)} \cdot \textcolor{red}{\frac{1}{k!}} \underset{=\sum_{\ell=0}^k {k \choose \ell} \lambda_1^{\ell}\lambda_2^{k-\ell}}{\underbrace{\sum_{\ell=0}^{k} \frac{\textcolor{red}{k!}}{\ell! (k-\ell)!} \lambda_1^{\ell} \lambda_2^{k-\ell}}} \\ &= \frac{\left(\lambda_1+\lambda_2\right)^k }{k!}e^{-(\lambda_1 + \lambda_2)}, \end{aligned}$$

where the last step is based on the binomial identity \(\sum_{\ell=0}^k {k \choose \ell} a^{\ell}b^{k-\ell} = (a+b)^k\).
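As a sanity check on the Poisson result, the convolution sum in the proof can be evaluated numerically and compared with the \(\text{Poisson}(\lambda_1+\lambda_2)\) PMF. This is a small sketch with arbitrarily chosen rates `lam1 = 1.5` and `lam2 = 2.5`; these values and the helper name `poisson_pmf` are illustrative assumptions.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P[X = k] for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)

lam1, lam2 = 1.5, 2.5   # arbitrary rates chosen for illustration

max_err = 0.0
for k in range(21):
    # Convolution sum from the proof: sum over l of p_{X1}(l) * p_{X2}(k - l).
    conv = sum(poisson_pmf(l, lam1) * poisson_pmf(k - l, lam2) for l in range(k + 1))
    direct = poisson_pmf(k, lam1 + lam2)
    max_err = max(max_err, abs(conv - direct))

print(max_err)   # should be at the level of floating-point round-off
```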

Theorem 5.12 (Sum of two Gaussians)

Let \(X_1\) and \(X_2\) be two independent Gaussian random variables such that $$X_1 \sim \text{Gaussian}(\mu_1,\sigma_1^2) \;\quad\text{and}\quad\; X_2 \sim \text{Gaussian}(\mu_2,\sigma_2^2).$$ Then

$$X_1 + X_2 \sim \text{Gaussian}(\mu_1 + \mu_2, \sigma_1^2+\sigma_2^2).$$

Proof. Let us apply the convolution principle by defining \(Z = X_1 + X_2\). Then,

$$\begin{aligned} f_{Z}(z) &= \int_{-\infty}^{\infty} f_{X_1}(t)f_{X_2}(z-t)\;dt \\ &= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma_1^2}} \exp\left\{-\frac{(t-\mu_1)^2}{2\sigma_1^2}\right\} \cdot \frac{1}{\sqrt{2\pi\sigma_2^2}} \exp\left\{-\frac{(z-t-\mu_2)^2}{2\sigma_2^2}\right\} \;dt. \end{aligned}$$

We now complete the square:

$$\begin{aligned} \frac{(t-\mu_1)^2}{2\sigma_1^2} + \frac{(z-t-\mu_2)^2}{2\sigma_2^2} &= \frac{\sigma_2^2(t-\mu_1)^2 + \sigma_1^2(z-t-\mu_2)^2}{2\sigma_1^2\sigma_2^2}\\ &\hspace{-12ex}= \frac{\sigma_1^2+\sigma_2^2}{2\sigma_1^2\sigma_2^2}\left[t - \frac{\sigma_2^2\mu_1 + \sigma_1^2(z-\mu_2)}{\sigma_1^2+\sigma_2^2}\right]^2 + \frac{(z-\mu_1-\mu_2)^2}{2(\sigma_1^2+\sigma_2^2)}. \end{aligned}$$

Substituting this into the integral, we can show that

$$\begin{aligned} f_Z(z) &= \frac{1}{2\pi\sqrt{\sigma_1^2\sigma_2^2}} \exp\left\{-\frac{(z-\mu_1-\mu_2)^2}{2(\sigma_1^2+\sigma_2^2)}\right\} \\ &\quad\cdot\underset{=\sqrt{2\pi \cdot \frac{\sigma_1^2\sigma_2^2}{\sigma_1^2+\sigma_2^2}}}{\underbrace{\int_{-\infty}^{\infty} \exp\left\{-\frac{\sigma_1^2+\sigma_2^2}{2\sigma_1^2\sigma_2^2}\left[t - \frac{\sigma_2^2\mu_1 + \sigma_1^2(z-\mu_2)}{\sigma_1^2+\sigma_2^2}\right]^2\right\} \;dt}} \\ &= \frac{1}{\sqrt{2\pi(\sigma_1^2+\sigma_2^2)}} \exp\left\{-\frac{(z-\mu_1-\mu_2)^2}{2(\sigma_1^2+\sigma_2^2)}\right\}. \end{aligned}$$

Therefore, we have shown that the resulting distribution is a Gaussian with mean \(\mu_1+\mu_2\) and variance \(\sigma_1^2+\sigma_2^2\).
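The Gaussian result can be verified numerically by evaluating the convolution integral on a grid and comparing it with the closed-form Gaussian PDF. The sketch below assumes particular values for the means and variances, and the integration range `[-40, 40]` and grid size are hypothetical choices that are wide and fine enough for these parameters.

```python
from math import exp, pi, sqrt

def gaussian_pdf(x, mu, var):
    """PDF of Gaussian(mu, var), where var is the variance."""
    return exp(-(x - mu) ** 2 / (2.0 * var)) / sqrt(2.0 * pi * var)

# Illustrative parameters (assumptions, not from the text).
mu1, var1 = 1.0, 4.0
mu2, var2 = -2.0, 1.0

def conv_pdf(z, lo=-40.0, hi=40.0, n=8000):
    """Midpoint-rule approximation of the integral of f_{X1}(t) f_{X2}(z - t) dt."""
    dt = (hi - lo) / n
    total = 0.0
    for k in range(n):
        t = lo + (k + 0.5) * dt
        total += gaussian_pdf(t, mu1, var1) * gaussian_pdf(z - t, mu2, var2)
    return total * dt

zs = [-4.0, -1.0, 0.0, 2.0]
errors = [abs(conv_pdf(z) - gaussian_pdf(z, mu1 + mu2, var1 + var2)) for z in zs]
print(max(errors))
```

Because the integrand is smooth and decays rapidly, the midpoint rule is very accurate here, so the printed error should be tiny.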

Practice Exercise 5.11

Let \(X\) and \(Y\) be independent, and let

$$f_X(x) = \begin{cases} xe^{-x}, &\quad x \ge 0,\\ 0 , &\quad x < 0, \end{cases} \quad\mbox{and}\quad f_Y(y) = \begin{cases} ye^{-y}, &\quad y \ge 0,\\ 0 , &\quad y < 0. \end{cases}$$

Find the PDF of \(Z = X+Y\).

Solution

Using the results derived above, we see that

$$\begin{aligned} f_Z(z) &= \int_{-\infty}^{\infty} f_X(z-y) f_Y(y) \;dy \\ &= \int_{-\infty}^{z} f_X(z-y) f_Y(y) \;dy, \end{aligned}$$

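where the upper limit \(z\) comes from the fact that \(f_X(z-y)\) is nonzero only when \(z - y \ge 0\): since \(Z = X + Y\) and \(X \ge 0\), we must have \(Z - Y = X \ge 0\) and so \(Y \le Z\). This is portrayed graphically in Figure 5.13. Similarly, \(f_Y(y)\) is nonzero only when \(y \ge 0\), so the lower limit becomes \(0\). Substituting the PDFs into the integration yields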

$$\begin{aligned} f_Z(z) &= \int_{0}^{z} (z-y)e^{-(z-y)} ye^{-y} \;dy \\ &= e^{-z}\int_{0}^{z} (zy - y^2) \;dy = e^{-z}\left(\frac{z^3}{2} - \frac{z^3}{3}\right) = \frac{z^3}{6}e^{-z}, \quad z \ge 0. \end{aligned}$$

For \(z < 0\), \(f_Z(z) = 0\).

Figure 5.13. [Left] The outer integral goes from 0 to \(z\) because the triangle stops at \(y = z\). [Right] If the triangle is unbounded, then the integral goes from \(-\infty\) to \(\infty\).
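The answer to Practice Exercise 5.11 can be double-checked by evaluating the integral over \([0, z]\) numerically. The following is a rough sketch using a midpoint rule; the function names and the number of grid points `n` are hypothetical.

```python
from math import exp

def f(x):
    """Common PDF of X and Y in the exercise: x e^{-x} for x >= 0, else 0."""
    return x * exp(-x) if x >= 0.0 else 0.0

def f_Z_numeric(z, n=2000):
    """Midpoint-rule approximation of the integral of f(z - y) f(y) dy over [0, z]."""
    if z <= 0.0:
        return 0.0
    dy = z / n
    return sum(f(z - (k + 0.5) * dy) * f((k + 0.5) * dy) for k in range(n)) * dy

def f_Z_closed(z):
    """Closed-form answer from the solution: z^3 e^{-z} / 6 for z >= 0."""
    return z ** 3 * exp(-z) / 6.0 if z >= 0.0 else 0.0

for z in [0.5, 1.0, 2.0, 5.0]:
    print(z, f_Z_numeric(z), f_Z_closed(z))
```

Restricting the grid to \([0, z]\) mirrors the argument about the integration limits: outside that interval at least one of the two factors is zero.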

The functions of two random variables are not limited to summation. The following example illustrates the case of the product of two random variables.

Example 5.23

Let \(X\) and \(Y\) be two independent random variables such that

$$f_X(x) = \begin{cases} 2x, &\quad \mbox{if}\;\; 0 \le x \le 1,\\ 0, &\quad \mbox{otherwise}, \end{cases} \quad\mbox{and}\quad f_Y(y) = \begin{cases} 1, &\quad \mbox{if}\;\; 0 \le y \le 1,\\ 0, &\quad \mbox{otherwise}. \end{cases}$$

Let \(Z = XY\). Find \(f_Z(z)\).

Solution

The CDF of \(Z\) can be evaluated as

$$\begin{aligned} F_Z(z) = \Pb[Z \le z] = \Pb[XY \le z] = \int_{-\infty}^{\infty} \int_{-\infty}^{\frac{z}{y}} f_X(x) f_Y(y) \;dx \;dy. \end{aligned}$$

Taking the derivative yields

$$\begin{aligned} f_Z(z) = \frac{d}{dz} F_Z(z) &= \frac{d}{dz} \int_{-\infty}^{\infty} \int_{-\infty}^{\frac{z}{y}} f_X(x) f_Y(y) \;dx \;dy\\ &\overset{(a)}{=} \int_{-\infty}^{\infty} \frac{1}{y} f_X\left(\frac{z}{y}\right)f_Y(y) \;dy, \end{aligned}$$

where (a) holds by the fundamental theorem of calculus. The upper and lower limits of this integration can be determined by noting that \(f_X(z/y)\) is nonzero only when

$$\begin{aligned} 0 \le \frac{z}{y} = x \le 1, \end{aligned}$$

which implies that \(z \le y\). Since \(f_Y(y)\) is nonzero only when \(0 \le y \le 1\), we have that \(z \le y \le 1\). Therefore, the PDF is

$$\begin{aligned} f_Z(z) &= \int_{z}^{1} \frac{1}{y} f_X\left(\frac{z}{y}\right) f_Y(y) \;dy \\ &= \int_z^1 \frac{2z}{y^2} \;dy = 2z\left(\frac{1}{z} - 1\right) = 2(1-z), \quad 0 < z \le 1. \end{aligned}$$

For \(z \le 0\) or \(z > 1\), \(f_Z(z) = 0\).
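A Monte Carlo simulation offers an independent check of Example 5.23. The sketch below draws \(X\) by inverse-CDF sampling (since \(F_X(x) = x^2\) on \([0,1]\), we can take \(X = \sqrt{U}\) with \(U\) uniform), draws \(Y\) uniformly, and compares the empirical CDF of \(Z = XY\) with \(F_Z(z) = 2z - z^2\), obtained by integrating \(2(1-z)\) over \([0, z]\) for \(0 \le z \le 1\). The seed, the sample size `N`, and the function names are arbitrary choices for illustration.

```python
import random
from math import sqrt

random.seed(0)      # fixed seed so the run is reproducible
N = 200_000         # number of simulated pairs (arbitrary choice)

samples = []
for _ in range(N):
    x = sqrt(random.random())   # inverse-CDF sampling for f_X(x) = 2x on [0, 1]
    y = random.random()         # Y ~ Uniform(0, 1)
    samples.append(x * y)

def empirical_cdf(z):
    """Fraction of simulated products that are at most z."""
    return sum(s <= z for s in samples) / N

def F_Z(z):
    """CDF implied by f_Z(z) = 2(1 - z) on [0, 1]."""
    return 2.0 * z - z * z

for z in [0.1, 0.3, 0.5, 0.8]:
    print(z, empirical_cdf(z), F_Z(z))
```

The two columns should agree to within the Monte Carlo error, which shrinks like \(1/\sqrt{N}\).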

Closing remark. For some families of random variables, the sum of two independent members stays in the same family (with different parameters). For other families, the sum belongs to a different family. Table 5.1 summarizes some of the most commonly used pairs; in every row, \(X_1\) and \(X_2\) are assumed to be independent.

| \(X_1\) | \(X_2\) | Sum \(X_1 + X_2\) |
|---|---|---|
| \(\text{Bernoulli}(p)\) | \(\text{Bernoulli}(p)\) | \(\text{Binomial}(2,p)\) |
| \(\text{Binomial}(n,p)\) | \(\text{Binomial}(m,p)\) | \(\text{Binomial}(m+n,p)\) |
| \(\text{Poisson}(\lambda_1)\) | \(\text{Poisson}(\lambda_2)\) | \(\text{Poisson}(\lambda_1+\lambda_2)\) |
| \(\text{Exponential}(\lambda)\) | \(\text{Exponential}(\lambda)\) | \(\text{Erlang}(2,\lambda)\) |
| \(\text{Gaussian}(\mu_1,\sigma_1^2)\) | \(\text{Gaussian}(\mu_2,\sigma_2^2)\) | \(\text{Gaussian}(\mu_1+\mu_2,\sigma_1^2+\sigma_2^2)\) |

Table 5.1. Common distributions of the sum of two random variables.
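The discrete rows of Table 5.1 can be spot-checked by convolving PMFs directly. The sketch below checks the Bernoulli and Binomial rows for one arbitrary choice of \(n\), \(m\), and \(p\), assuming independence; the helper names are hypothetical, and \(\text{Bernoulli}(p)\) is treated as \(\text{Binomial}(1,p)\).

```python
from math import comb

def binomial_pmf(n, p):
    """List of P[X = k] for k = 0, ..., n, where X ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def convolve_pmf(a, b):
    """PMF of the sum of two independent variables with PMFs a and b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

n, m, p = 3, 5, 0.3   # arbitrary values chosen for illustration

# Binomial(n, p) + Binomial(m, p) versus Binomial(n + m, p).
lhs = convolve_pmf(binomial_pmf(n, p), binomial_pmf(m, p))
rhs = binomial_pmf(n + m, p)
max_err = max(abs(u - v) for u, v in zip(lhs, rhs))

# Bernoulli(p) + Bernoulli(p) versus Binomial(2, p).
bern_sum = convolve_pmf(binomial_pmf(1, p), binomial_pmf(1, p))
bern_err = max(abs(u - v) for u, v in zip(bern_sum, binomial_pmf(2, p)))

print(max_err, bern_err)
```

Note that both binomials must share the same \(p\) for the sum to remain binomial, which is why the table uses a common \(p\) in those rows.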