Probability for Data Science
Section 4.7

Functions of Random Variables

A common question in practice is how a random variable behaves under a transformation. The question can be summarized as follows: Given a random variable \(X\) with PDF \(f_X(x)\) and CDF \(F_X(x)\), and supposing that \(Y = g(X)\) for some function \(g\), what are \(f_Y(y)\) and \(F_Y(y)\)? For example, we measure the voltage \(V\), and we want to analyze the power \(P = V^2/R\). This involves taking the square of a random variable. Another example: We know the distribution of the phase \(\Theta\), but we want to analyze the signal \(\cos(\omega t + \Theta)\). This involves a cosine transformation. How do we convert the distribution of one variable into that of another? Answering this question is the goal of this section.

4.7.1 General principle

We will first outline the general principle for tackling this type of problem. In the following subsection, we will give a few concrete examples.

Suppose we are given a random variable \(X\) with PDF \(f_X(x)\) and CDF \(F_X(x)\). Let \(Y = g(X)\) for some known and fixed function \(g\). For simplicity, we assume that \(g\) is strictly increasing, so that the inverse \(g^{-1}\) exists and is also increasing. In this case, the CDF of \(Y\) can be determined as follows.

$$\begin{aligned} F_Y(y) \overset{(a)}{=} \Pb[Y \le y] &\overset{(b)}{=} \Pb[g(X) \le y]\\ &\overset{(c)}{=} \Pb[X \le g^{-1}(y)] \\ &\overset{(d)}{=} F_X(g^{-1}(y)). \end{aligned}$$

Each step is straightforward. Step (a) is the definition of the CDF. Step (b) substitutes \(g(X)\) for \(Y\). Step (c) uses the fact that \(g\) is strictly increasing: it is invertible and its inverse preserves order, so applying \(g^{-1}\) to both sides of \(g(X)\le y\) yields \(X \le g^{-1}(y)\). Step (d) is again the definition of the CDF, this time applied as \(\Pb[X \le \clubsuit] = F_X(\clubsuit)\) with \(\clubsuit = g^{-1}(y)\).

It is useful to visualize the situation, as in Figure 4.34. Here, we consider a uniformly distributed \(X\) so that the CDF \(F_X(x)\) is a straight line. Samples drawn according to \(F_X\) are equally likely to fall anywhere in the range, as illustrated by the evenly spaced yellow dots on the \(x\)-axis. As we transform the \(X\)'s through \(Y = g(X)\), we increase or decrease the spacing between neighboring samples. Therefore, some samples become more concentrated while others become less concentrated. The distribution of these transformed samples (the yellow dots on the \(y\)-axis) forms a new CDF \(F_Y(y)\). The result \(F_Y(y) = F_X(g^{-1}(y))\) describes this picture from the viewpoint of \(Y\): to evaluate the CDF at a value \(y\), we map \(y\) back through \(g^{-1}\) to the corresponding \(x\) and read off \(F_X\) there. This is why \(g^{-1}\) appears in the formula.

Figure 4.34. When transforming a random variable \(X\) to \(Y = g(X)\), the distributions are defined according to the spacing between samples. In this figure, a uniformly distributed \(X\) will become squeezed by some parts of \(g\) and widened in other parts of \(g\).
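
The squeezing and stretching can be mimicked with a small counting experiment. The sketch below is only an illustration under our own assumptions: it uses \(g(x) = x^2\) on \([0,1]\), which is not necessarily the \(g\) drawn in Figure 4.34, and it replaces random samples with evenly spaced points so that the result is deterministic. The helper name `count_per_bin` is hypothetical.

```python
# Assumption: g(x) = x**2 on [0, 1] is chosen for illustration only.
def g(x):
    return x * x


def count_per_bin(values, num_bins):
    """Count how many values in [0, 1] fall into each of num_bins equal-width bins."""
    counts = [0] * num_bins
    for v in values:
        index = min(int(v * num_bins), num_bins - 1)  # keep v == 1.0 in the last bin
        counts[index] += 1
    return counts


n = 1000
xs = [(i + 0.5) / n for i in range(n)]  # evenly spaced stand-ins for uniform samples
ys = [g(x) for x in xs]                 # transformed samples Y = g(X)

x_counts = count_per_bin(xs, 4)
y_counts = count_per_bin(ys, 4)
print("x counts per bin:", x_counts)
print("y counts per bin:", y_counts)
```

The \(x\)-counts are equal across bins, whereas the \(y\)-counts are largest in the first bin. This \(g\) is flat near \(0\), so many samples are squeezed into small values of \(y\), and it is steep near \(1\), so the samples there are spread apart.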

Why should we use the CDF and not the PDF in Figure 4.34? The advantage of the CDF is that it is always a nondecreasing function. Therefore, no matter what the function \(g\) is, both the input CDF \(F_X\) and the output CDF \(F_Y\) are nondecreasing. If we use the PDF, then the non-monotonic behavior of the PDF will interact with the nonlinear function \(g\), and it becomes much harder to decouple the two.

To obtain the PDF, we first write \(F_X(g^{-1}(y))\) as an integral. By the definition of the CDF,

$$F_X(g^{-1}(y)) = \int_{-\infty}^{g^{-1}(y)} f_X(x') \;dx',$$

and hence, by the fundamental theorem of calculus, we have

$$\begin{aligned} f_Y(y) = \frac{d}{dy} F_Y(y) = \frac{d}{dy} F_X(g^{-1}(y)) &= \frac{d}{dy} \int_{-\infty}^{g^{-1}(y)} f_X(x') \;dx' \\ &= \left(\frac{d \; g^{-1}(y)}{dy} \right) \cdot f_X(g^{-1}(y)), \end{aligned}$$

where the last step also uses the chain rule, because the upper limit \(g^{-1}(y)\) depends on \(y\). Based on this line of reasoning, we can summarize a “recipe” for this problem.

How to find the PDF of \(Y = g(X)\)
  • Step 1: Find the CDF \(F_Y(y)\), which is \(F_Y(y) = F_X(g^{-1}(y))\).
  • Step 2: Find the PDF \(f_Y(y)\), which is \(f_Y(y) = \left(\frac{d \; g^{-1}(y)}{dy} \right) \cdot f_X(g^{-1}(y))\).
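
The two steps can be checked numerically. The following is a minimal sketch under assumptions that are ours, not the text's: \(X\) is a standard Gaussian and \(g(x) = e^x\), which is strictly increasing with \(g^{-1}(y) = \log y\). All function names are hypothetical.

```python
import math
import random


def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


# Assumed transformation: g(x) = exp(x), so g_inv(y) = log(y) for y > 0.
def g(x):
    return math.exp(x)


def g_inv(y):
    return math.log(y)


def g_inv_derivative(y):
    return 1.0 / y


def cdf_y(y):
    """Step 1: F_Y(y) = F_X(g_inv(y))."""
    return std_normal_cdf(g_inv(y))


def pdf_y(y):
    """Step 2: f_Y(y) = (d g_inv / dy) * f_X(g_inv(y))."""
    return g_inv_derivative(y) * std_normal_pdf(g_inv(y))


def empirical_cdf(samples, y):
    return sum(1 for s in samples if s <= y) / len(samples)


def numerical_slope(f, y, h=1e-5):
    """Central-difference estimate of the derivative of f at y."""
    return (f(y + h) - f(y - h)) / (2.0 * h)


rng = random.Random(0)
y_samples = [g(rng.gauss(0.0, 1.0)) for _ in range(100_000)]

for y in (0.5, 1.0, 2.0):
    print(y, empirical_cdf(y_samples, y), cdf_y(y),
          numerical_slope(cdf_y, y), pdf_y(y))
```

If the recipe has been applied correctly, the empirical CDF should agree with `cdf_y` up to sampling error, and the numerical slope of `cdf_y` should agree with `pdf_y`.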

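This recipe, as written, works when \(g\) is strictly increasing. If \(g\) is strictly decreasing, the inequality in step (c) reverses, so that \(F_Y(y) = 1 - F_X(g^{-1}(y))\) and \(f_Y(y) = \left|\frac{d \; g^{-1}(y)}{dy}\right| \cdot f_X(g^{-1}(y))\). If \(g\) is not one-to-one, e.g., \(g(x) = x^2\), for which \(g^{-1}(y) = \pm \sqrt{y}\), then neither form applies directly. In that case, instead of writing \(X \le g^{-1}(y)\), we need to determine the set \(\{x \;|\; g(x) \le y\}\) and compute its probability.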
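
To see numerically why the direction of monotonicity matters, the sketch below uses a strictly decreasing transformation. The choices are assumptions made for illustration: \(X\) is exponential with rate 1 and \(g(x) = e^{-x}\), so \(g^{-1}(y) = -\log y\) for \(0 < y < 1\). Plugging into \(F_X(g^{-1}(y))\) without reversing the inequality gives \(1 - y\), while the samples follow \(F_Y(y) = y\).

```python
import math
import random


def cdf_x(x):
    """CDF of an Exponential(1) random variable (assumed distribution)."""
    return 1.0 - math.exp(-x) if x >= 0.0 else 0.0


# Assumed decreasing transformation g(x) = exp(-x).
def g(x):
    return math.exp(-x)


def g_inv(y):
    return -math.log(y)


def naive_cdf_y(y):
    """Increasing-g formula applied blindly: F_X(g_inv(y))."""
    return cdf_x(g_inv(y))


def flipped_cdf_y(y):
    """Decreasing-g version: P[X >= g_inv(y)] = 1 - F_X(g_inv(y))."""
    return 1.0 - cdf_x(g_inv(y))


def empirical_cdf(samples, y):
    return sum(1 for s in samples if s <= y) / len(samples)


rng = random.Random(1)
y_samples = [g(rng.expovariate(1.0)) for _ in range(100_000)]

for y in (0.1, 0.3, 0.8):
    print(y, empirical_cdf(y_samples, y), flipped_cdf_y(y), naive_cdf_y(y))
```

In this experiment the empirical CDF should track `flipped_cdf_y` and disagree with `naive_cdf_y`, which is a quick sanity check to run before trusting a derived formula.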

4.7.2 Examples

Example 4.26

(Linear transform) Let \(X\) be a random variable with PDF \(f_X(x)\) and CDF \(F_X(x)\). Let \(Y = 2X + 3\). Find \(f_Y(y)\) and \(F_Y(y)\). Express the answers in terms of \(f_X(x)\) and \(F_X(x)\).

Solution

We first note that

$$\begin{aligned} F_Y(y) &= \Pb[Y \le y] \\ &= \Pb[2X + 3 \le y] \\ &= \Pb\left[X \le \frac{y-3}{2}\right] = F_X\left(\frac{y-3}{2}\right). \end{aligned}$$

Therefore, the PDF is

$$\begin{aligned} f_Y(y) &= \frac{d}{dy} F_Y(y) \\ &= \frac{d}{dy} F_X\left( \frac{y-3}{2} \right) \\ &= F_X'\left(\frac{y-3}{2}\right)\frac{d}{dy} \left(\frac{y-3}{2}\right) = \frac{1}{2}f_X\left(\frac{y-3}{2}\right). \end{aligned}$$

Follow-Up. (Linear transformation of a Gaussian random variable). Suppose \(X\) is a Gaussian random variable with zero mean and unit variance, and let \(Y = aX + b\) with \(a > 0\). Then the CDF and PDF of \(Y\) are respectively

$$\begin{aligned} F_Y(y) &= F_X\left(\frac{y-b}{a}\right) = \Phi\left(\frac{y-b}{a}\right),\\ f_Y(y) &= \frac{1}{a}f_X\left(\frac{y-b}{a}\right) = \frac{1}{\sqrt{2\pi}a}e^{-\frac{(y-b)^2}{2a^2}}. \end{aligned}$$

Follow-Up. (Linear transformation of an exponential random variable). Suppose \(X\) is an exponential random variable with parameter \(\lambda\), and let \(Y = aX + b\) with \(a > 0\). Then the CDF and PDF of \(Y\) are respectively

$$\begin{aligned} F_Y(y) &= F_X\left(\frac{y-b}{a}\right) \\ &= 1-e^{- \frac{\lambda}{a}(y-b)}, \qquad y \ge b,\\ f_Y(y) &= \frac{1}{a}f_X\left(\frac{y-b}{a}\right) \\ &= \frac{\lambda }{a}e^{- \frac{\lambda}{a}(y-b)}, \qquad y \ge b. \end{aligned}$$
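
A quick way to catch algebra slips in results like these is to differentiate the CDF numerically and compare it with the stated PDF. The sketch below does this for both follow-ups. The values \(a = 2\), \(b = 3\), \(\lambda = 1.5\) and all function names are hypothetical choices for illustration.

```python
import math


def gaussian_cdf_y(y, a, b):
    """Phi((y - b) / a) written with the error function."""
    return 0.5 * (1.0 + math.erf((y - b) / (a * math.sqrt(2.0))))


def gaussian_pdf_y(y, a, b):
    return math.exp(-((y - b) ** 2) / (2.0 * a * a)) / (math.sqrt(2.0 * math.pi) * a)


def exponential_cdf_y(y, a, b, lam):
    return 1.0 - math.exp(-(lam / a) * (y - b)) if y >= b else 0.0


def exponential_pdf_y(y, a, b, lam):
    return (lam / a) * math.exp(-(lam / a) * (y - b)) if y >= b else 0.0


def central_difference(f, y, h=1e-5):
    """Numerical derivative of f at y."""
    return (f(y + h) - f(y - h)) / (2.0 * h)


a, b, lam = 2.0, 3.0, 1.5  # assumed values with a > 0

for y in (3.5, 4.0, 6.0):
    gauss_slope = central_difference(lambda t: gaussian_cdf_y(t, a, b), y)
    expo_slope = central_difference(lambda t: exponential_cdf_y(t, a, b, lam), y)
    print(y, gauss_slope, gaussian_pdf_y(y, a, b),
          expo_slope, exponential_pdf_y(y, a, b, lam))
```

For each \(y\), the numerical slope and the closed-form PDF should agree to several decimal places.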

Example 4.27

Let \(X\) be a random variable with PDF \(f_X(x)\) and CDF \(F_X(x)\). Supposing that \(Y = X^2\), find \(f_Y(y)\) and \(F_Y(y)\). Express the answers in terms of \(f_X(x)\) and \(F_X(x)\).

Solution

We note that \(F_Y(y) = 0\) for \(y < 0\) because \(Y = X^2\) is never negative. For \(y \ge 0\),

$$\begin{aligned} F_Y(y) = \Pb[Y \le y] = \Pb[X^2 \le y] &= \Pb[-\sqrt{y} \le X \le \sqrt{y}] \\ &= F_X(\sqrt{y}) - F_X(-\sqrt{y}). \end{aligned}$$

Therefore, for \(y > 0\), the PDF is

$$\begin{aligned} f_Y(y) &= \frac{d}{dy} F_Y(y) \\ &= \frac{d}{dy} \left(F_X(\sqrt{y}) - F_X(-\sqrt{y})\right) \\ &= F_X'(\sqrt{y}) \frac{d}{dy} \sqrt{y} - F_X'(-\sqrt{y}) \frac{d}{dy} \left(-\sqrt{y}\right)\\ &= \frac{1}{2\sqrt{y}}\left( f_X(\sqrt{y}) + f_X(-\sqrt{y})\right). \end{aligned}$$
Figure 4.35.

Follow-Up. (Square of a uniform random variable) Suppose \(X\) is a uniform random variable in \([a,b]\) (assume \(a > 0\)), and let \(Y = X^2\). Then the CDF and PDF of \(Y\) are respectively

$$\begin{aligned} F_Y(y) &= \frac{\sqrt{y}-a}{b-a}, \qquad a^2 \le y \le b^2,\\ f_Y(y) &= \frac{1}{2\sqrt{y}(b-a)}, \qquad a^2 \le y \le b^2. \end{aligned}$$
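
These expressions can be compared against simulated samples. The sketch below assumes the hypothetical interval \([a,b] = [1,3]\), so that \(Y\) takes values in \([1,9]\). It estimates the density by counting samples in a small window around \(y\), which is a crude estimator chosen here for simplicity.

```python
import math
import random

A, B = 1.0, 3.0  # assumed interval endpoints with 0 < A < B


def cdf_y(y):
    if y < A * A:
        return 0.0
    if y > B * B:
        return 1.0
    return (math.sqrt(y) - A) / (B - A)


def pdf_y(y):
    if A * A <= y <= B * B:
        return 1.0 / (2.0 * math.sqrt(y) * (B - A))
    return 0.0


def empirical_cdf(samples, y):
    return sum(1 for s in samples if s <= y) / len(samples)


def empirical_density(samples, y, half_width=0.25):
    """Fraction of samples within half_width of y, divided by the window length."""
    inside = sum(1 for s in samples if y - half_width < s <= y + half_width)
    return inside / (len(samples) * 2.0 * half_width)


rng = random.Random(2)
y_samples = [rng.uniform(A, B) ** 2 for _ in range(200_000)]

for y in (2.0, 4.0, 7.0):
    print(y, empirical_cdf(y_samples, y), cdf_y(y),
          empirical_density(y_samples, y), pdf_y(y))
```

The density estimate decreases as \(y\) grows, consistent with the \(1/\sqrt{y}\) factor: squaring stretches large values of \(X\) more than small ones.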

Example 4.28

Let \(X \sim \mathrm{Uniform}(0,2\pi)\). Suppose \(Y = \cos X\). Find \(f_Y(y)\) and \(F_Y(y)\).

Solution

First, we need to find the CDF of \(X\). For \(0 \le x \le 2\pi\), this can be done by noting that

$$\begin{aligned} F_X(x) = \int_{-\infty}^{x} f_X(x') \;dx' = \int_{0}^{x} \frac{1}{2\pi} \;dx' = \frac{x}{2\pi}. \end{aligned}$$

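Thus, for \(-1 \le y \le 1\), and with \(\cos^{-1} y \in [0,\pi]\) denoting the principal value of the inverse cosine, the CDF of \(Y\) is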

$$\begin{aligned} F_Y(y) &= \Pb[Y \le y] = \Pb[\cos X \le y] \\ &= \Pb[ \cos^{-1} y \le X \le 2\pi - \cos^{-1} y ] \\ &= F_X(2\pi - \cos^{-1}y ) - F_X(\cos^{-1}y) \\ &= 1 - \frac{\cos^{-1}y}{\pi}. \end{aligned}$$

The PDF of \(Y\), for \(-1 < y < 1\), is

$$\begin{aligned} f_Y(y) &= \frac{d}{dy} F_Y(y) = \frac{d}{dy}\left(1 - \frac{\cos^{-1}y}{\pi} \right) \\ &= \frac{1}{\pi\sqrt{1-y^2}}, \end{aligned}$$

where we used the fact that \(\frac{d}{dy}\cos^{-1}y = \frac{-1}{\sqrt{1-y^2}}\).
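
The cosine result lends itself to a simulation check, because drawing \(X \sim \mathrm{Uniform}(0,2\pi)\) and applying the cosine takes one line. The following is a minimal sketch; the function names and the sample size are our own choices.

```python
import math
import random


def cdf_y(y):
    """F_Y(y) = 1 - arccos(y) / pi on [-1, 1], as derived above."""
    if y < -1.0:
        return 0.0
    if y > 1.0:
        return 1.0
    return 1.0 - math.acos(y) / math.pi


def pdf_y(y):
    """f_Y(y) = 1 / (pi * sqrt(1 - y^2)) on (-1, 1)."""
    if -1.0 < y < 1.0:
        return 1.0 / (math.pi * math.sqrt(1.0 - y * y))
    return 0.0


def empirical_cdf(samples, y):
    return sum(1 for s in samples if s <= y) / len(samples)


rng = random.Random(3)
y_samples = [math.cos(rng.uniform(0.0, 2.0 * math.pi)) for _ in range(100_000)]

for y in (-0.9, 0.0, 0.5, 0.9):
    print(y, empirical_cdf(y_samples, y), cdf_y(y), pdf_y(y))
```

The PDF values are larger near \(y = \pm 1\) than near \(y = 0\): the cosine is flat around its peaks and troughs, so many values of \(X\) are squeezed into values of \(Y\) close to \(\pm 1\).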

Example 4.29

Let \(a > 0\) be a constant, and let \(X\) be a random variable with PDF

$$\begin{aligned} f_X(x) = ae^{x}e^{-ae^x}. \end{aligned}$$

Let \(Y = e^X\), and find \(f_Y(y)\).

Solution

We first note that

$$\begin{aligned} F_Y(y) = \Pb[Y \le y] &= \Pb[e^X \le y] \\ &= \Pb[X \le \log y] = \int_{-\infty}^{\log y} a e^x e^{-ae^x} \;dx. \end{aligned}$$

To find the PDF, we recall the fundamental theorem of calculus. This gives us

$$\begin{aligned} f_Y(y) &= \frac{d}{dy} \int_{-\infty}^{\log y} a e^x e^{-ae^x} \;dx \\ &= \left(\frac{d}{dy}\log y\right)\left(\frac{d}{d\log y} \int_{-\infty}^{\log y} a e^x e^{-ae^x} \;dx \right)\\ &= \frac{1}{y} ae^{\log y}e^{-ae^{\log y}} = ae^{-ay}. \end{aligned}$$
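
The result says that \(Y\) has the exponential PDF \(ae^{-ay}\) for \(y > 0\), whose CDF is \(1 - e^{-ay}\) (the same value follows from the integral above by substituting \(u = e^x\)). The sketch below checks this by integrating \(f_X\) numerically up to \(\log y\). The value \(a = 2\), the truncation of the lower limit at \(-30\), and the midpoint rule are all assumptions made for illustration.

```python
import math

a = 2.0  # assumed value of the constant a > 0


def pdf_x(x):
    return a * math.exp(x) * math.exp(-a * math.exp(x))


def cdf_y_numeric(y, lower=-30.0, steps=20_000):
    """Midpoint-rule approximation of the integral of pdf_x from lower to log(y)."""
    upper = math.log(y)
    width = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        midpoint = lower + (k + 0.5) * width
        total += pdf_x(midpoint)
    return total * width


def cdf_y_closed_form(y):
    return 1.0 - math.exp(-a * y)


def pdf_y_from_recipe(y):
    """(d/dy log y) * f_X(log y), before simplification."""
    return (1.0 / y) * pdf_x(math.log(y))


def pdf_y_closed_form(y):
    return a * math.exp(-a * y)


for y in (0.1, 0.5, 2.0):
    print(y, cdf_y_numeric(y), cdf_y_closed_form(y),
          pdf_y_from_recipe(y), pdf_y_closed_form(y))
```

The two CDF columns should agree up to the quadrature error, and the two PDF columns up to floating-point rounding.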

Closing remark. The transformation of random variables is a fundamental technique in data science. The approach we have presented is the most rudimentary yet the most intuitive. The key is to visualize the transformation and how the random samples are allocated after the transformation. Note that the density of the random samples is related to the slope of the CDF. Therefore, if the transformation maps many samples to similar values, the slope of the CDF will be steep. Once you understand this picture, the transformation will be a lot easier to understand.

Is it possible to replace the paper-and-pencil derivation of a transformation with a computer? If the objective is to transform random realizations, then the answer is yes because your goal is to transform numbers to numbers, which can be done on a computer. For example, transforming a sample \(x_1\) to \(\sqrt{x_1}\) is straightforward on a computer. However, if the objective is to derive the theoretical expression of the PDF, then the answer is no. Why might we want to derive the theoretical PDF? We might want to analyze the mean, variance, or other statistical properties. We may also want to reverse-engineer and determine a transformation that can yield a specific PDF. This would require a paper-and-pencil derivation. In what follows, we will discuss a handy application of the transformations.
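
Transforming realizations is indeed only a few lines of code. The sketch below assumes \(X \sim \mathrm{Uniform}(0,1)\), which is our choice and not something fixed by the text, and applies the square-root transformation mentioned above. The sample mean and variance only approximate the exact values \(\E[\sqrt{X}] = 2/3\) and \(\mathrm{Var}[\sqrt{X}] = 1/18\), which come from integrating by hand.

```python
import math
import random
import statistics

rng = random.Random(4)

# Assumed input distribution: X ~ Uniform(0, 1).
x_samples = [rng.random() for _ in range(100_000)]

# Transform numbers to numbers: y_i = sqrt(x_i).
y_samples = [math.sqrt(x) for x in x_samples]

sample_mean = statistics.fmean(y_samples)
sample_variance = sum((y - sample_mean) ** 2 for y in y_samples) / len(y_samples)

print("sample mean    :", sample_mean, " exact:", 2.0 / 3.0)
print("sample variance:", sample_variance, " exact:", 1.0 / 18.0)
```

The simulation yields estimates that fluctuate with the random seed, whereas the derivation yields the exact values, which illustrates the distinction drawn in the paragraph above.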

What are the rules of thumb for transformation of random variables?
  • Always find the CDF \(F_Y(y) = \Pb[g(X)\le y]\). Ask yourself: What are the values of \(X\) such that \(g(X) \le y\)? Think of the cosine example.
  • Sometimes you do not need to solve for \(F_Y(y)\) explicitly. The fundamental theorem of calculus can help you find \(f_Y(y)\).
  • Draw pictures. Ask yourself whether you need to squeeze or stretch the samples.