Infinite Series
Imagine that you have a fair coin. If you get a tail, you flip it again. You do this repeatedly until you finally get a head. What is the probability that you need to flip the coin three times to get one head?
This is a warm-up exercise. Since the coin is fair, the probability of obtaining a head is \(\frac{1}{2}\). The probability of getting a tail followed by a head is \(\frac{1}{2} \times \frac{1}{2} = \frac{1}{4}\). Similarly, the probability of getting two tails and then a head is \(\frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{8}\). If you follow this logic, you can write down the probabilities for all other cases. For your convenience, we have drawn the first few in Figure 1.1. As you have probably noticed, the probabilities follow the pattern \(\{\frac{1}{2},\frac{1}{4},\frac{1}{8},\ldots\}\).

We can also summarize these probabilities using a familiar plot called the histogram as shown in Figure 1.2. The histogram for this problem has a special pattern: each bar is half the height of the preceding one, and the sequence of bars extends infinitely.

Let us ask something harder: On average, if you want to be 90% sure that you will get a head, what is the minimum number of attempts you need to try? Five attempts? Ten attempts? Indeed, if you try ten attempts, you will very likely accomplish your goal. However, this would seem to be overkill. If you try five attempts, then it becomes unclear whether you will be 90% sure.
This problem can be answered by analyzing the sequence of probabilities. If we make two attempts, then the probability of getting a head is the sum of the probabilities for one attempt and that of two attempts:
Therefore, if you make 3 attempts or 4 attempts, you get the following probabilities:
So if we try four attempts, we will have a 93.75% probability of getting a head. Thus, four attempts is the answer.
The MATLAB / Python codes we used to generate Figure 1.2 are shown below.
% MATLAB code to generate a geometric sequence
p = 1/2;
n = 1:10;
X = p.^n;
bar(n,X,'FaceColor',[0.8, 0.2,0.2]);# Python code to generate a geometric sequence
import numpy as np
import matplotlib.pyplot as plt
p = 1/2
n = np.arange(1,10)
X = np.power(p,n)
plt.bar(n,X); plt.show()This warm-up exercise has perhaps raised some of your interest in the subject. However, we will not tell you everything now. We will come back to this in Chapter 3 when we discuss geometric random variables. In the present section, we want to make sure you have the basic mathematical tools to calculate quantities, such as a sum of fractional numbers. For example, what if we want to calculate \(\Pb[\text{success after 107 attempts}]\)? Is there a systematic way of performing the calculation?
Remark. You should be aware that the 93.75% only says that the probability of achieving the goal is high. If you have a bad day, you may still need more than four attempts. Therefore, when we stated the question, we asked for 90% “on average”. Sometimes you may need more attempts and sometimes fewer, but you have a 93.75% probability of success.
1.1.1Geometric Series
A geometric series is the sum of a finite or an infinite sequence of numbers with a constant ratio between successive terms. As we have seen in the previous example, a geometric series appears naturally in the context of discrete events. In Chapter 3 of this book, we will use geometric series when calculating the expectation and moments of a random variable.
Let \(0 < r < 1\). A finite geometric sequence of power \(n\) is a sequence of numbers $$\bigg\{1,r,r^2,\ldots,r^n\bigg\}.$$ An infinite geometric sequence is a sequence of numbers $$\bigg\{1,r,r^2,r^3,\ldots\bigg\}.$$
The sum of a finite geometric series of power \(n\) is
Proof. We multiply both sides by \(1-r\). The left-hand side becomes
where \((a)\) follows because consecutive terms cancel telescopically.
■A corollary of Eq. (1.1) gives the sum of an infinite geometric series.
Let \(0 < r < 1\). The sum of an infinite geometric series is
Proof. We take the limit in Eq. (1.1). This yields
Remark. Note that the condition \(0 < r < 1\) is important. If \(r > 1\), then \(r^{n+1} \to \infty\) as \(n \to \infty\), so the series diverges. The constant \(r\) cannot equal \(1\), for otherwise the fraction \((1-r^{n+1})/(1-r)\) is undefined. We are not interested in the case when \(r = 0\), because the sum is trivially 1: \(\sum_{k=0}^{\infty} 0^k = 1 + 0^1 + 0^2 + \cdots = 1\).
Compute the infinite series \(\sum\limits_{k=2}^\infty \frac{1}{2^k}\).
Remark. You should not confuse a geometric series with a harmonic series. A harmonic series concerns the sum of \(\{1,\frac{1}{2},\frac{1}{3},\frac{1}{4},\ldots\}\). It turns out that (This result can be found in Tom Apostol, Mathematical Analysis, 2nd Edition, Theorem 8.11.)
On the other hand, a squared harmonic series \(\{1,\frac{1}{2^2},\frac{1}{3^2},\frac{1}{4^2},\ldots\}\) converges:
The latter result is known as the Basel problem.
We can extend the main theorem by considering more complicated series, for example, the following.
Let \(0 < r < 1\). It holds that
Proof. Take the derivative of both sides of Eq. (1.2). The left-hand side becomes
The right-hand side becomes \(\displaystyle \frac{d}{dr} \left(\frac{1}{1-r}\right) = \frac{1}{(1-r)^2}\).
■Compute the infinite sum \(\sum_{k=1}^{\infty} k \cdot \frac{1}{3^k}\).
We can use the derivative result:
1.1.2Binomial Series
A geometric series is useful when handling situations such as \(N-1\) failures followed by a success. However, we can easily twist the problem by asking: What is the probability of getting one head out of 3 independent coin tosses? In this case, the probability can be determined by enumerating all possible cases:
Figure 1.3 illustrates the situation.

What lessons have we learned in this example? Notice that you need to enumerate all possible combinations of one head and two tails to solve this problem. The number is 3 in our example. In general, the number of combinations can be systematically studied using combinatorics, which we will discuss later in the chapter. However, the number of combinations motivates us to discuss another background technique known as the binomial series. The binomial series is instrumental in algebra when handling polynomials such as \((a+b)^2\) or \((1+x)^3\). It provides a valuable formula when computing these powers.
For any real numbers \(a\) and \(b\) and any non-negative integer \(n\), the binomial expansion of power \(n\) is
where \({n \choose k} = \frac{n!}{k!(n-k)!}\).
The binomial theorem is valid for any real numbers \(a\) and \(b\) and any non-negative integer \(n\). The quantity \({n \choose k}\) reads as “\(n\) choose \(k\)”. Its definition is $${n \choose k} \bydef \frac{n!}{k!(n-k)!},$$ where \(n! = n(n-1)(n-2)\cdots 3 \cdot 2 \cdot 1\). We shall discuss the physical meaning of \({n \choose k}\) in Section sec: ch1 combination. But we can quickly plug in the “\(n\) choose \(k\)” into the coin flipping example by letting \(n = 3\) and \(k = 1\):
So you can see why we want you to spend your precious time learning about the binomial theorem. In MATLAB and Python, \({n \choose k}\) can be computed using the commands as follows.
% MATLAB code to compute (N choose K) and K!
n = 10;
k = 2;
nchoosek(n,k)
factorial(k)# Python code to compute (N choose K) and K!
from scipy.special import comb, factorial
n = 10
k = 2
comb(n, k)
factorial(k)The binomial theorem makes the most sense when we also learn about Pascal's identity.
Let \(n\) and \(k\) be positive integers such that \(k \le n\). Then,
Proof. We start by recalling the definition of \({n \choose k}\). This gives us
where we factor out \(n!\) to obtain the second equation. Next, we observe that
Substituting into the previous equation we obtain
Pascal's triangle is a visualization of the coefficients of \((a+b)^n\) as shown in Figure 1.4. For example, when \(n = 5\), we know that \({5 \choose 3} = 10\). However, by Pascal's identity, we know that \({5 \choose 3} = {4 \choose 2} + {4 \choose 3}\). So the number 10 is actually obtained by summing the numbers 4 and 6 of the previous row.

Find \((1+x)^3\).
Using the binomial theorem, we can show that
Let \(0 < p <1\). Find $$\displaystyle\sum_{k=0}^n {n \choose k} p^{n-k}(1-p)^k.$$
By using the binomial theorem, we have
This result will be helpful when evaluating binomial random variables in Chapter 3.
We now prove the binomial theorem. Please feel free to skip the proof if this is your first time reading the book.
Proof of the binomial theorem. We prove by induction. When \(n = 1\),
Therefore, the base case is verified. Assume up to case \(n\). We need to verify case \(n+1\).
We want to apply Pascal's identity to combine the two terms. In order to do so, we note that the second term in this sum can be rewritten as
The first term in the sum can be written as
where the \(k=0\) term is separated as \(a^{n+1}\). Therefore, the two terms can be combined using Pascal's identity to yield
Hence, the (\(n+1\))th case is also verified. By the principle of mathematical induction, we have completed the proof.
■The end of the proof. Please join us again.