Power Spectral Density
Beginning with this section we are going to focus on WSS processes. By WSS, we mean that the autocorrelation function \(R_X(t_1,t_2)\) has a Toeplitz structure. Putting it in other words, we assume \(R_X(t_1,t_2)\) can be simplified to \(R_X(\tau)\), where \(\tau = t_1 - t_2\). We call this property time invariance.
10.4.1Basic concepts
Assuming that \(R_X(\tau)\) is square integrable, i.e., \(\int_{-\infty}^{\infty} R_X(\tau)^2\; d\tau < \infty\), we can now define the Fourier transform of \(R_X(\tau)\) which is called the power spectral density.
The power spectral density \(S_X(\omega)\) of a WSS process is
assuming that \(\int_{-\infty}^{\infty} R_X(\tau)^2\; d\tau < \infty\) so that the Fourier transform of \(R_X(\tau)\) exists.
Let \(R_X(\tau)=e^{-2\alpha |\tau|}\). Find \(S_X(\omega)\).
Using the Fourier transform table,
Figure 10.17 shows \(R_X(\tau)\) and \(S_X(\omega)\).


Why is Theorem thm: ch10 Wie a theorem rather than a definition? This is because power spectral density has its definition. There is no way that you can get any “power” information merely by looking at the Fourier transform of \(R_X(\tau)\). We will discuss the origin of the power spectral density later, but for now, we only need to know that \(S_X(\omega)\) is the Fourier transform of \(R_X(\tau)\).
Remark. The power spectral density is defined for WSS processes. If the process is not WSS, then \(R_X\) will be a 2D function instead of a 1D function of \(\tau\), so we cannot take the Fourier transform in \(\tau\). We will discuss this in detail shortly.
Let \(X(t)=a\cos(\omega_0 t+\Theta), \; \Theta\sim\) Uniform\([0,2\pi]\). Find \(S_X(\omega)\).
We know that the autocorrelation function is
By taking the Fourier transform of both sides, we have
The result is shown in Figure 10.18.


Let \(S_X(\omega)=\frac{N_0}{2} \textrm{rect}(\frac{\omega}{2W})\). Find \(R_X(\tau)\).
Since \(S_X(\omega) = \calF(R_X(\tau))\), the inverse holds:
This example shows what we call the bandlimited white noise. The power spectral density \(S_X(\omega)\) is uniform, meaning that it covers all frequencies (or wavelengths in optics). It is called “white noise” because white light is essentially a mixture of all wavelengths.
The bandwidth of the power spectral density \(W\) defines the zero crossings of \(R_X(\tau)\). It is easy to show that when \(W \rightarrow \infty\), \(R_X(\tau)\) converges to a delta function. This happens when \(X(t)\) is i.i.d. Gaussian. Therefore, the pure Gaussian noise random process is also known as the white noise process. Reshaping the i.i.d. Gaussian noise to an arbitrary power spectral density can be done by passing it through a linear filter, as we will explain later.


Finding \(S_X(\omega)\) from \(R_X(\tau)\) is straightforward, at least in principle. The more interesting questions to ask are: (1) Why do we need to learn about power spectral density? (2) Why do we need WSS to define power spectral density?
- sep0ex
- Power spectral densities are useful when we pass a random process through some linear operations, e.g., convolution, running average, or running difference.
- Power spectral densities are the Fourier transforms of the autocorrelation functions. Fourier transforms are useful for speeding up computation and drawing random samples from a given power spectral density.
A random process itself is not interesting until we process it; there are many ways to do this. The most basic operation is to send the random process through a linear time-invariant system, e.g., a convolution. Convolution is equivalent to filtering the random process. For example, if the input process contains noise, we can design a linear time-invariant filter to denoise the random process. The power spectral density, which is the Fourier transform of the autocorrelation function, makes the filtering easier because everything can be done in the spectral (Fourier) domain. Moreover, we can analyze the performance and quantify the limit using standard results in Fourier analysis. For some specialized problems such as imaging through atmospheric turbulence, the distortions happen in the phase domain. This can be simulated by drawing samples from the power spectral density, e.g., the Kolmogorov spectrum or the von K\'arm\'an spectrum. Power spectral densities have many important engineering applications.
- sep0ex
- If a process is WSS, then \(R_X\) will have a Toeplitz structure.
- A Toeplitz matrix is important. If you do eigendecomposition of a Toeplitz matrix, the eigenvectors are the Fourier bases.
- So if \(R_X\) is Toeplitz, then you can diagonalize it using the Fourier transform.
- Therefore, the power spectral density can be defined.
Why does power spectral density require WSS? This has to do with the Toeplitz structure of the autocorrelation function. To make our discussion easier let us discretize the autocorrelation function \(R_X(t_1,t_2)\) by considering \(R_X[m,n]\). (You can do a mental calculation by converting \(t_1\) to integer indices \(m\), and \(t_2\) to \(n\). See any textbook on signals and systems if you need help. This is called the “discrete-time signal”.) Following the range of \(t_1\) and \(t_2\), \(R_X[m,n]\) can be expressed as:
where we used the fact that \(R_X[m,n] = R_X[m-n]\) for WSS processes and \(R_X[k] = R_X[-k]\). We call the resulting matrix \(\mR\) the autocorrelation matrix, which is a discretized version of the autocorrelation function \(R_X(t_1,t_2)\). Looking at \(\mR\), we again observe the Toeplitz structure. For example, Figure 10.20 shows one Toeplitz structure and one non-Toeplitz structure.

Any Toeplitz matrix \(\mR\) can be diagonalized using the Fourier transforms. That is, we can write \(\mR\) as
where \(\mF\) is the (discrete) Fourier transform matrix and \(\mLambda\) is a diagonal matrix. This can be understood as the eigendecomposition of \(\mR\). The important point here is that only Toeplitz matrices can be eigendecomposed using the Fourier transform; an arbitrary symmetric matrix cannot. Figure 10.20 illustrates this point. If your matrix is Toeplitz, you can diagonalize it, and hence you can define the power spectral density, just as in the first example. If your matrix is not Toeplitz, then the power spectral density is undefined. To get the Toeplitz matrix, you must start with a WSS process.
Before moving on, we define cross power spectral densities, which will be useful in some applications.
The cross power spectral density between two random processes \(X(t)\) and \(Y(t)\) is
Remark. In general, \(S_{X,Y}(\omega) \neq S_{Y,X}(\omega)\). Rather, since \(R_{X,Y}(\tau)=R_{Y,X}(-\tau)\), we have \(S_{X,Y}(\omega) = \overline{S_{Y,X}(\omega)}\).
10.4.2Origin of the power spectral density
To understand the power spectral density, it is crucial to understand where it comes from and why it is the Fourier transform of the autocorrelation function.
We begin by assuming that \(X(t)\) is a WSS random process with mean \(\mu_X\) and autocorrelation \(R_X(\tau)\). We now consider the notion of power. Consider a random process \(X(t)\). The power within a period \([-T,T]\) is
\(\widehat{P}_X\) defines the power because the integration alone is the energy, and the normalization by \(1/2T\) gives us the power. However, there are two problems. First, since \(X(t)\) is random, the power \(\widehat{P}_X\) is also random. Is there a way we can eliminate the randomness? Second, \(T\) is a finite period of time. It does not capture the entire process, and so we do not know the power of the entire process.
A natural solution to these two problems is to consider
Here, we take the limit of \(T\) to infinity so that we can compute the power of the entire process. We also take the expectation to eliminate the randomness. Therefore, \(P_X\) can be regarded as the average power of the complete random process \(X(t)\).
Next, we need one definition and one lemma. The definition defines \(S_X(\omega)\), and the lemma will link \(S_X(\omega)\) with the power \(P_X\).
The power spectral density (PSD) of a WSS process is defined as
where
is the Fourier transform of \(X(t)\) limited to \([-T,T]\).
This definition is abstract, but in a nutshell, it simply considers everything in the Fourier domain. The ratio \(|\widetilde{X}_T(\omega)|^2/2T\) is the power, but in the frequency domain. The reason is that if \(X(t)\) is Fourier transformable, then Parseval's theorem will hold. Parseval's theorem states that energy in the original space is conserved in the Fourier space. Since the ratio \(|\widetilde{X}_T(\omega)|^2/2T\) is the energy divided by time, it is the power. However, this is still not enough to help us understand power spectral density: We need a lemma.
Define
Then
The lemma has to be read together with the previous definition. If we can prove the lemma, we know that by integrating \(S_X(\omega)\) we will obtain the power. Therefore, \(S_X(\omega)\) can be viewed as a density function, specifically the density function of the power. \(S_X(\omega)\) is called the power spectral density because everything is defined in the Fourier domain. Putting this all together gives us “power spectral density”.
Proof. First, we recall that \(P_X\) is the expectation of the average power of \(X(t)\). Let
It follows that integrating over \(-\infty\) to \(\infty\) is equivalent to
By Parseval's theorem, energy is conserved in both the time and the frequency domain:
Therefore, \(P_X\) satisfies
The power spectral densities are functions whose integrations give us the power. If we want to determine the power of a random process, the Einstein-Wiener-Khinchin theorem (Theorem thm: ch10 Wie) says that \(S_X(\omega)\) is just the Fourier transform of \(R_X(\tau)\):
The proof of the Einstein-Wiener-Khinchin theorem is quite intricate, so we defer the proof to the Appendix. The significance of the theorem is that it turns an abstract quantity, the power spectral density, into a very easily computable quantity, namely the Fourier transform of the autocorrelation function. For now, we will happily use this theorem because it saves us a great deal of trouble when we want to determine the power spectral density from first principles.