Table of Contents
1.1 Infinite Series
1.1.1 Geometric series
1.1.2 Binomial series
1.2 Approximation
1.3 Integration
1.4 Linear Algebra
1.4.1 Why do we need linear algebra in data science?
1.4.2 Everything you need to know about linear algebra
1.4.3 Inner products and norms
1.4.4 Matrix calculus
1.5 Basic Combinatorics
1.5.1 Birthday paradox
1.5.2 Permutation
1.5.3 Combination
3.1 Random Variables
3.1.1 A motivating example
3.1.2 Definition of a random variable
3.1.3 Probability measure on random variables
3.2 Probability Mass Function
3.2.1 Definition
3.2.2 PMF and probability measure
3.2.3 Normalization property
3.2.4 PMF vs histogram
3.2.5 Estimating histograms from real data
3.3 Cumulative Distribution Function
3.4 Expectation
3.4.1 Definition
3.4.2 Existence of expectation
3.4.3 Properties of expectation
3.4.4 Moments and variance
3.5 Common Discrete Random Variables
3.5.1 Bernoulli random variable
3.5.2 Binomial random variable
3.5.3 Geometric random variable
3.5.4 Poisson random variable
4.1 Probability Density Function
4.1.1 Some intuition about probability density functions
4.1.2 More in-depth discussion about PDFs
4.1.3 Connecting with PMF
4.2 Expectation, Moment, and Variance
4.2.1 Definition and properties
4.2.2 Existence of expectation
4.2.3 Moment and variance
4.3 Cumulative Distribution Function
4.3.1 CDF for continuous random variables
4.3.2 Properties of CDF
4.3.3 Retrieving PDF from CDF
4.3.4 CDF: Unifying discrete and continuous random variables
4.4 Median, Mode, and Mean
4.4.1 Median
4.4.2 Mode
4.4.3 Mean
4.5 Uniform and Exponential Random Variables
4.5.1 Uniform random variable
4.5.2 Exponential random variable
4.5.3 Origin of exponential random variable
4.5.4 Applications of exponential random variables
4.6 Gaussian Random Variables
4.6.1 Definition of a Gaussian random variable
4.6.2 Standard Gaussian
4.6.3 Skewness and kurtosis
4.6.4 Origin of Gaussian random variables
4.7 Functions of Random Variables
4.7.1 General principle
4.7.2 Worked examples
4.8 Generating Random Numbers
4.8.1 Principle
4.8.2 Examples
5.1 Joint PMF and Joint PDF
5.1.1 Probability measure in 2D
5.1.2 Discrete random variables
5.1.3 Continuous random variables
5.1.4 Normalization
5.1.5 Marginal PMF and marginal PDF
5.1.6 Independent random variables
5.1.7 Joint CDF
5.2 Joint Expectation
5.2.1 Definition and interpretation
5.2.2 Covariance and correlation coefficient
5.2.3 Independence and correlation
5.2.4 Computing correlation from data
5.3 Conditional PMF and PDF
5.3.1 Conditional PMF
5.3.2 Conditional PDF
5.4 Conditional Expectation
5.5 Sum of Two Random Variables
5.6 Random Vector and Covariance Matrices
5.6.1 PDF of random vectors
5.6.2 Expectation of random vectors
5.6.3 Covariance matrix
5.6.4 Multi-dimensional Gaussian
5.7 Transformation of Multi-dimensional Gaussian
5.7.1 Linear transformation of mean and covariance
5.7.2 Eigenvalues and eigenvectors
5.7.3 Covariance matrices are always positive semi-definite
5.7.4 Gaussian whitening
5.8 Principal Component Analysis
5.8.1 The main idea: Eigen-decomposition
5.8.2 The Eigenface problem
5.8.3 What cannot be analyzed by PCA?
6.1 Moment Generating and Characteristic Functions
6.1.1 Moment generating function
6.1.2 Sum of independent variables via MGF
6.1.3 Characteristic functions
6.2 Probability Inequalities
6.2.1 Union bound
6.2.2 Cauchy-Schwarz inequality
6.2.3 Jensen's inequality
6.2.4 Markov's inequality
6.2.5 Chebyshev's inequality
6.2.6 Chernoff's bound
6.2.7 Comparing Chernoff and Chebyshev
6.2.8 Hoeffding's inequality
6.3 Law of Large Numbers
6.3.1 Sample average
6.3.2 Weak law of large numbers (WLLN)
6.3.3 Convergence in probability
6.3.4 Can we prove WLLN using Chernoff's bound?
6.3.5 Does the weak law of large numbers always hold?
6.3.6 Strong law of large numbers
6.3.7 Almost sure convergence
6.3.8 Proof of strong law of large numbers
6.4 Central Limit Theorem
6.4.1 Convergence in distribution
6.4.2 Central Limit Theorem
6.4.3 Examples
6.4.4 Limitation of the Central Limit Theorem
8.1 Maximum-Likelihood Estimation
8.1.1 Likelihood function
8.1.2 Maximum-likelihood estimate
8.1.3 Application 1: Social network analysis
8.1.4 Application 2: Reconstructing images
8.1.5 More examples on ML estimation
8.1.6 Regression vs ML estimation
8.2 Properties of ML Estimates
8.2.1 Estimators
8.2.2 Unbiased estimators
8.2.3 Consistent estimators
8.2.4 Invariance principle
8.3 Maximum-A-Posteriori Estimation
8.3.1 The trio of likelihood, prior, and posterior
8.3.2 Understanding the priors
8.3.3 MAP formulation and solution
8.3.4 Analyzing the MAP solution
8.3.5 Analysis of the posterior distribution
8.3.6 Conjugate prior
8.3.7 Linking MAP with regression
8.4 Mean-Square Error Estimation
8.4.1 Positioning the mean-square error estimation
8.4.2 Mean-square error
8.4.3 MMSE solution = conditional expectation
8.4.4 MMSE estimator for multi-dimensional Gaussian
8.4.5 Linking MMSE and neural networks
10.1 Basic Concepts
10.2 Mean and Correlation Functions
10.3 Wide-Sense Stationary Processes
10.4 Power Spectral Density
10.5 WSS Processes through LTI Systems
10.5.1 Review of a linear time-invariant (LTI) system
10.5.2 Mean and autocorrelation through LTI systems
10.5.3 Power spectral density through LTI systems
10.5.4 Cross-correlation through LTI systems
10.6 Optimal Linear Filter
10.6.1 Discrete-time random processes
10.6.2 Problem formulation
10.6.3 Yule-Walker equation
10.6.4 Linear prediction
10.6.5 Wiener filter