Preface
1 Introduction
1.1 What Is Machine Learning
1.2 Types of Learning
1.2.1 Supervised Learning
1.2.2 Unsupervised Learning
1.2.3 Semi-Supervised Learning
1.2.4 Reinforcement Learning
1.3 How Supervised Learning Works
1.4 Why the Model Works on New Data
2 Notation and Definitions
2.1 Notation
2.1.1 Data Structures
2.1.2 Capital Sigma Notation
2.1.3 Capital Pi Notation
2.1.4 Operations on Sets
2.1.5 Operations on Vectors
2.1.6 Functions
2.1.7 Max and Arg Max
2.1.8 Assignment Operator
2.1.9 Derivative and Gradient
2.2 Random Variable
2.3 Unbiased Estimators
2.4 Bayes’ Rule
2.5 Parameter Estimation
2.6 Parameters vs. Hyperparameters
2.7 Classification vs. Regression
2.8 Model-Based vs. Instance-Based Learning
2.9 Shallow vs. Deep Learning
3 Fundamental Algorithms
3.1 Linear Regression
3.1.1 Problem Statement
3.1.2 Solution
3.2 Logistic Regression
3.2.1 Problem Statement
3.2.2 Solution
3.3 Decision Tree Learning
3.3.1 Problem Statement
3.3.2 Solution
3.4 Support Vector Machine
3.4.1 Dealing with Noise
3.4.2 Dealing with Inherent Non-Linearity
3.5 k-Nearest Neighbors
4 Anatomy of a Learning Algorithm
4.1 Building Blocks of a Learning Algorithm
4.2 Gradient Descent
4.3 How Machine Learning Engineers Work
4.4 Learning Algorithms’ Particularities
5 Basic Practice
5.1 Feature Engineering
5.1.1 One-Hot Encoding
5.1.2 Binning
5.1.3 Normalization
5.1.4 Standardization
5.1.5 Dealing with Missing Features
5.1.6 Data Imputation Techniques
5.2 Learning Algorithm Selection
5.3 Three Sets
5.4 Underfitting and Overfitting
5.5 Regularization
5.6 Model Performance Assessment
5.6.1 Confusion Matrix
5.6.2 Precision/Recall
5.6.3 Accuracy
5.6.4 Cost-Sensitive Accuracy
5.6.5 Area under the ROC Curve (AUC)
5.7 Hyperparameter Tuning
5.7.1 Cross-Validation
6 Neural Networks and Deep Learning
6.1 Neural Networks
6.1.1 Multilayer Perceptron Example
6.1.2 Feed-Forward Neural Network Architecture
6.2 Deep Learning
6.2.1 Convolutional Neural Network
6.2.2 Recurrent Neural Network
7 Problems and Solutions
7.1 Kernel Regression
7.2 Multiclass Classification
7.3 One-Class Classification
7.4 Multi-Label Classification
7.5 Ensemble Learning
7.5.1 Boosting and Bagging
7.5.2 Random Forest
7.5.3 Gradient Boosting
7.6 Learning to Label Sequences
7.7 Sequence-to-Sequence Learning
7.8 Active Learning
7.9 Semi-Supervised Learning
7.10 One-Shot Learning
7.11 Zero-Shot Learning
8 Advanced Practice
8.1 Handling Imbalanced Datasets
8.2 Combining Models
8.3 Training Neural Networks
8.4 Advanced Regularization
8.5 Handling Multiple Inputs
8.6 Handling Multiple Outputs
8.7 Transfer Learning
8.8 Algorithmic Efficiency
9 Unsupervised Learning
9.1 Density Estimation
9.2 Clustering
9.2.1 K-Means
9.2.2 DBSCAN and HDBSCAN
9.2.3 Determining the Number of Clusters
9.2.4 Other Clustering Algorithms
9.3 Dimensionality Reduction
9.3.1 Principal Component Analysis
9.3.2 UMAP
9.4 Outlier Detection
10 Other Forms of Learning
10.1 Metric Learning
10.2 Learning to Rank
10.3 Learning to Recommend
10.3.1 Factorization Machines
10.3.2 Denoising Autoencoders
10.4 Self-Supervised Learning: Word Embeddings
11 Conclusion
11.1 Topic Modeling
11.2 Gaussian Processes
11.3 Generalized Linear Models
11.4 Probabilistic Graphical Models
11.5 Markov Chain Monte Carlo
11.6 Genetic Algorithms
11.7 Reinforcement Learning
Index
