Machine Learning
Welcome to the introductory course on Classical Machine Learning. This course is designed for complete ML beginners who have prior experience with Python; no experience with any ML technique or library is required.

Most algorithms covered here will only require Python's built-in data types (like list, dict, etc.) and functions. All the techniques will be taught through hands-on exercises and challenges.
Tools
NumPy
Pandas
Matplotlib
scikit-learn
Evaluation Metrics
Accuracy
F1
AUC
Precision, Recall → true/false positives and negatives
Sensitivity VS Specificity
Confusion Matrix
…
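To make these metrics concrete, here is a minimal sketch deriving them from raw confusion-matrix counts; the counts themselves are invented for illustration.

```python
# True/false positives/negatives from a (made-up) confusion matrix.
tp, fp, fn, tn = 40, 10, 5, 45

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)    # of predicted positives, how many are real
recall = tp / (tp + fn)       # of real positives, how many were found (= sensitivity)
specificity = tn / (tn + fp)  # of real negatives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} specificity={specificity:.2f} f1={f1:.2f}")
```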
Concepts
Underfitting/overfitting ⇒ bias VS variance
Initialization techniques
Hyperparameter tuning
Optimizers (Gradient descent, GD with momentum, RMSProp, Adam)
Regularization (L1, L2 norms, dropout, early stopping)
Normalization
Cross-Validation
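As a preview of the optimizers topic, here is a minimal sketch of gradient descent with momentum on a toy one-parameter objective; the learning rate and momentum coefficient are arbitrary illustrative values, not tuned settings.

```python
# Gradient descent with momentum on the toy objective f(w) = (w - 3)^2.
def grad(w):
    return 2 * (w - 3.0)  # derivative of (w - 3)^2

w, velocity = 0.0, 0.0
lr, beta = 0.1, 0.9  # illustrative hyperparameters
for _ in range(200):
    # The velocity is an exponential moving average of past gradients.
    velocity = beta * velocity + (1 - beta) * grad(w)
    w -= lr * velocity

print(w)  # approaches the minimum at w = 3
```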
Chapters
Introduction to predictive algorithms
Majority class/frequency baseline for classification
Mean/median baseline for regression
Basic feature engineering
Numerical transformations
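A sketch of the two baselines above, using only built-in types; the toy labels and targets are invented.

```python
from collections import Counter

# Majority-class baseline for classification: always predict the most
# frequent training label.
train_labels = ["spam", "ham", "ham", "ham", "spam"]
majority_class = Counter(train_labels).most_common(1)[0][0]
print(majority_class)  # "ham"

# Mean baseline for regression: always predict the training mean.
train_targets = [3.0, 5.0, 4.0, 8.0]
print(sum(train_targets) / len(train_targets))  # 5.0
```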
KNN
KNN Classification
Distance/Similarity metrics (L1 vs L2 distances)
Multi-dimensional cases
Weighted KNN
KNN Regression
Data Handling
Different features have different magnitudes
Normalization
Feature Engineering
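A from-scratch sketch of this chapter's core ideas: L2 distance and majority voting among the k nearest neighbours; the data and the choice k = 3 are illustrative.

```python
import math
from collections import Counter

def euclidean(a, b):
    # L2 distance between two feature tuples.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_points, train_labels, query, k=3):
    # Take the k nearest training points and vote on the label.
    nearest = sorted(zip(train_points, train_labels),
                     key=lambda pair: euclidean(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy data: (age, years of experience). If the two features had very
# different magnitudes (say age vs income in dollars), the larger one
# would dominate the distance, hence the need for normalization.
points = [(25, 2), (30, 5), (60, 35), (65, 40)]
labels = ["junior", "junior", "senior", "senior"]
print(knn_predict(points, labels, (28, 3)))  # "junior"
```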
K-Means Clustering
Supervised VS unsupervised learning
Initialization: multiple restarts
Handling outliers
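A from-scratch k-means sketch with multiple random restarts, kept one-dimensional for brevity; k, the restart count, and the data are illustrative.

```python
import math
import random

def kmeans(points, k, n_restarts=10, n_iters=100):
    best_centers, best_cost = None, math.inf
    for _ in range(n_restarts):
        centers = random.sample(points, k)  # random initialization
        for _ in range(n_iters):
            # Assignment step: attach each point to its nearest center.
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda j: abs(p - centers[j]))
                clusters[nearest].append(p)
            # Update step: move each center to the mean of its cluster
            # (an empty cluster keeps its old center).
            centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        cost = sum(min((p - c) ** 2 for c in centers) for p in points)
        if cost < best_cost:  # keep the best restart
            best_centers, best_cost = centers, cost
    return best_centers

print(kmeans([1.0, 1.2, 0.8, 9.0, 9.5, 10.1], k=2))
```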
Evaluation
Train set VS Test set
Validation Accuracy
Cross-Validation
Underfitting/overfitting
Hyperparameter tuning: picking the best k
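One possible way to pick the best k with cross-validation, sketched here with scikit-learn on its bundled iris dataset; the candidate values of k and the 5-fold setup are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
scores = {}
for k in [1, 3, 5, 7, 9, 15]:
    model = KNeighborsClassifier(n_neighbors=k)
    # Mean validation accuracy over 5 folds; the test set stays untouched.
    scores[k] = cross_val_score(model, X, y, cv=5).mean()

best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```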
Linear Regression
Simple case
Multivariate regression
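A sketch of multivariate linear regression via least squares with NumPy; the synthetic data (y ≈ 2·x1 + 3·x2 + 1 plus noise) is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 2 * X[:, 0] + 3 * X[:, 1] + 1 + rng.normal(scale=0.1, size=100)

# Append a column of ones so the model also learns an intercept.
X1 = np.hstack([X, np.ones((100, 1))])
weights, *_ = np.linalg.lstsq(X1, y, rcond=None)
print(weights)  # roughly [2, 3, 1]
```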
Logistic Regression
Classification problems
Class imbalance → fix by adjusting the loss function (maybe also talk about under-/over-sampling)
Lasso, Ridge: L1, L2 regularization
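A sketch of fixing class imbalance by reweighting the loss, using scikit-learn's class_weight option on synthetic data with a 9:1 imbalance; all parameters are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

plain = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# class_weight="balanced" upweights minority-class errors in the loss.
weighted = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X_tr, y_tr)

# F1 on the minority class; reweighting typically trades precision for recall.
print(f1_score(y_te, plain.predict(X_te)))
print(f1_score(y_te, weighted.predict(X_te)))
```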
ML workflows and data leakage
Train/validation/test splits vs cross-validation
Stratified and group splits
Leakage pitfalls
Reproducibility
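A sketch of a leakage-safe workflow: the scaler is fit inside each fold via a scikit-learn Pipeline, so validation folds never influence preprocessing. The dataset choice is just for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
# Fitting StandardScaler on the full dataset before splitting would leak
# validation statistics into training; the Pipeline refits it per fold.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000))
print(cross_val_score(model, X, y, cv=5).mean())
```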
Decision tree classification
Visual intuition
Gini index
Entropy
Heuristics to prune the tree
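A built-ins-only sketch of the Gini index and of scoring a candidate split; the toy label lists are invented.

```python
from collections import Counter

def gini(labels):
    # 1 minus the sum of squared class proportions.
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_gini(left, right):
    # Weighted average of the children's impurities.
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

print(gini(["a", "a", "b", "b"]))          # 0.5: maximally impure for 2 classes
print(gini(["a", "a", "a"]))               # 0.0: pure node
print(split_gini(["a", "a"], ["b", "b"]))  # 0.0: a perfect split
```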
Naive Bayes
Bayes' theorem
Bag of words
Spam detection
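A sketch of bag-of-words Naive Bayes for spam detection with scikit-learn; the tiny corpus is invented for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["win a free prize now", "free money win big",
         "meeting at noon tomorrow", "lunch tomorrow?",
         "claim your free prize", "see you at the meeting"]
labels = ["spam", "spam", "ham", "ham", "spam", "ham"]

# CountVectorizer builds the bag-of-words counts; MultinomialNB applies
# Bayes' theorem with per-class word probabilities.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)
print(model.predict(["claim a free prize"]))  # likely "spam" on this toy data
```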
Polynomial Regression
Relationship to linear regression
Choosing the degree
Regularization for polynomial regression
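A sketch of polynomial regression as linear regression on expanded features, with L2 regularization; the degree, regularization strength, and synthetic data are illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 50).reshape(-1, 1)
y = 0.5 * X.ravel() ** 3 - X.ravel() + rng.normal(scale=1.0, size=50)

# Expand x into [1, x, x^2, x^3], then fit a regularized linear model.
model = make_pipeline(PolynomialFeatures(degree=3), Ridge(alpha=1.0))
model.fit(X, y)
print(model.predict([[2.0]]))  # roughly 0.5 * 8 - 2 = 2
```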
Random forest
Relationship to decision trees
Picking the number of estimators
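A sketch of how accuracy varies with the number of trees, using scikit-learn and an illustrative dataset; the candidate values are arbitrary.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
for n in [1, 10, 100]:
    model = RandomForestClassifier(n_estimators=n, random_state=0)
    print(n, cross_val_score(model, X, y, cv=5).mean())
# Accuracy typically climbs quickly and then plateaus as trees are added.
```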
Support Vector Machines (SVM)
Support vectors
Cross-Validation for SVM
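A sketch of cross-validated hyperparameter search for an SVM with scikit-learn; the C/gamma grid and the dataset are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
pipe = make_pipeline(StandardScaler(), SVC())  # SVMs need scaled features
grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01, 0.1]}
search = GridSearchCV(pipe, grid, cv=5).fit(X, y)
print(search.best_params_, search.best_score_)
```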
Ensemble of models
Boosting VS Bagging
Random Forests as an ensemble of models
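A sketch contrasting bagging and boosting over the same weak base learner (a depth-1 decision tree); all hyperparameters and the dataset are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
# Bagging trains trees independently on bootstrap samples;
# boosting (AdaBoost) trains them sequentially, reweighting mistakes.
bagged = BaggingClassifier(DecisionTreeClassifier(max_depth=1),
                           n_estimators=50, random_state=0)
boosted = AdaBoostClassifier(n_estimators=50, random_state=0)
print("bagging:", cross_val_score(bagged, X, y, cv=5).mean())
print("boosting:", cross_val_score(boosted, X, y, cv=5).mean())
```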
Dimensionality reduction
Curse of dimensionality
PCA
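A sketch of PCA projecting the 4-dimensional iris features down to 2 components; n_components and the dataset are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA(n_components=2)
X2 = pca.fit_transform(X)
print(X2.shape)                       # (150, 2)
print(pca.explained_variance_ratio_)  # variance kept by each component
```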
Wrap up with mini-projects
Custom datasets
Resources