Executive Summary

Algorithms

General linear model

  • is a parametric model
  • parameters are estimated to minimize the sum of squared errors in training data
  • requires numeric features
  • does not have any hyperparameters to tune
  • is a natively more interpretable algorithm, especially when the number of features is low and the features are not highly correlated. The parameter estimates can be used to understand the relative importance of features and the direction of their relationship with the outcome (see the sketch after this list). Interpretation is often improved further by scaling the features.
  • does not natively accommodate interactions between features; interactions can be added through feature engineering
  • does not natively include regularization (but see LASSO, Ridge, and GLMNet)
  • variance is relatively low unless the number of features is high or the ratio of features to N is high. Correlations among features (multicollinearity) can also increase variance.
  • bias can be relatively high unless the true DGP is linear in the features. Feature engineering (typically power transformations) can be used to allow the linear model to accommodate simple monotonic non-linear relationships.
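
As a concrete illustration of the points above, here is a minimal sketch of fitting a general linear model by least squares and reading the coefficients of standardized features as rough indicators of relative importance and direction. The use of scikit-learn and the synthetic data are assumptions for illustration; the summary above does not prescribe a specific toolkit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Hypothetical synthetic data: y depends positively on x1, negatively on x2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(scale=0.5, size=200)

# Scale features so coefficient magnitudes are comparable across features.
X_scaled = StandardScaler().fit_transform(X)

# Ordinary least squares: parameters are chosen to minimize the sum of squared errors.
model = LinearRegression().fit(X_scaled, y)

# Sign gives the direction of the relationship; magnitude gives relative importance
# (a rough guide, and only trustworthy when features are not highly correlated).
for name, coef in zip(["x1", "x2", "x3"], model.coef_):
    print(f"{name}: {coef:+.2f}")
```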

Typical feature engineering steps include (see the pipeline sketch after this list):

  • imputing missing values
  • power transformations of features to allow for non-linear relationships between features and the outcome
  • collapsing infrequent levels of nominal predictors to reduce the number of parameters (if using dummy coding or similar approaches)
  • dummy coding or other methods to accommodate nominal predictors
  • creating selective product features to allow for interactions between features
  • scaling features to make the parameter estimates more interpretable
  • principal components analysis or similar dimensionality reduction techniques to reduce the number of features and/or multicollinearity among features. However, use of PCA can make the model less interpretable.
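
The sketch below ties several of these steps together in a single modeling pipeline: imputing missing values, dummy coding a nominal predictor while collapsing infrequent levels, adding power and product terms for the numeric features, scaling, and (optionally) reducing the numeric features with PCA. The column names and the choice of scikit-learn are assumptions for illustration only.

```python
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, PolynomialFeatures, StandardScaler

# Hypothetical column names; substitute the columns of your own data frame.
numeric_cols = ["age", "income"]
nominal_cols = ["region"]

numeric_pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),                 # impute missing values
    ("poly", PolynomialFeatures(degree=2, include_bias=False)),   # power and product (interaction) terms
    ("scale", StandardScaler()),                                  # scale for interpretability
    ("pca", PCA(n_components=0.95)),                              # optional: reduce dimensionality/multicollinearity
])

nominal_pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("dummy", OneHotEncoder(handle_unknown="infrequent_if_exist",
                            min_frequency=0.05)),                 # dummy code; collapse infrequent levels
])

preprocess = ColumnTransformer([
    ("numeric", numeric_pipeline, numeric_cols),
    ("nominal", nominal_pipeline, nominal_cols),
])

# Full pipeline: feature engineering followed by the general linear model.
model = Pipeline([("preprocess", preprocess), ("glm", LinearRegression())])
# model.fit(X_train, y_train)  # X_train: a data frame containing the columns above
```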

KNN

Logistic Regression

LDA and QDA

LASSO, Ridge, and GLMNet

Random Forest

Single hidden layer neural network

Cross Validation

Performance metrics

Feature Importance

L1 and L2 Norms