Variance Inflation Factor

Computes the Variance Inflation Factor of numeric variables in a data frame.

This function computes the VIF (see section Variance Inflation Factors below) in two steps:

Applies base::solve() to obtain the precision matrix, which is the inverse of the correlation matrix (the covariance matrix of the standardized variables) between all variables in predictors.
Uses base::diag() to extract the diagonal of the precision matrix. For each predictor, this diagonal value is the reciprocal of the fraction of its variance left unexplained by all other predictors, which is its VIF.
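
The two steps can be sketched in base R as follows. This is an illustration rather than the function's internal code: it assumes the matrix being inverted is the covariance matrix of the standardized predictors (that is, their correlation matrix), and it uses four columns of the built-in mtcars dataset as stand-in predictors.

```r
# Illustrative sketch only; not the internal code of vif_df().
# Assumption: four numeric columns of mtcars act as the predictors.
x <- mtcars[, c("disp", "hp", "wt", "drat")]

# Step 1: invert the covariance matrix of the standardized predictors
# (identical to their correlation matrix) to get the precision matrix.
precision <- solve(cov(scale(x)))

# Step 2: the diagonal of the precision matrix holds one VIF per predictor.
vif <- diag(precision)

# Arrange the result as a data frame of predictor names and VIFs.
# The column names here are hypothetical, chosen for illustration.
vif_table <- data.frame(
  predictor = colnames(x),
  vif = unname(vif)
)

print(vif_table)
```

Standardizing first matters: inverting the raw covariance matrix would return values that depend on the units of each predictor.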

Usage

vif_df(df = NULL, predictors = NULL, quiet = FALSE)

Arguments

df: (required; data frame, tibble, or sf) A data frame with responses and predictors. Default: NULL.
predictors: (optional; character vector) Names of the predictors to select from df. If omitted, all numeric columns in df are used instead. Non-numeric variables are ignored. Default: NULL.
quiet: (optional; logical) If FALSE, messages generated during the execution of the function are printed to the console. Default: FALSE.

Value

data frame with the predictor names and their VIFs.

Variance Inflation Factors

The Variance Inflation Factor for a given variable \(a\) is computed as \(1/(1-R^2)\), where \(R^2\) is the multiple R-squared of a multiple regression model fitted using \(a\) as response and all other predictors in the input data frame as predictors, as in \(a = b + c + ...\).
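
The regression-based definition can be checked by hand in base R. The sketch below assumes four columns of the built-in mtcars dataset as predictors, computes the VIF of wt from the R-squared of a linear model, and compares it with the diagonal of the inverted correlation matrix.

```r
# Illustrative sketch; the choice of mtcars columns is an assumption.
x <- mtcars[, c("disp", "hp", "wt", "drat")]

# Regress `wt` on all other predictors and take the multiple R-squared.
model <- lm(wt ~ disp + hp + drat, data = x)
r_squared <- summary(model)$r.squared

# VIF from the definition 1 / (1 - R^2).
vif_wt_regression <- 1 / (1 - r_squared)

# VIF from the diagonal of the inverted correlation matrix.
vif_wt_matrix <- diag(solve(cor(x)))[["wt"]]

# Both routes should agree up to numerical tolerance.
same_result <- isTRUE(all.equal(vif_wt_regression, vif_wt_matrix))
print(same_result)
```

The matrix route gives every VIF from a single inversion, whereas the regression route needs one model per predictor.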

The square root of the VIF of \(a\) is the factor by which the confidence interval of the estimate for \(a\) in the linear model \(y = a + b + c + ...\) is widened by multicollinearity in the model predictors.
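
The widening effect can be illustrated with a small base R sketch. It assumes mpg from the built-in mtcars dataset as the response and four other columns as predictors, and compares the standard error of the wt coefficient with the standard error it would have if wt were uncorrelated with the other predictors.

```r
# Illustrative sketch; response and predictors are assumptions.
x <- mtcars[, c("disp", "hp", "wt", "drat")]
model <- lm(mpg ~ disp + hp + wt + drat, data = mtcars)

# Observed standard error of the coefficient of `wt`.
se_observed <- summary(model)$coefficients["wt", "Std. Error"]

# Standard error `wt` would have with the same residual standard error
# but no correlation with the other predictors.
sigma_hat <- summary(model)$sigma
n <- nrow(mtcars)
se_no_collinearity <- sigma_hat / (sd(mtcars$wt) * sqrt(n - 1))

# The ratio between both standard errors is the square root of the VIF.
vif_wt <- diag(solve(cor(x)))[["wt"]]
inflation_matches <- isTRUE(
  all.equal(se_observed, se_no_collinearity * sqrt(vif_wt))
)
print(inflation_matches)
```

Because the width of a confidence interval is proportional to the standard error, the same factor applies to the interval.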

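The range of VIF values is [1, Inf): a VIF of 1 indicates a predictor that is uncorrelated with all others, and the value grows without bound as a predictor approaches a perfect linear combination of the others. Recommended thresholds for the maximum VIF vary depending on the source consulted, the most common values being 2.5, 5, and 10.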

References

Belsley, D.A., Kuh, E., Welsch, R.E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. John Wiley & Sons. DOI: 10.1002/0471725153.

Examples