
Find valid numeric variables in a dataframe
Source: R/identify_numeric_variables.R
Identifies valid numeric variables and ignores those with constant values.
Usage
identify_numeric_variables(
  df = NULL,
  responses = NULL,
  predictors = NULL,
  decimals = 4,
  quiet = FALSE,
  ...
)
Arguments
- df
(required; dataframe, tibble, or sf) A dataframe with responses (optional) and predictors. Must have at least 10 rows for pairwise correlation analysis, and 10 * (length(predictors) - 1) for VIF. Default: NULL.
- responses
(optional; character, character vector, or NULL) Name of one or several response variables in df. Default: NULL.
- predictors
(required; character vector) Names of the predictors to identify. Default: NULL.
- decimals
(required; integer) Number of decimal places for the zero variance test. Smaller numbers will increase the number of variables detected as near-zero variance. Recommended values depend on the range of the numeric variables in df. Default: 4.
- quiet
(optional; logical) If FALSE, messages are printed. Default: FALSE.
- ...
(optional) Internal args (e.g. function_name for validate_arg_function_name, a precomputed correlation matrix m, or cross-validation args for preference_order).
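The effect of decimals can be illustrated with a minimal base R sketch. It assumes the test rounds each column's variance to decimals places and flags columns whose rounded variance is zero; the package's actual implementation may differ. The helper near_zero_variance() and the toy data are hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): returns the names of numeric
# columns whose variance rounds to zero at the given number of decimals.
near_zero_variance <- function(df, decimals = 4) {
  # Consider numeric columns only
  numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  flagged <- vapply(
    df[numeric_cols],
    function(x) {
      v <- stats::var(x, na.rm = TRUE)
      # A variance of NA (e.g. fewer than two values) is treated as invalid
      is.na(v) || round(v, decimals) == 0
    },
    logical(1)
  )
  numeric_cols[flagged]
}

toy <- data.frame(
  a = c(1.00001, 1.00002, 1.00003), # variance of about 1e-10
  b = c(1, 2, 3),                   # variance of 1
  c = c("x", "y", "z")              # non-numeric, ignored
)

# Few decimals: the tiny variance of "a" rounds to zero, so "a" is flagged
flagged_4 <- near_zero_variance(toy, decimals = 4)

# Many decimals: the variance of "a" survives rounding, so nothing is flagged
flagged_12 <- near_zero_variance(toy, decimals = 12)
```

Under this assumption, a variable measured on a very small scale may be flagged at the default decimals = 4 even though it varies, which is why the suitable value depends on the range of the variables.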
Value
list:
- valid: character vector with valid numeric predictor names.
- invalid: character vector with invalid numeric predictor names due to near-zero variance.
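One way to consume a result with this structure is sketched below. The list result is a hand-built stand-in for the function's output, and toy is a hypothetical data frame; neither comes from the package.

```r
# Hand-built stand-in for the returned list (hypothetical values)
result <- list(valid = c("a", "b"), invalid = "k")

toy <- data.frame(
  a = c(0.1, 0.5, 0.9),
  b = c(3, 1, 2),
  k = c(7, 7, 7) # constant column
)

# Keep only the valid numeric columns; drop = FALSE preserves the data frame
# structure even when a single column is selected
toy_valid <- toy[, result$valid, drop = FALSE]

# length(NULL) is 0, so this check also works when nothing was flagged
if (length(result$invalid) > 0) {
  message("Excluded: ", paste(result$invalid, collapse = ", "))
}
```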
Examples
data(vi_smol, vi_predictors)
x <- identify_numeric_variables(
  df = vi_smol,
  responses = "vi_numeric",
  predictors = vi_predictors
)
#valid numeric predictors
x$valid
#> [1] "vi_numeric" "topo_slope"
#> [3] "topo_diversity" "topo_elevation"
#> [5] "swi_mean" "swi_max"
#> [7] "swi_min" "swi_range"
#> [9] "soil_temperature_mean" "soil_temperature_max"
#> [11] "soil_temperature_min" "soil_temperature_range"
#> [13] "soil_sand" "soil_clay"
#> [15] "soil_silt" "soil_ph"
#> [17] "soil_soc" "soil_nitrogen"
#> [19] "solar_rad_mean" "solar_rad_max"
#> [21] "solar_rad_min" "solar_rad_range"
#> [23] "growing_season_length" "growing_season_temperature"
#> [25] "growing_season_rainfall" "growing_degree_days"
#> [27] "temperature_mean" "temperature_max"
#> [29] "temperature_min" "temperature_range"
#> [31] "temperature_seasonality" "rainfall_mean"
#> [33] "rainfall_min" "rainfall_max"
#> [35] "rainfall_range" "evapotranspiration_mean"
#> [37] "evapotranspiration_max" "evapotranspiration_min"
#> [39] "evapotranspiration_range" "cloud_cover_mean"
#> [41] "cloud_cover_max" "cloud_cover_min"
#> [43] "cloud_cover_range" "aridity_index"
#> [45] "humidity_mean" "humidity_max"
#> [47] "humidity_min" "humidity_range"
#invalid due to zero variance (none here)
x$invalid
#> NULL