Internal function to identify the type of a response variable. Supported types are:

  • "continuous-binary": decimal numbers and two unique values; results in a warning, as this type is difficult to model.

  • "continuous-low": decimal numbers and 3 to 5 unique values; results in a message, as this type is difficult to model.

  • "continuous-high": decimal numbers and more than 5 unique values.

  • "integer-binomial": integer with only 0s and 1s, suitable for binomial models.

  • "integer-binary": integer with 2 unique values other than 0 and 1; results in a warning, as this type is difficult to model.

  • "integer-low": integer with 3 to 5 unique values, or meeting specified thresholds.

  • "integer-high": integer with more than 5 unique values; suitable for count modelling.

  • "categorical": character or factor with 2 or more levels.

  • "unknown": when the response type cannot be determined.
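
The rules listed above can be sketched as a small decision function. This is a hypothetical illustration written from the list alone, not the package's implementation: the name classify_response_sketch is invented, it takes a vector rather than df and response, and it emits no warnings or messages. It also makes three assumptions the list does not state: whole-valued doubles count as integers, a variable with fewer than 2 unique values is "unknown", and the unspecified "thresholds" for "integer-low" are ignored.

```r
# Hypothetical sketch of the classification rules; NOT the package's code.
classify_response_sketch <- function(x) {
  # Drop missing values before counting unique values
  x <- x[!is.na(x)]
  if (length(x) == 0) return("unknown")

  # Character or factor with 2 or more levels
  if (is.character(x) || is.factor(x)) {
    if (length(unique(x)) >= 2) return("categorical")
    return("unknown")
  }

  # Anything that is not numeric (e.g. logical) is left undetermined
  if (!is.numeric(x)) return("unknown")

  values <- unique(x)
  n <- length(values)

  # Assumption: a constant variable cannot be typed
  if (n < 2) return("unknown")

  # Assumption: whole-valued doubles are treated as integers
  is_whole <- all(values == round(values))

  if (is_whole) {
    if (n == 2 && all(sort(values) == c(0, 1))) return("integer-binomial")
    if (n == 2) return("integer-binary")
    if (n <= 5) return("integer-low")
    return("integer-high")
  }

  if (n == 2) return("continuous-binary")
  if (n <= 5) return("continuous-low")
  "continuous-high"
}

classify_response_sketch(c(0, 1, 1, 0))
classify_response_sketch(c(0.5, 1.5, 0.5))
classify_response_sketch(factor(c("a", "b", "a")))
```

The checks run from the most specific case (exactly 0 and 1) to the most general, so each value pattern matches one label only.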

Usage

identify_response_type(df = NULL, response = NULL, quiet = FALSE)

Arguments

df

(required; data frame, tibble, or sf) A data frame with responses and predictors. Default: NULL.

response

(optional; character string or vector) Name/s of response variable/s in df. Used in target encoding when it names a numeric variable and there are categorical predictors, and to compute preference order. Default: NULL.

quiet

(optional; logical) If FALSE, messages generated during the execution of the function are printed to the console. Default: FALSE

Value

Character string: the response type, one of the supported types listed in the description.
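
Because the return value is a single string, calling code can branch on it. The sketch below is a hypothetical helper, not part of the package: the name suggest_family_sketch and the mapping to family names are assumptions for illustration. It covers only the types the description calls suitable for a model class, plus a Gaussian default for "continuous-high", which is itself an assumption.

```r
# Hypothetical helper; the mapping is an illustrative assumption,
# not something this function's documentation prescribes.
suggest_family_sketch <- function(response_type) {
  switch(
    response_type,
    "integer-binomial" = "binomial",  # 0s and 1s: binomial models
    "integer-high"     = "poisson",   # many integer values: count models
    "continuous-high"  = "gaussian",  # assumption: default for continuous data
    NA_character_                     # other types: no suggestion
  )
}

suggest_family_sketch("integer-binomial")
suggest_family_sketch("categorical")
```

Returning NA for the remaining types leaves the decision to the caller, in line with the description's note that the binary and low-cardinality types are difficult to model.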

Examples

identify_response_type(
  df = vi,
  response = "vi_numeric"
)
#> [1] "continuous-high"

identify_response_type(
  df = vi,
  response = "vi_counts"
)
#> [1] "integer-high"

identify_response_type(
  df = vi,
  response = "vi_binomial"
)
#> [1] "integer-binomial"

identify_response_type(
  df = vi,
  response = "vi_categorical"
)
#> [1] "categorical"

identify_response_type(
  df = vi,
  response = "vi_factor"
)
#> [1] "categorical"