Uses `rf_evaluate()` to compare the performance of several models on independent spatial folds via spatial cross-validation.

```
rf_compare(
  models = NULL,
  xy = NULL,
  repetitions = 30,
  training.fraction = 0.75,
  metrics = c("r.squared", "pseudo.r.squared", "rmse", "nrmse", "auc"),
  distance.step = NULL,
  distance.step.x = NULL,
  distance.step.y = NULL,
  fill.color = viridis::viridis(100, option = "F", direction = -1, alpha = 0.8),
  line.color = "gray30",
  seed = 1,
  verbose = TRUE,
  n.cores = parallel::detectCores() - 1,
  cluster = NULL
)
```

- models
Named list with models resulting from `rf()`, `rf_spatial()`, `rf_tuning()`, or `rf_evaluate()`. Example: `models = list(a = model.a, b = model.b)`. Default: `NULL`

- xy
Data frame or matrix with two columns named "x" and "y" containing the coordinates of the records. Default: `NULL`

- repetitions
Integer, number of spatial folds to use during cross-validation. Must be lower than the total number of rows available in the model's data. Default: `30`

- training.fraction
Number between 0.5 and 0.9 indicating the proportion of records to be used as training set during spatial cross-validation. Default: `0.75`

- metrics
Character vector, names of the performance metrics selected. The possible values are: "r.squared" (`cor(obs, pred) ^ 2`), "pseudo.r.squared" (`cor(obs, pred)`), "rmse" (`sqrt(sum((obs - pred)^2)/length(obs))`), "nrmse" (`rmse/(quantile(obs, 0.75) - quantile(obs, 0.25))`), and "auc". Default: `c("r.squared", "pseudo.r.squared", "rmse", "nrmse", "auc")`

- distance.step
Numeric, argument `distance.step` of `thinning_til_n()`. Distance step used during the selection of the centers of the training folds. These fold centers are selected by thinning the data until a number of folds equal to or lower than `repetitions` is reached. Its default value is 1/1000th of the maximum distance between records in `xy`. Reduce it if the number of training folds is lower than expected.

- distance.step.x
Numeric, argument `distance.step.x` of `make_spatial_folds()`. Distance step used during the growth in the x axis of the buffers defining the training folds. Default: `NULL` (1/1000th the range of the x coordinates).

- distance.step.y
Numeric, argument `distance.step.y` of `make_spatial_folds()`. Distance step used during the growth in the y axis of the buffers defining the training folds. Default: `NULL` (1/1000th the range of the y coordinates).

- fill.color
Character vector with hexadecimal codes (e.g. "#440154FF", "#21908CFF", "#FDE725FF"), or function generating a palette (e.g. `viridis::viridis(100)`). Default: `viridis::viridis(100, option = "F", direction = -1, alpha = 0.8)`

- line.color
Character string, color of the line produced by `ggplot2::geom_smooth()`. Default: `"gray30"`

- seed
Integer, random seed to facilitate reproducibility. If set to a given number, the results of the function are always the same. Default: `1`

- verbose
Logical. If `TRUE`, messages and plots generated during the execution of the function are displayed. Default: `TRUE`

- n.cores
Integer, number of cores to use for parallel execution. Creates a socket cluster with `parallel::makeCluster()`, runs operations in parallel with `foreach` and `%dopar%`, and stops the cluster with `parallel::stopCluster()` when the job is done. Default: `parallel::detectCores() - 1`

- cluster
A cluster definition generated with `parallel::makeCluster()`. If provided, overrides `n.cores`. When `cluster = NULL` (default value), and `model` is provided, the cluster in `model`, if any, is used instead. If this cluster is `NULL`, then the function uses `n.cores` instead. The function does not stop a provided cluster, so it should be stopped with `parallel::stopCluster()` afterwards. The cluster definition is stored in the output list under the name "cluster" so it can be passed to other functions via the `model` argument, or using the `%>%` pipe. Default: `NULL`
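The formulas listed under `metrics` can be reproduced in base R. The sketch below is only an illustration of those expressions on made-up numbers: the vectors `obs` and `pred` are hypothetical observed and predicted values for one testing fold, not output of `rf_compare()`, and "auc" is left out because no formula is given for it here.

```r
# Hypothetical observed and predicted values for a single testing fold
obs  <- c(10, 12, 15, 20, 22, 30)
pred <- c(11, 13, 14, 18, 25, 27)

# "pseudo.r.squared": correlation between observations and predictions
pseudo.r.squared <- cor(obs, pred)

# "r.squared": squared correlation between observations and predictions
r.squared <- cor(obs, pred)^2

# "rmse": root mean squared error
rmse <- sqrt(sum((obs - pred)^2) / length(obs))

# "nrmse": rmse divided by the interquartile range of the observations
# unname() drops the "75%" label that quantile() attaches to its result
nrmse <- unname(rmse / (quantile(obs, 0.75) - quantile(obs, 0.25)))

metrics.df <- data.frame(
  metric = c("r.squared", "pseudo.r.squared", "rmse", "nrmse"),
  value = c(r.squared, pseudo.r.squared, rmse, nrmse)
)
print(metrics.df)
```

Because "nrmse" is scaled by the interquartile range of the observations, it can be compared across responses measured in different units, while "rmse" stays in the units of the response.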

A list with three slots:

- `comparison.df`: Data frame with one performance value per spatial fold, metric, and model.
- `spatial.folds`: List with the indices of the training and testing records for each evaluation repetition.
- `plot`: Violin-plot of `comparison.df`.

```
if(interactive()){

  # loading example data
  data(distance_matrix)
  data(plant_richness_df)

  # fitting random forest model
  rf.model <- rf(
    data = plant_richness_df,
    dependent.variable.name = "richness_species_vascular",
    predictor.variable.names = colnames(plant_richness_df)[5:21],
    distance.matrix = distance_matrix,
    distance.thresholds = 0,
    n.cores = 1
  )

  # fitting a spatial model with Moran's Eigenvector Maps
  rf.spatial <- rf_spatial(
    model = rf.model,
    n.cores = 1
  )

  # comparing the spatial and non spatial models
  comparison <- rf_compare(
    models = list(
      `Non spatial` = rf.model,
      Spatial = rf.spatial
    ),
    xy = plant_richness_df[, c("x", "y")],
    metrics = c("r.squared", "rmse"),
    n.cores = 1
  )

}
```
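As described for the `cluster` argument, a cluster supplied by the user is not stopped by the function. The following sketch shows one possible workflow under that description: it assumes the spatialRF package is installed, uses an arbitrary two-worker cluster named `my.cluster` (a hypothetical name), and repeats the model fitting from the example above so that it can run on its own.

```r
# Cluster created and owned by the user; two workers is an arbitrary choice
my.cluster <- parallel::makeCluster(2)

if (interactive() && requireNamespace("spatialRF", quietly = TRUE)) {

  library(spatialRF)
  data(distance_matrix)
  data(plant_richness_df)

  # same models as in the example above
  rf.model <- rf(
    data = plant_richness_df,
    dependent.variable.name = "richness_species_vascular",
    predictor.variable.names = colnames(plant_richness_df)[5:21],
    distance.matrix = distance_matrix,
    distance.thresholds = 0,
    n.cores = 1
  )
  rf.spatial <- rf_spatial(model = rf.model, n.cores = 1)

  # the cluster is passed explicitly, so n.cores is not needed here
  comparison <- rf_compare(
    models = list(
      `Non spatial` = rf.model,
      Spatial = rf.spatial
    ),
    xy = plant_richness_df[, c("x", "y")],
    metrics = c("r.squared", "rmse"),
    cluster = my.cluster
  )

}

# rf_compare() leaves a provided cluster running, so it is stopped by hand
parallel::stopCluster(my.cluster)
```

Creating the cluster once and passing it along avoids starting and stopping workers on every call, at the cost of having to stop it explicitly at the end of the session.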