Aggregate distantia() Data Frames Across Parameter Combinations
Source: R/distantia_aggregate.R
The function distantia() allows dissimilarity assessments based on several combinations of arguments at once. For example, when the argument distance is set to c("euclidean", "manhattan"), the output data frame will show two dissimilarity scores for each pair of compared time series, one based on Euclidean distances and another based on Manhattan distances.
This function summarizes psi dissimilarity scores across combinations of parameters.
If psi scores smaller than zero occur in the aggregated output, the column psi is shifted by the magnitude of its smallest value, so that dissimilarity scores start at zero.
If there are no different combinations of arguments in the input data frame, no aggregation happens, but all parameter columns are removed.
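The behaviour described above can be sketched in base R. This is a minimal illustration, not the package's internal code: it assumes that rows are grouped by the x and y columns, uses a toy data frame whose psi values are copied from the Examples section, and introduces a hypothetical helper, shift_to_zero(), for the rule on negative scores.

```r
# Toy data frame with the columns shown in the Examples section.
# Nothing here calls distantia(); the psi values are copied from its output.
df <- data.frame(
  x = c("Germany", "Germany", "Germany", "Germany", "Spain", "Spain"),
  y = c("Sweden", "Sweden", "Spain", "Spain", "Sweden", "Sweden"),
  distance = c(
    "euclidean", "manhattan", "manhattan",
    "euclidean", "euclidean", "manhattan"
  ),
  psi = c(0.8576700, 0.8591195, 1.2698922, 1.3061327, 1.4708497, 1.4890286)
)

# Hypothetical helper (not part of distantia): if any score is negative,
# shift the whole vector so that its smallest value becomes zero.
shift_to_zero <- function(psi) {
  if (min(psi) < 0) {
    psi <- psi - min(psi)
  }
  psi
}

# Summarize psi per pair of time series; the parameter column is dropped.
df_agg <- stats::aggregate(psi ~ x + y, data = df, FUN = mean)

# Apply the zero-shift rule (a no-op here, since all scores are positive).
df_agg$psi <- shift_to_zero(df_agg$psi)

# Order from the most similar to the most dissimilar pair.
df_agg <- df_agg[order(df_agg$psi), ]

# The shift on made-up scores that include a negative value.
shifted <- shift_to_zero(c(-0.05, 0.20, 0.75))
```

In this sketch, df_agg holds one row per pair of time series (three rows in total) and no longer contains the distance column.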
Arguments
- df
(required, data frame) Output of distantia(), distantia_ls(), distantia_dtw(), or distantia_time_delay(). Default: NULL
- f
(optional, function) Function to summarize psi scores (for example, mean) when there are several combinations of parameters in df. Ignored when there is a single combination of arguments in the input. Default: mean
- ...
(optional, arguments of f) Further arguments to pass to the function f.
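How f and ... work together can be illustrated with a small base R sketch. The helper summarize_psi() below is hypothetical and is not part of distantia; it only shows the general R pattern in which extra arguments are forwarded to the summary function.

```r
# Hypothetical helper (not part of distantia): applies `f` to a vector of
# psi scores and forwards any further arguments in `...` to `f`.
summarize_psi <- function(psi, f = mean, ...) {
  f(psi, ...)
}

# Two scores copied from the Examples section plus a made-up missing value.
psi <- c(0.8576700, 0.8591195, NA)

# Default summary: mean() propagates the missing value and returns NA.
psi_default <- summarize_psi(psi)

# `na.rm = TRUE` travels through `...` and reaches mean().
psi_mean <- summarize_psi(psi, na.rm = TRUE)

# A different summary function receives the same extra argument.
psi_max <- summarize_psi(psi, f = max, na.rm = TRUE)
```

Any function that takes a numeric vector and returns a single value fits this pattern, which matches the description of f as a function to summarize psi scores.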
See also
Other distantia_support:
distantia_boxplot(), distantia_cluster_hclust(), distantia_cluster_kmeans(), distantia_matrix(), distantia_model_frame(), distantia_spatial(), distantia_stats(), distantia_time_delay(), utils_block_size(), utils_cluster_hclust_optimizer(), utils_cluster_kmeans_optimizer(), utils_cluster_silhouette()
Examples
#three time series
#climate and ndvi in Fagus sylvatica stands in Spain, Germany, and Sweden
tsl <- tsl_initialize(
  x = fagus_dynamics,
  name_column = "name",
  time_column = "time"
) |>
  tsl_transform(
    f = f_scale_global
  )

if(interactive()){
  tsl_plot(
    tsl = tsl,
    guide_columns = 3
  )
}

#distantia with multiple parameter combinations
#-------------------------------------
df <- distantia(
  tsl = tsl,
  distance = c("euclidean", "manhattan"),
  lock_step = TRUE
)

df[, c(
  "x",
  "y",
  "distance",
  "psi"
)]
#>         x      y  distance       psi
#> 2 Germany Sweden euclidean 0.8576700
#> 5 Germany Sweden manhattan 0.8591195
#> 4 Germany  Spain manhattan 1.2698922
#> 1 Germany  Spain euclidean 1.3061327
#> 3   Spain Sweden euclidean 1.4708497
#> 6   Spain Sweden manhattan 1.4890286

#aggregation using means
df <- distantia_aggregate(
  df = df,
  f = mean
)

df
#>         x      y       psi
#> 2 Germany Sweden 0.8576700
#> 5 Germany Sweden 0.8591195
#> 4 Germany  Spain 1.2698922
#> 1 Germany  Spain 1.3061327
#> 3   Spain Sweden 1.4708497
#> 6   Spain Sweden 1.4890286