(C++) Contribution of Individual Variables to the Dissimilarity Between Two Time Series (Robust Version)
Source: R/RcppExports.R
importance_dtw_cpp.Rd
Computes the contribution of individual variables to the similarity/dissimilarity between two irregular multivariate time series. Unlike the legacy version, importance is computed using the least-cost path of the whole sequence as reference, which makes the importance scores of individual variables fully comparable. This function generates a data frame with the following columns:
- variable: name of the individual variable for which the importance is being computed, from the column names of the arguments x and y.
- psi: global dissimilarity score psi of the two time series.
- psi_only_with: dissimilarity between x and y computed from the given variable alone.
- psi_without: dissimilarity between x and y computed from all other variables.
- psi_difference: difference between psi_only_with and psi_without (psi_only_with - psi_without).
- importance: contribution of the variable to the similarity/dissimilarity between x and y, computed as (psi_difference * 100) / psi. Positive scores represent contribution to dissimilarity, while negative scores represent contribution to similarity.
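As a quick check of the importance formula, the sketch below recomputes the score for variable b by hand. The three input values are copied from the example output further down this page, and the denominator is taken to be the psi column, which that output bears out. The variable names are only for illustration.

```r
# Values copied from row "b" of the example output on this page
psi <- 6.90895
psi_only_with <- 7.283144
psi_without <- 6.930810

# psi_difference: the variable alone versus all other variables
psi_difference <- psi_only_with - psi_without

# importance: difference expressed as a percentage of the global psi
importance <- (psi_difference * 100) / psi

psi_difference
importance
```

The result is positive and close to 5.1, so b contributes to the dissimilarity between x and y. Small deviations from the printed table are expected because the inputs here are rounded.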
Usage
importance_dtw_cpp(
  x,
  y,
  distance = "euclidean",
  diagonal = TRUE,
  weighted = TRUE,
  ignore_blocks = FALSE,
  bandwidth = 1
)
Arguments
- x
(required, numeric matrix) multivariate time series.
- y
(required, numeric matrix) multivariate time series with the same number of columns as 'x'.
- distance
(optional, character string) distance name from the "name" column of the dataset distances (see distances$name). Default: "euclidean".
- diagonal
(optional, logical). If TRUE, diagonals are included in the computation of the cost matrix. Default: TRUE.
- weighted
(optional, logical). If TRUE, diagonal is set to TRUE, and diagonal cost is weighted by a factor of 1.414214 (square root of 2). Default: TRUE.
- ignore_blocks
(optional, logical). If TRUE, blocks of consecutive path coordinates are trimmed to avoid inflating the psi distance. Default: FALSE.
- bandwidth
(optional, numeric) Size of the Sakoe-Chiba band on each side of the diagonal, used to constrain the least-cost path. Expressed as a fraction of the number of matrix rows and columns. Default: 1 (unrestricted).
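To give a feel for what a fractional bandwidth means, here is a minimal sketch of one plausible reading: a cell of the cost matrix is usable when its relative row position and relative column position differ by no more than bandwidth. The helper band_mask() is hypothetical and is not part of the package; the actual C++ implementation may define the band differently.

```r
# Hypothetical helper, not part of the package: marks the cells of an
# n-by-m cost matrix that fall inside a band around the diagonal.
band_mask <- function(n, m, bandwidth = 1) {
  # relative position of every row and column, in (0, 1]
  row_pos <- seq_len(n) / n
  col_pos <- seq_len(m) / m
  # TRUE where the cell lies within `bandwidth` of the diagonal
  abs(outer(row_pos, col_pos, "-")) <= bandwidth
}

# bandwidth = 1 leaves every cell available (unrestricted path)
all(band_mask(100, 150, bandwidth = 1))

# a narrower band keeps only a fraction of the cells
mean(band_mask(100, 150, bandwidth = 0.1))
```

Under this reading, the default of 1 imposes no constraint, while smaller values force the least-cost path to stay closer to the diagonal.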
See also
Other Rcpp_importance: importance_dtw_legacy_cpp(), importance_ls_cpp()
Examples
#simulate two regular time series
x <- zoo_simulate(
  seed = 1,
  rows = 100
)
y <- zoo_simulate(
  seed = 2,
  rows = 150
)
#x and y have different numbers of rows
#equal lengths are not a requirement
nrow(x) == nrow(y)
#> [1] FALSE
#compute importance
df <- importance_dtw_cpp(
  x = x,
  y = y,
  distance = "euclidean"
)
df
#>   variable     psi psi_only_with psi_without psi_difference    importance
#> 1        a 6.90895      6.968817    6.969328  -0.0005109117  -0.007394926
#> 2        b 6.90895      7.283144    6.930810   0.3523345221   5.099682515
#> 3        c 6.90895      7.214624    6.815044   0.3995803049   5.783516989
#> 4        d 6.90895      6.758557    6.704001   0.0545567459   0.789653200
#> 5        e 6.90895      6.036200    7.133427  -1.0972265521 -15.881234203
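One way to read this output is to rank the variables by their importance score. The sketch below rebuilds the two relevant columns by hand from the printed result so that it runs without the package. The object name scores and the effect column are illustrative additions, not something the function returns.

```r
# Columns rebuilt by hand from the printed output above
scores <- data.frame(
  variable = c("a", "b", "c", "d", "e"),
  importance = c(-0.007394926, 5.099682515, 5.783516989,
                 0.789653200, -15.881234203)
)

# sort from strongest contribution to dissimilarity
# to strongest contribution to similarity
scores <- scores[order(scores$importance, decreasing = TRUE), ]

# label the direction of each contribution from the sign of the score
scores$effect <- ifelse(scores$importance > 0, "dissimilarity", "similarity")

scores
```

With these values, c and b contribute most to the dissimilarity between x and y, while e is the variable that most strongly makes the two series similar.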