This function computes an approximation to the time delay between pairs of time series as the time difference between the observations connected by the dynamic time warping path.
Given a pair of time series x and y, and the times of their samples as matched by the dynamic time warping path, time(x) and time(y), the time delay is computed as follows when the argument directional is TRUE:
- Time delay from x to y: time(y) - time(x).
- Time delay from y to x: time(x) - time(y).
In that case, two rows per pair of time series are returned. Otherwise, the time delay is computed as abs(time(y) - time(x)), and only one row per pair of time series is returned.
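The two conventions above can be illustrated with a minimal base R sketch. It does not call the package: the dates and the warping path (`path`, a data frame of matched sample indices) are made up for illustration, and the package's internal representation of the path may differ.

```r
# Hypothetical sample times for two short series x and y
time_x <- as.Date("2020-01-01") + 0:5
time_y <- as.Date("2020-01-01") + 0:5

# Hypothetical warping path: each row pairs a sample of x with a sample of y
path <- data.frame(
  x = c(1, 2, 3, 3, 4, 5, 6),
  y = c(1, 1, 2, 3, 4, 5, 6)
)

# Directional delays (in days) for each pair connected by the path
delay_x_to_y <- as.numeric(time_y[path$y] - time_x[path$x])
delay_y_to_x <- as.numeric(time_x[path$x] - time_y[path$y])

# Non-directional delay: absolute magnitude only
delay_abs <- abs(delay_x_to_y)
```

Note that the two directional vectors are mirror images of each other, which is why the directional output has two rows per pair with opposite signs.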
If the time series have more than 30 observations, 5% of cases are omitted at each extreme of the warping path to avoid overestimating time delays due to early misalignments.
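A rough sketch of this trimming rule in base R follows. The helper `trim_path_extremes()` is hypothetical and not part of the package. It assumes the number of dropped cases is rounded down and that the threshold is checked against a single observation count, neither of which is stated above.

```r
# Hypothetical helper, not part of the package: drop a share of cases at each
# end of a vector of delays ordered along the warping path.
trim_path_extremes <- function(delay, n_obs, min_obs = 30, trim = 0.05) {
  # Short series are left untouched
  if (n_obs <= min_obs) {
    return(delay)
  }
  # Number of cases to drop at each extreme (assumption: rounded down)
  n_drop <- floor(length(delay) * trim)
  if (n_drop == 0) {
    return(delay)
  }
  delay[(n_drop + 1):(length(delay) - n_drop)]
}

# A path with 100 cases loses 5 cases at each end
trimmed <- trim_path_extremes(delay = seq_len(100), n_obs = 100)
range(trimmed)

# A series with 20 observations is returned unchanged
untrimmed <- trim_path_extremes(delay = seq_len(20), n_obs = 20)
```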
The function returns a data frame with the names of the time series in columns x and y, and summary statistics of the time delay. The mode and median are generally the most accurate time-delay metrics.
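To see why the mode and median tend to be more robust than the mean, consider the base R sketch below. The delay values are invented and `stat_mode()` is a hypothetical helper; the package may compute the modal value differently.

```r
# Hypothetical helper: most frequent value of a vector (first one on ties)
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# Made-up delays with one extreme value caused by a misalignment
delay <- c(-2, -1, -1, -1, 0, 3, 40)

# Summary statistics named after the columns of the returned data frame
delay_stats <- data.frame(
  min = min(delay),
  q1 = unname(quantile(delay, 0.25)),
  median = median(delay),
  modal = stat_mode(delay),
  mean = mean(delay),
  q3 = unname(quantile(delay, 0.75)),
  max = max(delay)
)

delay_stats
```

Here the single extreme delay pulls the mean well above the bulk of the values, while the median and the mode stay at the most common delay.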
This function requires scaled and detrended time series. It may yield nonsensical results when warping paths are degenerate. Plotting dubious results with distantia_dtw_plot() is a good way to identify these cases.
Usage
distantia_time_delay(
tsl = NULL,
distance = "euclidean",
bandwidth = 1,
directional = FALSE
)
Arguments
- tsl
(required, time series list) list of zoo time series. Default: NULL
- distance
(optional, character vector) name or abbreviation of the distance method. Valid values are in the columns "names" and "abbreviation" of the dataset distances. Default: "euclidean".
- bandwidth
(optional, numeric) Proportion of space at each side of the cost matrix diagonal (aka Sakoe-Chiba band) defining a valid region for dynamic time warping, used to control the flexibility of the warping path. This method prevents degenerate alignments due to differences in magnitude between time series when the data is not properly scaled. If 1 (default), DTW is unconstrained. If 0, DTW is fully constrained and the warping path follows the matrix diagonal. Recommended values may vary depending on the nature of the data. Ignored if lock_step = TRUE. Default: 1.
- directional
(optional, logical) If TRUE, a directional time delay is computed as x to y and y to x, resulting in two rows per pair of time series. Otherwise, the absolute magnitude of the delay between x and y is returned as a single row per pair. Default: FALSE
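The effect of the bandwidth argument can be pictured as a mask over the cost matrix. The base R sketch below is only an illustration: `band_mask()` is a hypothetical helper, and measuring the band in relative positions along each series is an assumption, not necessarily how the package builds its constraint.

```r
# Hypothetical helper, not part of the package: logical mask of the
# cost-matrix cells allowed by a band of relative width `bandwidth`
# around the diagonal.
band_mask <- function(n_x, n_y, bandwidth = 1) {
  # Relative position (0 to 1) of each sample within its series
  pos_x <- (seq_len(n_x) - 1) / max(n_x - 1, 1)
  pos_y <- (seq_len(n_y) - 1) / max(n_y - 1, 1)
  # Keep cells whose relative distance to the diagonal is within the band
  abs(outer(pos_x, pos_y, "-")) <= bandwidth
}

# bandwidth = 1: every cell is valid, so the warping path is unconstrained
mask_full <- band_mask(5, 5, bandwidth = 1)

# bandwidth = 0: only the diagonal remains, so the path follows it
mask_diag <- band_mask(5, 5, bandwidth = 0)

# Intermediate values allow limited departures from the diagonal
mask_half <- band_mask(5, 5, bandwidth = 0.5)
```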
See also
Other distantia_support:
distantia_aggregate(),
distantia_boxplot(),
distantia_cluster_hclust(),
distantia_cluster_kmeans(),
distantia_matrix(),
distantia_model_frame(),
distantia_spatial(),
distantia_stats(),
utils_block_size(),
utils_cluster_hclust_optimizer(),
utils_cluster_kmeans_optimizer(),
utils_cluster_silhouette()
Examples
#load two long-term temperature time series
#local scaling to focus on shape rather than values
#polynomial detrending to make them stationary
tsl <- tsl_init(
x = cities_temperature[
cities_temperature$name %in% c("London", "Kinshasa"),
],
name = "name",
time = "time"
) |>
tsl_transform(
f = f_scale_local
) |>
tsl_transform(
f = f_detrend_poly,
degree = 35 #data years
)
if(interactive()){
tsl_plot(
tsl = tsl,
guide = FALSE
)
}
#compute shifts
df_shift <- distantia_time_delay(
tsl = tsl,
directional = TRUE
)
df_shift
#> x y distance units min q1 median modal mean q3
#> 1 Kinshasa London euclidean days -304 -215.00 -184 -212 -115.7031 89.75
#> 2 London Kinshasa euclidean days -243 -89.75 184 212 115.7031 215.00
#> max
#> 1 243
#> 2 304
