Version 2.0.3
Fixed momentum_stats(): stats::aggregate(x = df, by = importance ~ variable, ...) used a formula as the by argument to aggregate.data.frame, which is an invalid interface. Changed to the formula interface: stats::aggregate(importance ~ variable, data = df, ...). Also added na.rm = TRUE to the q1 and q3 summary functions, and added an up-front warning that names the count of excluded NA importance values before they are filtered from the summary computation.
Fixed psi_null_dtw_cpp() in src/psi.cpp: the bandwidth argument was forwarded to all permuted cost_path_cpp() calls but was silently omitted from the initial (observed) cost_path_cpp() call, causing psi_null[0] (the observed psi in the null distribution) to be computed without the Sakoe-Chiba constraint when bandwidth < 1. The initial call now passes bandwidth consistently.
Fixed zoo_aggregate(): attributes(x)$name used R’s $ partial matching and could return column names instead of NULL for zoo objects lacking the custom "name" attribute. Changed to attr(x, "name", exact = TRUE) for consistency with the fix applied to zoo_name_get().
Removed unused dead assignment x_name <- attributes(x)$name from zoo_name_set() (the computed value was never referenced).
Fixed distantia_aggregate() aggregate call: stats::aggregate(x = df, by = psi ~ x + y, ...) used a formula as the by argument to aggregate.data.frame, which is an invalid interface and would error when multiple parameter combinations were present. Changed to the formula interface: stats::aggregate(psi ~ x + y, data = df, ...).
Fixed distantia_stats(): q1 and q3 summary functions called stats::quantile() without na.rm = TRUE, causing an error when any psi value was NA. Added na.rm = TRUE to both functions. Added an up-front warning that names the count of excluded NA psi values before they are filtered from the summary computation.
Fixed distantia_spatial(): when no column in the sf argument contained all time series names from df, the function crashed with a cryptic “argument is of length zero” error. Now emits an informative error message. Also made column selection deterministic by always using the first matching column when multiple columns satisfy the name-matching criterion.
Fixed zoo_name_get(): attributes(x)$name used R’s partial matching via the $ operator and returned names(x) (e.g., date strings from the zoo index) instead of NULL for zoo objects lacking the custom "name" attribute. Changed to attr(x, which = "name", exact = TRUE) to suppress partial matching.
Fixed zoo_vector_to_matrix(): the "name" attribute of the input zoo was not preserved in the output. The call zoo_name_set(x = y, name = y) passed the zoo object itself instead of the extracted character name x_name; the guard if(!is.character(name)) in zoo_name_set() silently returned y unchanged, dropping the name. Fixed by changing name = y to name = x_name, and by replacing the internal zoo_name_get() call with attr(x, "name", exact = TRUE) to avoid the partial-matching issue.
Fixed uninformative crash in tsl_initialize() when x is a list with fewer than 2 time series. A non-list code path in utils_prepare_time() was reached before the structural guard, producing the cryptic error 'from' must be a finite number. Now an explicit early check fires with: argument 'x' must have at least 2 time series.
Fixed tsl_diagnose() calling utils_check_args_tsl(min_length = 1), which allowed single-element TSLs to pass validation. Changed to min_length = 2 to match the documented invariant (≥2 members required).
Fixed distantia_spatial() silently returning an invalid (zero-length) linestring when two time series share identical spatial coordinates. Now emits an informative warning naming the offending pair(s).
Fixed divide-by-zero in utils_rescale_vector() when all values of x are identical (old_min == old_max). Previously returned NaN; now returns a vector of new_min values.
Fixed crash in utils_cluster_silhouette() when all items belong to a single cluster (k = 1). Previously produced empty matrices and errors; now returns NA silhouette widths (data frame) or NA_real_ (when mean = TRUE).
Fixed NA breaks in utils_color_breaks() when n = 1. Previously a[2] was NA, corrupting all breaks; now returns a length-2 vector spanning the data range ± 0.5.
Fixed missing row-count validation in psi_distance_lock_step(): unequal-length series now produce an informative error before reaching C++.
Fixed distantia_stats() aggregate call: stats::aggregate(x = df, by = psi ~ name, ...) used a formula as the by argument, which is not the valid interface for aggregate.data.frame and would error. Changed to the formula interface: stats::aggregate(psi ~ name, data = df, ...).
distantia_dtw() now accepts a bandwidth argument (Sakoe-Chiba constraint, default 1 = unconstrained) forwarded to psi_dtw_cpp(). The hardcoded diagonal = TRUE sign convention is confirmed correct: psi = 0 for identical series.
distantia_ls(): added missing distance and lock_step columns to the return data frame so that the output matches distantia(lock_step = TRUE); sign convention (+1 for lock-step via psi_equation_cpp(..., diagonal = TRUE)) confirmed correct and covered by a new test.
Fixed crash in distantia() when repetitions = NULL was passed explicitly (was coerced to logical(0) in the if guard). Now treated as repetitions = 0.
distantia_time_delay(): removed spurious [1] subscript on q3 assignment (line 223) to match all other stat assignments in the same loop (harmless but inconsistent).
Fixed silent no-op renames in distantia_spatial() (lines 259 & 268) where colnames(df)[colnames(df) == "name"] targeted the wrong column. The merge-produced id column was never renamed to id_x/id_y; R’s auto-suffix fallback produced id.x/id.y instead. Corrected target to "id" and updated all downstream references from dot-suffixed to underscore-suffixed names.
psi_equation() now returns NA instead of Inf/NaN when the auto-sum denominator is zero (flat/constant time series).
Fixed bug in src/distance_methods.cpp where bray_curtis and sorensen distance methods used the wrong 3-character abbreviation prefix ("hel" instead of "bra" and "sor", respectively), causing an error when these distances were selected via their abbreviation inside C++.
Fixed ambiguous usage of argument seed in zoo_permute() as global/local variable within a %dofuture% loop. Thanks to @henrikbengtsson@mastodon.social for this patch!
Fixed issue in distantia_model_frame(), where scale = TRUE was not applied to composite_predictors or geographic_distance.
Renamed argument two_way in distantia_time_delay() to directional and improved the documentation.
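Several of the 2.0.3 fixes above come down to two base R behaviours: stats::aggregate() does not accept a formula as its by argument, and both the $ operator and attr() match names partially unless told otherwise. The sketch below reproduces both with toy data. The data frame and the named vector are made up for illustration and none of this is code from the package.

```r
# Illustration only: toy data, not code from the package.

# 1) Formula passed as `by` versus the formula interface.
df <- data.frame(
  variable = c("a", "a", "b", "b"),
  importance = c(1, 3, 10, NA)
)

# With x = df the call dispatches to aggregate.data.frame(),
# which requires `by` to be a list, so a formula fails.
bad <- tryCatch(
  stats::aggregate(x = df, by = importance ~ variable, FUN = mean),
  error = function(e) e
)
inherits(bad, "error")

# Formula interface: rows with NA in `importance` are dropped by the
# default na.action (na.omit) before FUN is applied.
good <- stats::aggregate(importance ~ variable, data = df, FUN = mean)
good

# 2) Partial matching when reading a custom "name" attribute.
x <- c(a = 1, b = 2)  # has "names" but no "name" attribute

partial <- attributes(x)$name           # `$` partially matches "names"
exact <- attr(x, "name", exact = TRUE)  # exact lookup finds nothing

partial
is.null(exact)

# Once the custom attribute exists, the exact lookup returns it.
attr(x, "name") <- "series_1"
attr(x, "name", exact = TRUE)
```

Note that attr(x, "name") without exact = TRUE would also fall back to partial matching, which is why the fixes spell out the exact argument.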
Version 2.0.2
CRAN release: 2025-02-01
Fixed error (r-devel only) in test file tests/testthat/test-utils_new_time.R
Function zoo_plot() now has the argument guide_position to modify the legend position.
Version 2.0.1
CRAN release: 2025-01-23
Fixed bug in function cost_matrix_diagonal_weighted_cpp() where the additional weight of the diagonal movement was not being correctly applied. This change will result in slightly different psi values in distantia(), distantia_dtw(), and distantia_dtw_plot() when diagonal = TRUE (default).
Fixed bug in function cost_path_cpp, which still produced diagonal cost matrices when diagonal = FALSE because weighted = TRUE turned diagonal to TRUE. Now weighted is set to FALSE when diagonal = FALSE. This resulted in negative scores for orthogonal least-cost paths.
All C++ functions returning values of type double to R functions now round their output to the 8th decimal. This should mitigate discrepancies between R and C++ functions due to differences in how these systems round floating point numbers.
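The rounding change above is easier to see with a small floating point example. The sketch below is written in R for illustration only; the package applies the rounding on the C++ side, and round_8() is a hypothetical helper that does not exist in the package.

```r
# Illustration in R only; the package rounds inside its C++ functions.
a <- 0.1 + 0.2
b <- 0.3

# The two doubles differ in their last bits, so strict equality fails.
a == b

# Rounding both values to the 8th decimal removes the discrepancy.
round(a, 8) == round(b, 8)

# Hypothetical helper mirroring the "round to the 8th decimal" rule.
round_8 <- function(x) round(x, digits = 8)
identical(round_8(a), round_8(b))
```

Rounding trades a little precision for reproducibility, which is usually acceptable for dissimilarity scores compared across implementations.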
Version 2.0.0
CRAN release: 2025-01-08
This new version involves a massive rewrite that will break any previous code based on this package. To install the previous version (1.0.2):
# install from CRAN archive
remotes::install_version(
  package = "distantia",
  version = "1.0.2"
)
# install from archive branch in GitHub
remotes::install_github(
  repo = "https://github.com/BlasBenito/distantia",
  ref = "v1.0.2"
)
Version 2.0.0 is a complete package rewrite from the ground up:
All core functions have been rewritten in C++ for increased speed and memory efficiency, and proper R wrappers for these functions are provided.
All functions and their arguments follow more modern naming conventions and have simplified interfaces to improve the user experience.
Most time series operations use the zoo library underneath, ensuring data consistency, computational speed, and memory efficiency.
Lists of zoo objects, named “time series lists” (“tsl” for short) throughout the package documentation, are used to organize time series data.
A complete toolset to manage time series lists is provided. All functions belonging to this toolset are named using the prefix tsl_...(). There are tools to generate, aggregate, resample, transform, plot, map, and analyze univariate or multivariate regular or irregular time series.
Most functions taking time series lists as inputs are parallelized using the future package, and progress bars for parallelized operations are available as well via the progressr package.
New example datasets from different disciplines and functions to generate simulated time series are shipped with the package to improve the learning experience.
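To make the "time series list" idea concrete, here is a minimal sketch that builds a named list of zoo objects by hand. It assumes the zoo package is installed and uses made-up data. It does not call any of the package's tsl_...() functions, which may add validation and attributes that this sketch omits.

```r
library(zoo)

# Hypothetical toy data: two multivariate series of different lengths.
dates_a <- as.Date("2020-01-01") + 0:4
dates_b <- as.Date("2020-01-01") + 0:2

series_a <- zoo::zoo(
  x = cbind(
    temperature = c(10, 11, 12, 11, 10),
    rainfall = c(0, 2, 5, 1, 0)
  ),
  order.by = dates_a
)

series_b <- zoo::zoo(
  x = cbind(
    temperature = c(8, 9, 10),
    rainfall = c(1, 0, 3)
  ),
  order.by = dates_b
)

# A time series list in the sense described above:
# a named list whose elements are zoo objects.
tsl <- list(a = series_a, b = series_b)

# Basic checks: every element is a zoo object; row counts per series.
vapply(tsl, zoo::is.zoo, logical(1))
vapply(tsl, NROW, integer(1))
```

Keeping each series as its own zoo object lets series of different lengths and sampling times sit side by side in one container.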
Version 1.0.3
Fixed bug in Hellinger distances and reworked the distance() function to make it slightly faster.
Version 1.0.2
CRAN release: 2019-10-28
Fixed an issue with the parallelization of tasks on the Windows platform. Now all parallelized functions modify their cluster settings depending on the user’s platform.
Version 1.0.1
CRAN release: 2019-08-06
Fixed a bug in the function workflowImportance. The argument ‘exclude.columns’ was being ignored.
Fixed the documentation of the functions workflowImportance and workflowSlotting. Their outputs were not well documented.
Fixed an error in workflowTransfer.
Changed how psi is computed. It is now more faithful to the original formulation, and handles very similar sequences better.
Fixed the function workflowPsi to add +1 to the least cost produced by the options paired.samples = TRUE and diagonal = TRUE.
Added the function workflowPsiHP, a High Performance version of workflowPsi. It has fewer options, but it is much faster and has a much lower memory footprint.
