Computes the sum of distances between consecutive samples in a multivariate time-series. Required to compute the measure of dissimilarity
psi (Birks and Gordon 1985). Distances can be computed through the methods "manhattan", "euclidean", "chi", and "hellinger", and are implemented in the function
autoSum(
  sequences = NULL,
  least.cost.path = NULL,
  time.column = NULL,
  grouping.column = NULL,
  exclude.columns = NULL,
  method = "manhattan",
  parallel.execution = TRUE
)
sequences: dataframe with one or several multivariate time-series identified by a grouping column.
time.column: character string, name of the column with time/depth/rank data. The data in this column is not modified.
grouping.column: character string, name of the column in
exclude.columns: character string or character vector with column names in
method: character string naming a distance metric. Valid entries are: "manhattan", "euclidean", "chi", and "hellinger". Invalid entries will throw an error.
A list with slots named according to grouping.column if there are several sequences in sequences, or a number if there is only one sequence.
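As a rough illustration of the quantity involved, the sketch below sums the Manhattan distances between consecutive rows of a single numeric matrix. The helper consecutive_distance_sum and the toy data are hypothetical and not part of the package, and the sketch ignores least.cost.path and the grouping logic, so it should be read as an outline of the idea rather than as what autoSum does internally.

```r
# Hypothetical helper, not part of the package: sums the Manhattan distances
# between each pair of consecutive rows (samples) of a numeric matrix.
consecutive_distance_sum <- function(m) {
  m <- as.matrix(m)
  # A sequence with fewer than two samples has no consecutive pairs
  if (nrow(m) < 2) {
    return(0)
  }
  # Differences between sample i + 1 and sample i, for every i
  diffs <- m[-1, , drop = FALSE] - m[-nrow(m), , drop = FALSE]
  # Manhattan distance per consecutive pair, then the total
  sum(rowSums(abs(diffs)))
}

# Toy sequence (made-up values): three samples (rows), two variables (columns)
toy <- matrix(c(1, 2, 4,
                0, 3, 3), ncol = 2)
toy_sum <- consecutive_distance_sum(toy)
```

For a single sequence the result is one number, which matches the shape of the return value described above.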
Distances are computed as:
manhattan: d <- sum(abs(x - y))
euclidean: d <- sqrt(sum((x - y)^2))
chi:
xy <- x + y
y. <- y / sum(y)
x. <- x / sum(x)
d <- sqrt(sum(((x. - y.)^2) / (xy / sum(xy))))
hellinger: d <- sqrt(1/2 * sum(sqrt(x) - sqrt(y))^2)
Note that zeroes are replaced by 0.00001 when method equals "chi" or "hellinger".
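The expressions above can be gathered into one function, which makes it easier to see how the method argument selects a formula. This is a minimal sketch: the name pairwise_distance is hypothetical, and applying the zero replacement before the formula is an assumption based on the note. The Hellinger expression is copied as listed, with the square applied to the summed difference.

```r
# Hypothetical helper mirroring the expressions listed above.
# x and y are numeric vectors of equal length (two samples).
pairwise_distance <- function(x, y, method = "manhattan") {
  # Invalid method names raise an error
  method <- match.arg(method, c("manhattan", "euclidean", "chi", "hellinger"))
  if (method %in% c("chi", "hellinger")) {
    # Assumed placement: replace zeroes by 0.00001 before computing
    x[x == 0] <- 0.00001
    y[y == 0] <- 0.00001
  }
  switch(
    method,
    manhattan = sum(abs(x - y)),
    euclidean = sqrt(sum((x - y)^2)),
    chi = {
      xy <- x + y
      y. <- y / sum(y)
      x. <- x / sum(x)
      sqrt(sum(((x. - y.)^2) / (xy / sum(xy))))
    },
    # Written exactly as listed: the square applies to the summed difference
    hellinger = sqrt(1/2 * sum(sqrt(x) - sqrt(y))^2)
  )
}

# Made-up samples for illustration
d_manhattan <- pairwise_distance(c(1, 2, 3), c(2, 2, 5), method = "manhattan")
d_euclidean <- pairwise_distance(c(1, 2, 3), c(2, 2, 5), method = "euclidean")
```

Using match.arg keeps the set of accepted method names in one place and stops with an error for anything else.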
#loading data
data(sequenceA)
data(sequenceB)

#preparing datasets
AB.sequences <- prepareSequences(
  sequence.A = sequenceA,
  sequence.A.name = "A",
  sequence.B = sequenceB,
  sequence.B.name = "B",
  merge.mode = "complete",
  if.empty.cases = "zero",
  transformation = "hellinger"
)

#computing distance matrix
AB.distance.matrix <- distanceMatrix(
  sequences = AB.sequences,
  grouping.column = "id",
  method = "manhattan",
  parallel.execution = FALSE
)

#computing least cost matrix
AB.least.cost.matrix <- leastCostMatrix(
  distance.matrix = AB.distance.matrix,
  diagonal = FALSE,
  parallel.execution = FALSE
)

AB.least.cost.path <- leastCostPath(
  distance.matrix = AB.distance.matrix,
  least.cost.matrix = AB.least.cost.matrix,
  parallel.execution = FALSE
)

#autosum
AB.autosum <- autoSum(
  sequences = AB.sequences,
  least.cost.path = AB.least.cost.path,
  grouping.column = "id",
  parallel.execution = FALSE
)

AB.autosum
#> $`A|B`
#> [1] 46.86205
#>