Computes the sum of distances between consecutive samples in a multivariate time-series. Required to compute the measure of dissimilarity psi (Birks and Gordon 1985). Distances can be computed through the methods "manhattan", "euclidean", "chi", and "hellinger", and are implemented in the function distance.

autoSum(
  sequences = NULL,
  least.cost.path = NULL,
  time.column = NULL,
  grouping.column = NULL,
  exclude.columns = NULL,
  method = "manhattan",
  parallel.execution = TRUE
  )

Arguments

sequences

dataframe with one or several multivariate time-series identified by a grouping column.

least.cost.path

a list usually resulting from either leastCostPath or leastCostPathNoBlocks.

time.column

character string, name of the column with time/depth/rank data. The data in this column is not modified.

grouping.column

character string, name of the column in sequences used to identify separate sequences within the file. This argument is ignored if sequence.A and sequence.B are provided.

exclude.columns

character string or character vector with column names in sequences, or in sequence.A and sequence.B, to be excluded from the analysis.

method

character string naming a distance metric. Valid entries are: "manhattan", "euclidean", "chi", and "hellinger". Invalid entries will throw an error.

parallel.execution

boolean, if TRUE (default), execution is parallelized; if FALSE, it runs serially.

Value

A list with slots named according to grouping.column if there are several sequences in sequences, or a number if there is only one sequence.

Details

Distances are computed as:

  • manhattan: d <- sum(abs(x - y))

  • euclidean: d <- sqrt(sum((x - y)^2))

  • chi:
      xy <- x + y
      y. <- y / sum(y)
      x. <- x / sum(x)
      d <- sqrt(sum(((x. - y.)^2) / (xy / sum(xy))))

  • hellinger: d <- sqrt(1/2 * sum(sqrt(x) - sqrt(y))^2)
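The idea of summing distances between consecutive samples can be illustrated with a minimal sketch. The helper auto_sum_sketch below is hypothetical and not part of the package: it assumes a single sequence stored as a numeric matrix with one sample per row, covers only the "manhattan" and "euclidean" formulas listed above, and ignores least.cost.path, time.column, and grouping.column.

```r
# Hypothetical helper (not the package's implementation): sums the distances
# between each pair of consecutive rows of a numeric matrix.
auto_sum_sketch <- function(m, method = "manhattan") {
  m <- as.matrix(m)

  # distance between two samples, using the formulas listed in Details
  pair_distance <- function(x, y) {
    switch(
      method,
      manhattan = sum(abs(x - y)),
      euclidean = sqrt(sum((x - y)^2)),
      stop("this sketch only supports 'manhattan' and 'euclidean'")
    )
  }

  # a sequence with fewer than two samples has no consecutive pairs
  if (nrow(m) < 2) return(0)

  total <- 0
  for (i in seq_len(nrow(m) - 1)) {
    total <- total + pair_distance(m[i, ], m[i + 1, ])
  }
  total
}

# toy sequence with three samples (rows) and two variables (columns)
toy <- matrix(c(1, 2,
                4, 6,
                4, 3), ncol = 2, byrow = TRUE)

auto_sum_sketch(toy, method = "manhattan")  # 7 + 3 = 10
auto_sum_sketch(toy, method = "euclidean")  # 5 + 3 = 8
```

The loop keeps each consecutive pair explicit; the same result could be obtained in vectorized form from the row-wise differences given by diff(m).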

Note that zeroes are replaced by 0.00001 when method equals "chi" or "hellinger".
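The reason for the replacement can be seen in the "chi" formula: if a variable is zero in both samples, the denominator xy / sum(xy) is zero for that variable and the result is NaN. The sketch below is a hypothetical helper, not the package's code; it assumes the replacement is applied to both samples before the formula is evaluated, and the argument name pseudo.zero is invented for illustration.

```r
# Hypothetical helper (not the package's implementation): chi distance
# between two samples, with zeroes replaced by a small pseudo-zero first.
chi_sketch <- function(x, y, pseudo.zero = 0.00001) {
  # assumption: zeroes are replaced in both samples before computing
  x[x == 0] <- pseudo.zero
  y[y == 0] <- pseudo.zero

  xy <- x + y
  y. <- y / sum(y)
  x. <- x / sum(x)
  sqrt(sum(((x. - y.)^2) / (xy / sum(xy))))
}

# the second variable is zero in both samples; without the replacement
# the formula would evaluate 0 / 0 for that variable
chi_sketch(x = c(1, 0, 3), y = c(2, 0, 1))
```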

Examples

#loading data
data(sequenceA)
data(sequenceB)

#preparing datasets
AB.sequences <- prepareSequences(
  sequence.A = sequenceA,
  sequence.A.name = "A",
  sequence.B = sequenceB,
  sequence.B.name = "B",
  merge.mode = "complete",
  if.empty.cases = "zero",
  transformation = "hellinger"
  )

#computing distance matrix
AB.distance.matrix <- distanceMatrix(
  sequences = AB.sequences,
  grouping.column = "id",
  method = "manhattan",
  parallel.execution = FALSE
  )

#computing least cost matrix
AB.least.cost.matrix <- leastCostMatrix(
  distance.matrix = AB.distance.matrix,
  diagonal = FALSE,
  parallel.execution = FALSE
  )

#computing least cost path
AB.least.cost.path <- leastCostPath(
  distance.matrix = AB.distance.matrix,
  least.cost.matrix = AB.least.cost.matrix,
  parallel.execution = FALSE
  )

#autosum
AB.autosum <- autoSum(
  sequences = AB.sequences,
  least.cost.path = AB.least.cost.path,
  grouping.column = "id",
  parallel.execution = FALSE
  )

AB.autosum
#> $`A|B`
#> [1] 46.86205
#>