`distanceMatrix.Rd`

Computes distance matrices among the samples of two or more multivariate time-series provided in a single dataframe (generally produced by `prepareSequences`), identified by a grouping column (argument `grouping.column`). Distances can be computed with the methods "manhattan", "euclidean", "chi", and "hellinger", which are implemented in the function `distance`. The function uses the packages `parallel`, `foreach`, and `doParallel` to compute distance matrices among different sequences in parallel. It is configured to use all available processors minus one.
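The parallel setup described above can be sketched roughly as follows. This is a minimal illustration of the "all processors minus one" pattern with `parallel`, `foreach`, and `doParallel`, not the function's actual source; the object names `n.cores`, `my.cluster`, and `toy.results` are hypothetical, and the loop body is a stand-in for the real distance computation.

```r
library(foreach)     # provides foreach() and %dopar%
library(doParallel)  # parallel backend for foreach; also loads 'parallel'

# all available cores minus one, never fewer than one worker
# (detectCores() can return NA on some platforms, hence na.rm = TRUE)
n.cores <- max(1, parallel::detectCores() - 1, na.rm = TRUE)

# create the cluster and register it as the foreach backend
my.cluster <- parallel::makeCluster(n.cores)
doParallel::registerDoParallel(my.cluster)

# each iteration is sent to a worker; results are returned as a list
# (toy computation standing in for one distance matrix per iteration)
toy.results <- foreach::foreach(i = 1:4) %dopar% {
  sum(abs(i - (1:3)))
}

# always release the workers when done
parallel::stopCluster(my.cluster)
```

Leaving one processor free keeps the machine responsive while the workers run.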

    distanceMatrix(
      sequences = NULL,
      grouping.column = NULL,
      time.column = NULL,
      exclude.columns = NULL,
      method = "manhattan",
      parallel.execution = TRUE
    )

Argument | Description |
---|---|
sequences | dataframe with multiple sequences identified by a grouping column. Generally the output of |
grouping.column | character string, name of the column in |
time.column | character string, name of the column with time/depth/rank data. The data in this column is not modified. |
exclude.columns | character string or character vector with column names in |
method | character string naming a distance metric. Valid entries are: "manhattan", "euclidean", "chi", and "hellinger". Invalid entries will throw an error. |
parallel.execution | boolean, if |

A list with named slots containing the distance matrices of every possible combination of sequences according to `grouping.column`.
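As a rough base-R illustration of what "every possible combination of sequences" means, the sketch below builds one distance matrix per pair of groups in a toy dataframe. The helper `pairwise.manhattan`, the toy columns `v1` and `v2`, and the `"A|B"` slot-naming scheme are assumptions made for this example; they are not taken from the package source.

```r
# toy dataframe with three sequences identified by the column "id"
toy.sequences <- data.frame(
  id = c("A", "A", "B", "B", "B", "C", "C"),
  v1 = c(1, 2, 3, 4, 5, 6, 7),
  v2 = c(7, 6, 5, 4, 3, 2, 1)
)

# split the numeric columns by the grouping column
groups <- split(toy.sequences[, c("v1", "v2")], toy.sequences$id)

# every unordered pair of group names: A-B, A-C, B-C
pairs <- utils::combn(names(groups), 2, simplify = FALSE)

# hypothetical helper: manhattan distance between every row of a and every row of b
pairwise.manhattan <- function(a, b) {
  a <- as.matrix(a)
  b <- as.matrix(b)
  d <- matrix(NA_real_, nrow = nrow(a), ncol = nrow(b))
  for (i in seq_len(nrow(a))) {
    for (j in seq_len(nrow(b))) {
      d[i, j] <- sum(abs(a[i, ] - b[j, ]))
    }
  }
  d
}

# one matrix per pair, stored in a named list
distance.matrices <- lapply(pairs, function(p) {
  pairwise.manhattan(groups[[p[1]]], groups[[p[2]]])
})
names(distance.matrices) <- vapply(pairs, paste, character(1), collapse = "|")
```

Each matrix has one row per sample of the first sequence and one column per sample of the second, so sequences of different lengths yield rectangular matrices.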

Distances are computed as:

`manhattan`: `d <- sum(abs(x - y))`

`euclidean`: `d <- sqrt(sum((x - y)^2))`

`chi`: `xy <- x + y; y. <- y / sum(y); x. <- x / sum(x); d <- sqrt(sum(((x. - y.)^2) / (xy / sum(xy))))`

`hellinger`: `d <- sqrt(1/2 * sum(sqrt(x) - sqrt(y))^2)`
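The expressions above can be wrapped into small functions to try them on a pair of numeric vectors. This is only a sketch: the function names (`manhattan.distance` and so on) are hypothetical, and each body transcribes the documented expression literally rather than reproducing the package's `distance` function. In particular, the `hellinger` expression is kept exactly as written above, where the square is applied to the sum of differences.

```r
# x and y are numeric vectors of equal length (two samples)
manhattan.distance <- function(x, y) {
  sum(abs(x - y))
}

euclidean.distance <- function(x, y) {
  sqrt(sum((x - y)^2))
}

chi.distance <- function(x, y) {
  xy <- x + y
  y. <- y / sum(y)  # relative abundances of y
  x. <- x / sum(x)  # relative abundances of x
  sqrt(sum(((x. - y.)^2) / (xy / sum(xy))))
}

# transcribed literally from the documented expression
hellinger.distance <- function(x, y) {
  sqrt(1/2 * sum(sqrt(x) - sqrt(y))^2)
}

# two toy samples without zeroes
x <- c(1, 4, 9)
y <- c(4, 1, 4)

manhattan.distance(x, y)
euclidean.distance(x, y)
chi.distance(x, y)
hellinger.distance(x, y)
```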

Note that zeroes are replaced by 0.00001 when `method` equals "chi" or "hellinger".
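One way to see why the replacement matters for "chi": when both samples hold a zero in the same position, the documented expression divides zero by zero. The sketch below is an illustration under that assumption; `chi.sketch` and `replace.zeroes` are hypothetical helpers written for this example, not package functions.

```r
# chi expression as documented, wrapped in a hypothetical helper
chi.sketch <- function(x, y) {
  xy <- x + y
  y. <- y / sum(y)
  x. <- x / sum(x)
  sqrt(sum(((x. - y.)^2) / (xy / sum(xy))))
}

# hypothetical helper: swap exact zeroes for a small pseudo-count
replace.zeroes <- function(v, pseudo.zero = 0.00001) {
  v[v == 0] <- pseudo.zero
  v
}

# both samples are zero in the first position
x <- c(0, 2, 3)
y <- c(0, 1, 4)

# 0 / 0 in the first term propagates NaN through sum() and sqrt()
d.raw <- chi.sketch(x, y)

# after the replacement every term is finite
d.fixed <- chi.sketch(replace.zeroes(x), replace.zeroes(y))
```

Here `d.raw` is `NaN`, while `d.fixed` is a finite number.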

    #loading data
    data(sequenceA)
    data(sequenceB)

    #preparing datasets
    AB.sequences <- prepareSequences(
      sequence.A = sequenceA,
      sequence.A.name = "A",
      sequence.B = sequenceB,
      sequence.B.name = "B",
      merge.mode = "complete",
      if.empty.cases = "zero",
      transformation = "hellinger"
    )

    #computing distance matrix
    AB.distance.matrix <- distanceMatrix(
      sequences = AB.sequences,
      grouping.column = "id",
      method = "manhattan",
      parallel.execution = FALSE
    )

    #plot
    plotMatrix(distance.matrix = AB.distance.matrix)