skbio.stats.distance.DissimilarityMatrix
- class skbio.stats.distance.DissimilarityMatrix(data, ids=None, validate=True)
Store dissimilarities between objects.
A DissimilarityMatrix instance stores a square, hollow, two-dimensional matrix of dissimilarities between objects. Objects could be, for example, samples or DNA sequences. A sequence of IDs accompanies the dissimilarities.
Methods are provided to load dissimilarity matrices from disk and save them to disk, and to perform common operations such as extracting dissimilarities by object ID.
- Parameters:
- data : array_like or DissimilarityMatrix
Square, hollow, two-dimensional numpy.ndarray of dissimilarities (floats), or a structure that can be converted to a numpy.ndarray using numpy.asarray, or a one-dimensional vector of dissimilarities (floats), as defined by scipy.spatial.distance.squareform. Can instead be a DissimilarityMatrix (or subclass) instance, in which case the instance’s data will be used. Data will be converted to a float dtype if necessary. A copy will not be made if already a numpy.ndarray with a float dtype.
- ids : sequence of str, optional
Sequence of strings to be used as object IDs. Must match the number of rows/cols in data. If None (the default), IDs will be monotonically increasing integers cast as strings, with numbering starting from zero, e.g., ('0', '1', '2', '3', ...).
- validate : bool, optional
If validate is True (the default) and data is not a DissimilarityMatrix object, the input data will be validated.
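To make the terms in the parameter descriptions concrete, here is a minimal plain-Python sketch of what "square" and "hollow" mean and of the default IDs described for ids=None. The helper names default_ids and is_square_and_hollow are hypothetical and are not part of scikit-bio; this is not the library's validation code, and treating an empty matrix as invalid is an assumption of the sketch.

```python
def default_ids(n):
    """Return IDs '0', '1', ..., n-1 as strings, as described for ids=None."""
    return tuple(str(i) for i in range(n))


def is_square_and_hollow(matrix):
    """Return True if matrix is n x n and every diagonal entry is zero."""
    n = len(matrix)
    # Assumption of this sketch: an empty matrix is rejected.
    if n == 0 or any(len(row) != n for row in matrix):
        return False
    return all(matrix[i][i] == 0 for i in range(n))


# Deliberately asymmetric: the value at (0, 1) differs from the value at (1, 0).
data = [
    [0.0, 0.5, 1.0],
    [0.25, 0.0, 0.75],
    [1.0, 0.75, 0.0],
]

print(is_square_and_hollow(data))  # True
print(default_ids(len(data)))      # ('0', '1', '2')
```

The example matrix is asymmetric on purpose: squareness and a zero diagonal are the only shape requirements stated above.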
See also
DistanceMatrix
scipy.spatial.distance.squareform
Notes
The dissimilarities are stored in redundant (square-form) format [1].
The data are not checked for symmetry, nor guaranteed/assumed to be symmetric.
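The relationship between a one-dimensional (condensed) vector and the redundant square form can be sketched in plain Python. The function condensed_to_square below is a hypothetical helper written for illustration; it follows the upper-triangle, row-by-row ordering used by scipy.spatial.distance.squareform but is not that function, nor scikit-bio's own code.

```python
import math


def condensed_to_square(vector):
    """Expand a condensed vector of n*(n-1)/2 values into an n x n matrix."""
    m = len(vector)
    # Solve m = n*(n-1)/2 for n, then confirm the length is actually valid.
    n = (1 + math.isqrt(1 + 8 * m)) // 2
    if n * (n - 1) // 2 != m:
        raise ValueError("vector length is not n*(n-1)/2 for any integer n")
    square = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Each value is stored twice, which is why the form is "redundant".
            square[i][j] = square[j][i] = float(vector[k])
            k += 1
    return square


print(condensed_to_square([1, 2, 3]))
# [[0.0, 1.0, 2.0], [1.0, 0.0, 3.0], [2.0, 3.0, 0.0]]
```

A matrix built this way is symmetric by construction. A square array passed directly carries no such guarantee, in line with the note above that symmetry is neither checked nor assumed.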
References
Attributes
- T: Transpose of the dissimilarity matrix.
- data: Array of dissimilarities.
- default_write_format
- dtype: Data type of the dissimilarities.
- ids: Tuple of object IDs.
- png: Get figure data in PNG format.
- shape: Two-element tuple containing the dissimilarity matrix dimensions.
- size: Total number of elements in the dissimilarity matrix.
- svg: Get figure data in SVG format.
Built-ins
- __contains__(lookup_id): Check if the specified ID is in the dissimilarity matrix.
- __eq__(other): Compare this dissimilarity matrix to another for equality.
- __ge__(value, /): Return self>=value.
- __getitem__(index): Slice into dissimilarity data by object ID or numpy indexing.
- __getstate__(/): Helper for pickle.
- __gt__(value, /): Return self>value.
- __le__(value, /): Return self<=value.
- __lt__(value, /): Return self<value.
- __ne__(other): Determine whether two dissimilarity matrices are not equal.
- __str__(): Return a string representation of the dissimilarity matrix.
Methods
- between(from_, to_[, allow_overlap]): Obtain the distances between the two groups of IDs.
- copy(): Return a deep copy of the dissimilarity matrix.
- filter(ids[, strict]): Filter the dissimilarity matrix by IDs.
- from_iterable(iterable, metric[, key, keys]): Create DissimilarityMatrix from an iterable given a metric.
- index(lookup_id): Return the index of the specified ID.
- plot([cmap, title]): Create a heatmap of the dissimilarity matrix.
- read(file[, format]): Create a new DissimilarityMatrix instance from a file.
- redundant_form(): Return an array of dissimilarities in redundant format.
- rename(mapper[, strict]): Rename IDs in the dissimilarity matrix.
- to_data_frame(): Create a pandas.DataFrame from this DissimilarityMatrix.
- transpose(): Return the transpose of the dissimilarity matrix.
- within(ids): Obtain all the distances among the set of IDs.
- write(file[, format]): Write an instance of DissimilarityMatrix to a file.
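As a rough illustration of ID-based operations such as index and filter, the following plain-Python sketch maps IDs to positions and extracts the matching rows and columns. The helpers index_of and filter_by_ids are hypothetical stand-ins, not scikit-bio functions. The sketch assumes that the result follows the order of the requested IDs and that an unknown ID is an error; here that error is Python's ValueError, which is not necessarily the exception the library raises.

```python
def index_of(ids, lookup_id):
    """Return the position of lookup_id within the tuple of IDs."""
    # tuple.index raises ValueError when the ID is absent.
    return ids.index(lookup_id)


def filter_by_ids(matrix, ids, keep):
    """Return (sub_matrix, sub_ids) restricted to the IDs in keep.

    Assumption of this sketch: rows and columns follow the order of keep.
    """
    positions = [index_of(ids, k) for k in keep]
    sub = [[matrix[i][j] for j in positions] for i in positions]
    return sub, tuple(keep)


ids = ('a', 'b', 'c')
data = [
    [0, 1, 2],
    [3, 0, 4],
    [5, 6, 0],
]

print(index_of(ids, 'b'))                    # 1
print(filter_by_ids(data, ids, ['c', 'a']))  # ([[0, 5], [2, 0]], ('c', 'a'))
```

Selecting the same positions for both rows and columns keeps the result square and hollow.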