Multivariate Dependence
Measures of statistical dependence among multiple variables, generalizing pairwise mutual information.
total_correlation(samples, *, base=np.e, discrete=False, estimator='knn')
Compute the total correlation (multi-information) of a multivariate sample.
.. math::

   TC(X_1, \ldots, X_d) = \sum_{i=1}^{d} H(X_i) - H(X_1, \ldots, X_d)
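As a quick consistency check on the definition (a worked special case added for illustration), setting :math:`d = 2` recovers pairwise mutual information, which is the sense in which total correlation generalizes it:

.. math::

   TC(X_1, X_2) = H(X_1) + H(X_2) - H(X_1, X_2) = I(X_1; X_2)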
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `samples` | `ndarray` | Sample array of shape | required |
| `base` | `float` | Logarithm base. Default is e. | `e` |
| `discrete` | `bool` | If True, use discrete estimators. Default is False. | `False` |
| `estimator` | `str` | Estimator for continuous data: | `'knn'` |
Returns:
| Type | Description |
|---|---|
| `float` | Total correlation, non-negative. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If |
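The total correlation formula for `total_correlation` can be checked by hand on small discrete data. The following is a minimal sketch using a plug-in (empirical frequency) entropy estimate. It is not the library's implementation, and the helper names `plugin_entropy` and `plugin_total_correlation`, as well as the `(n_samples, d)` input layout, are assumptions made for this illustration.

```python
import numpy as np


def plugin_entropy(columns, base=np.e):
    """Plug-in entropy of discrete samples; each distinct row is one outcome."""
    columns = np.asarray(columns)
    if columns.ndim == 1:
        columns = columns[:, None]
    # Count how often each distinct row occurs.
    _, counts = np.unique(columns, axis=0, return_counts=True)
    probs = counts / counts.sum()
    return float(-np.sum(probs * np.log(probs)) / np.log(base))


def plugin_total_correlation(samples, base=np.e):
    """Sum of marginal entropies minus the joint entropy.

    Assumption: `samples` has shape (n_samples, d), one column per variable.
    """
    samples = np.asarray(samples)
    marginal_sum = sum(
        plugin_entropy(samples[:, i], base) for i in range(samples.shape[1])
    )
    return marginal_sum - plugin_entropy(samples, base)


# Column 1 copies column 0; column 2 is independent of both.
samples = np.array([
    [0, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
    [1, 1, 1],
])

# Marginals: 1 + 1 + 1 bits; joint: 2 bits; so TC is 1 bit up to rounding.
tc_bits = plugin_total_correlation(samples, base=2)
print(tc_bits)
```

The single shared bit between the first two columns is exactly what the measure reports. Plug-in estimates are biased on small samples, so this sketch is meant for checking the definition rather than for estimation on real data.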
normalized_mutual_information(samples_x, samples_y, *, normalization='geometric', base=np.e, discrete=False)
Compute normalized mutual information between two variables.
.. math::

   \mathrm{NMI}(X, Y) = \frac{I(X; Y)}{\mathrm{norm}(H(X), H(Y))}
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `samples_x` | `ndarray` | Samples of variable X, shape | required |
| `samples_y` | `ndarray` | Samples of variable Y, shape | required |
| `normalization` | `str` or `list of str` | Normalization method: | `'geometric'` |
| `base` | `float` | Logarithm base. Default is e. | `e` |
| `discrete` | `bool` | If True, use discrete estimators. Default is False. | `False` |
Returns:
| Type | Description |
|---|---|
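| `float` or `dict[str, float]` | Normalized mutual information. Returns a float when `normalization` is a single string, and a dict keyed by method name when it is a list of strings. |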
Raises:
| Type | Description |
|---|---|
| `ValueError` | If any requested normalization is unknown. |
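To make the role of the normalizer in the NMI formula concrete, here is a small sketch on discrete data using plug-in entropies. It is an illustration, not the library's code: only `'geometric'` is named in the documentation above, so the other keys in `NORMALIZERS` (common choices in the clustering literature) and the helper names are assumptions.

```python
import numpy as np


def plugin_entropy(labels, base=np.e):
    """Plug-in entropy of discrete samples; each distinct row is one outcome."""
    labels = np.asarray(labels)
    if labels.ndim == 1:
        labels = labels[:, None]
    _, counts = np.unique(labels, axis=0, return_counts=True)
    probs = counts / counts.sum()
    return float(-np.sum(probs * np.log(probs)) / np.log(base))


# Assumption: apart from 'geometric', these keys are illustrative only.
NORMALIZERS = {
    "geometric": lambda hx, hy: np.sqrt(hx * hy),
    "arithmetic": lambda hx, hy: 0.5 * (hx + hy),
    "min": min,
    "max": max,
}


def plugin_nmi(x, y, normalization="geometric", base=np.e):
    """I(X; Y) divided by the chosen combination of H(X) and H(Y).

    Undefined (division by zero) when the normalizer is zero, for example
    when either variable is constant; this sketch does not guard against it.
    """
    if normalization not in NORMALIZERS:
        raise ValueError(f"unknown normalization: {normalization!r}")
    hx = plugin_entropy(x, base)
    hy = plugin_entropy(y, base)
    hxy = plugin_entropy(np.column_stack([x, y]), base)
    mutual_info = hx + hy - hxy
    return float(mutual_info / NORMALIZERS[normalization](hx, hy))


# y refines x: H(X) = 1 bit, H(Y) = 2 bits, I(X; Y) = 1 bit.
x = [0, 0, 1, 1]
y = [0, 1, 2, 3]
scores = {name: plugin_nmi(x, y, normalization=name) for name in NORMALIZERS}
print(scores)
```

In this example the `min` normalizer gives 1 because X is fully determined by Y, while `max` gives 0.5 and the geometric mean gives 1/√2. Because the ratio cancels the logarithm base, the result does not depend on `base`.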
variation_of_information(samples_x, samples_y, *, base=np.e, discrete=False)
Compute the variation of information between two variables.
.. math::

   VI(X, Y) = H(X) + H(Y) - 2\,I(X; Y)
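For discrete variables, VI is a true metric on the space of clusterings (partitions): it is non-negative, symmetric, zero only when the two partitions coincide up to relabeling, and it satisfies the triangle inequality.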
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `samples_x` | `ndarray` | Samples of variable X, shape | required |
| `samples_y` | `ndarray` | Samples of variable Y, shape | required |
| `base` | `float` | Logarithm base. Default is e. | `e` |
| `discrete` | `bool` | If True, use discrete estimators. Default is False. | `False` |
Returns:
| Type | Description |
|---|---|
| `float` | Variation of information, non-negative. |
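The metric behaviour of variation of information on partitions can be checked numerically. The sketch below computes VI from a contingency table of two label vectors using plug-in entropies. It is an assumption-laden illustration rather than the library's implementation, and the names `entropy_from_counts` and `plugin_vi` are hypothetical.

```python
import numpy as np


def entropy_from_counts(counts, base=np.e):
    """Plug-in entropy from a vector of non-negative counts."""
    counts = np.asarray(counts, dtype=float)
    probs = counts[counts > 0] / counts.sum()
    return float(-np.sum(probs * np.log(probs)) / np.log(base))


def plugin_vi(x, y, base=np.e):
    """VI(X, Y) = H(X) + H(Y) - 2 I(X; Y) for two discrete label vectors."""
    # Map arbitrary labels to 0..k-1 so they can index a contingency table.
    _, xi = np.unique(np.asarray(x).ravel(), return_inverse=True)
    _, yi = np.unique(np.asarray(y).ravel(), return_inverse=True)
    table = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(table, (xi, yi), 1)  # accumulate co-occurrence counts

    hx = entropy_from_counts(table.sum(axis=1), base)
    hy = entropy_from_counts(table.sum(axis=0), base)
    hxy = entropy_from_counts(table.ravel(), base)
    mutual_info = hx + hy - hxy
    return hx + hy - 2.0 * mutual_info


a = [0, 0, 1, 1, 2, 2]
b = [1, 1, 0, 0, 2, 2]  # same partition as a, with labels permuted
c = [0, 0, 0, 1, 1, 1]
d = [0, 1, 0, 1, 0, 1]

vi_ab = plugin_vi(a, b)  # zero up to rounding: relabeling does not matter
vi_ac = plugin_vi(a, c)
vi_cd = plugin_vi(c, d)
vi_ad = plugin_vi(a, d)

# Triangle inequality through the intermediate partition c.
print(vi_ab, vi_ad <= vi_ac + vi_cd)
```

The permuted labeling `b` is at distance zero from `a`, which shows that VI compares partitions rather than label values.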