Two-Sample Testing
Permutation-based hypothesis testing for the null hypothesis H₀: P = Q.
two_sample_test(samples_p, samples_q, *, method='mmd', n_permutations=1000, seed=None, low_memory=None, **kwargs)
Two-sample hypothesis test via permutation.
Tests H0: P = Q against H1: P != Q by computing a test statistic and comparing it to a null distribution obtained by permuting the combined samples.
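The procedure described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the helper names `permutation_null` and `mean_diff` are hypothetical, and the absolute difference of sample means stands in for the actual test statistic.

```python
import random

def mean_diff(xs, ys):
    """Stand-in test statistic (assumption): absolute difference of sample means."""
    return abs(sum(xs) / len(xs) - sum(ys) / len(ys))

def permutation_null(samples_p, samples_q, statistic, n_permutations=1000, seed=None):
    """Return the observed statistic and a list of statistics under permutation."""
    rng = random.Random(seed)
    observed = statistic(samples_p, samples_q)
    pooled = list(samples_p) + list(samples_q)
    n_p = len(samples_p)
    null_stats = []
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabelling of the combined samples
        # Split the shuffled pool back into two groups of the original sizes.
        null_stats.append(statistic(pooled[:n_p], pooled[n_p:]))
    return observed, null_stats

# Toy data: two Gaussians whose means differ by 1.
rng = random.Random(0)
p = [rng.gauss(0.0, 1.0) for _ in range(100)]
q = [rng.gauss(1.0, 1.0) for _ in range(100)]
observed, null_stats = permutation_null(p, q, mean_diff, n_permutations=200, seed=1)
```

Shuffling the pooled samples simulates the null hypothesis: if P = Q, the group labels carry no information, so every relabelling is equally likely.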
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `samples_p` | `ndarray` | Samples from distribution P. | required |
| `samples_q` | `ndarray` | Samples from distribution Q. | required |
| `method` | `str` | Test statistic to use: | `'mmd'` |
| `n_permutations` | `int` | Number of permutations for the null distribution (default 1000). Higher values give more precise p-values but take longer. | `1000` |
| `seed` | `int or None` | Random seed for reproducibility. | `None` |
| `low_memory` | `bool or None` | Memory strategy for | `None` |
| `**kwargs` | `Any` | Additional arguments passed to the test statistic function. For | `{}` |
Returns:
| Type | Description |
|---|---|
| `TestResult` | Named tuple with fields: |
Notes
The p-value is computed as:
p = (1 + #{b : T_b >= T_obs}) / (1 + B)
where T_obs is the observed statistic, T_b are the null statistics, and B is the number of permutations. The +1 in numerator and denominator ensures the p-value is never exactly 0 and accounts for the observed statistic itself.
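The formula above translates directly into code. The sketch below uses a hypothetical helper, `permutation_p_value`, that is not part of the documented API; it only illustrates how the +1 correction bounds the result.

```python
def permutation_p_value(observed, null_stats):
    """Compute p = (1 + #{b : T_b >= T_obs}) / (1 + B)."""
    n_extreme = sum(1 for t in null_stats if t >= observed)
    return (1 + n_extreme) / (1 + len(null_stats))

# B = 999 null statistics, all in [0, 0.998].
null_stats = [i / 1000 for i in range(999)]

# No null statistic reaches the observed value: the smallest attainable p-value.
p_min = permutation_p_value(5.0, null_stats)  # (1 + 0) / (1 + 999) = 0.001

# Every null statistic reaches the observed value: the p-value is exactly 1.
p_max = permutation_p_value(0.0, null_stats)  # (1 + 999) / (1 + 999) = 1.0
```

The smallest attainable p-value is 1 / (1 + B), so with the default of 1000 permutations the test cannot report a p-value below roughly 0.001.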
The permutation test is:
- Finite-sample valid under H0 (type I error at most the nominal level for any sample size)
- Non-parametric (no distributional assumptions)
- Consistent against all alternatives (for MMD with characteristic kernel)
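To make the last property concrete, here is a minimal sketch of a squared MMD estimate for one-dimensional samples. It assumes a Gaussian kernel, a standard example of a characteristic kernel, with a fixed bandwidth, and it uses the biased (V-statistic) estimator. The kernel, bandwidth, and estimator actually used by the library are not stated here, and the names `gaussian_kernel` and `mmd2_biased` are hypothetical.

```python
import math

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars; bandwidth is an assumed default."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))

def mmd2_biased(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD: mean k(x, x') + mean k(y, y') - 2 mean k(x, y)."""
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

# Identical samples give an estimate of zero; well-separated samples do not.
same = mmd2_biased([0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
shifted = mmd2_biased([0.0, 1.0, 2.0], [5.0, 6.0, 7.0])
```

With a characteristic kernel, the population MMD is zero only when P = Q, which is why the resulting test can detect any fixed difference between the two distributions given enough samples.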
Examples:
>>> import numpy as np
>>> from divergence import two_sample_test
>>> rng = np.random.default_rng(42)
>>> p = rng.normal(0, 1, 200)
>>> q = rng.normal(1, 1, 200)
>>> result = two_sample_test(p, q, method="energy", n_permutations=500, seed=42)
>>> result.p_value < 0.05
True
References
[1] Gretton, A. et al. (2012). "A Kernel Two-Sample Test." JMLR, 13, 723-773.
[2] Székely, G. J. & Rizzo, M. L. (2004). "Testing for Equal Distributions in High Dimension." InterStat, 5.