alphapepttools.metrics.coefficient_of_variation

alphapepttools.metrics.coefficient_of_variation(adata, *, min_valid=3, key_added='cv', layer=None, copy=False)

Coefficient of variation

Compute the coefficient of variation (CV) for all features.

\[CV = \frac{s(X)}{\hat{X}}\]

where \(s(X)\) is the empirical standard deviation of feature \(X\) and \(\hat{X}\) is its empirical mean.
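The formula can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the helper name `cv_per_feature` is hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the documentation only says "empirical standard deviation". Rows are samples and columns are features; missing values are encoded as NaN and ignored.

```python
import numpy as np


def cv_per_feature(x: np.ndarray) -> np.ndarray:
    """Illustrative CV per column (feature), ignoring NaN entries.

    Assumption: sample standard deviation (ddof=1); the library may differ.
    """
    mean = np.nanmean(x, axis=0)        # empirical mean per feature
    std = np.nanstd(x, axis=0, ddof=1)  # empirical standard deviation per feature
    return std / mean


# rows = samples, columns = features
x = np.array([
    [10.0, 100.0],
    [12.0, 100.0],
    [14.0, np.nan],
])
cv = cv_per_feature(x)
print(cv)
```

The first feature varies around a mean of 12 with a standard deviation of 2, giving a CV of about 0.167; the second feature is constant across its two valid values, so its CV is 0.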

The coefficient of variation is a scale-invariant measure of dispersion that enables comparison of variability across features with different abundance levels.
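Scale invariance can be checked numerically. The sketch below uses simulated values and a hypothetical helper `cv` (again assuming `ddof=1`); it multiplies a feature by 1000 and compares the standard deviation and the CV before and after.

```python
import numpy as np


def cv(values: np.ndarray) -> float:
    """CV of a 1-D array; ddof=1 is an assumption of this sketch."""
    return float(np.std(values, ddof=1) / np.mean(values))


rng = np.random.default_rng(0)
low_abundance = rng.normal(loc=100.0, scale=10.0, size=50)
high_abundance = 1000.0 * low_abundance  # same profile on a 1000x larger scale

# The standard deviation grows with the scale, the CV does not.
sd_ratio = np.std(high_abundance, ddof=1) / np.std(low_abundance, ddof=1)
cv_low = cv(low_abundance)
cv_high = cv(high_abundance)
print(sd_ratio, cv_low, cv_high)
```

Because the scaling factor cancels between numerator and denominator, the two CVs agree while the standard deviations differ by the factor of 1000.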

Within technical replicates, the CV indicates measurement reproducibility. Low CVs indicate good technical precision, while high CVs suggest issues with sample preparation, instrument performance, or quantification accuracy.

Between different biological samples, CVs reflect both biological and technical variation. CVs higher than those of technical replicates are expected here and can indicate genuine biological heterogeneity.

Parameters:
  • adata (AnnData) – AnnData object with samples as observations and features as variables.

  • min_valid (int (default: 3)) – Minimum number of non-missing values a feature needs for its CV to be estimated. Features with fewer valid values get a CV of NaN.

  • key_added (str (default: 'cv')) – Name of the column added to adata.var that stores the computed CVs.

  • layer (Optional[str] (default: None)) – Name of the layer to compute the metric on. If None (default), the data matrix X is used.

  • copy (bool (default: False)) – Whether to return a modified copy of the AnnData object (True). If False (default), the object is modified in place.

Return type:

None | AnnData

Returns:

If copy=False, the AnnData object is modified in place, with the computed CVs stored in adata.var[key_added], and None is returned. If copy=True, a modified copy of the AnnData object is returned instead.

Notes

The CV only considers non-missing values and should therefore be computed before imputation. Features with fewer than min_valid non-missing values receive a CV of NaN.
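The effect of min_valid can be sketched in plain NumPy. This is an illustration of the rule described in the notes, not the library's code: the helper `cv_with_min_valid` is hypothetical, and `ddof=1` is an assumption.

```python
import numpy as np


def cv_with_min_valid(x: np.ndarray, min_valid: int = 3) -> np.ndarray:
    """Illustrative per-feature CV that returns NaN for sparse features."""
    x = np.asarray(x, dtype=float)
    n_valid = np.sum(~np.isnan(x), axis=0)  # non-missing values per feature
    result = np.full(x.shape[1], np.nan)    # default: NaN for every feature
    enough = n_valid >= min_valid           # features with enough valid values
    if enough.any():
        sub = x[:, enough]
        # Assumption: sample standard deviation (ddof=1).
        result[enough] = np.nanstd(sub, axis=0, ddof=1) / np.nanmean(sub, axis=0)
    return result


# rows = samples, columns = features; NaN marks a missing value
x = np.array([
    [10.0, 5.0, np.nan],
    [12.0, np.nan, np.nan],
    [14.0, 7.0, 3.0],
    [np.nan, np.nan, np.nan],
])
result = cv_with_min_valid(x, min_valid=3)
print(result)
```

Only the first feature has three non-missing values, so it is the only one with a numeric CV; the other two features have two and one valid values respectively and stay NaN.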