alphapepttools.metrics.pooled_coefficient_of_variation

alphapepttools.metrics.pooled_coefficient_of_variation(adata, group_column, *, min_valid=3, layer=None, inplace=True)

Compute pooled coefficient of variation within sample groups.

The pooled coefficient of variation (PCV) quantifies the variability of features across samples within technically defined groups. It is particularly useful for assessing the variability of a method in technical replicates across batches.

For each group \(g\) in a set of sample groups \(G\), the PCV is calculated as the average coefficient of variation across all features \(f \in F\):

\[\text{PCV}_g = \frac{1}{|F|} \sum_{f \in F} \mathrm{CV}_g(f)\]

where \(\mathrm{CV}_g(f)\) is the coefficient of variation of feature \(f\) within group \(g\).
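The formula can be illustrated with a minimal sketch in plain NumPy and pandas. This is not the library's implementation: the helper name `pooled_cv_sketch` is hypothetical, and the use of the population standard deviation (`ddof=0`) is an assumption, chosen because it reproduces the rounded values shown in the Examples section for the same input data.

```python
import numpy as np
import pandas as pd


def pooled_cv_sketch(values, groups, ddof=0):
    """Illustrative PCV: mean of per-feature CVs within each group.

    Hypothetical helper, not part of alphapepttools. ddof=0 is an assumption.
    """
    df = pd.DataFrame(values)
    grouped = df.groupby(np.asarray(groups))
    # CV_g(f) = std / mean, computed per feature f within each group g
    cv = grouped.std(ddof=ddof) / grouped.mean()
    # PCV_g = average of CV_g(f) over all features
    return cv.mean(axis=1)


# Same values and grouping as the example data on this page
X = np.array([[1, 2], [5, 1], [6, 6], [9, 3], [4, 8], [7, 4]], dtype=float)
groups = ["A", "A", "A", "B", "B", "B"]

pcv = pooled_cv_sketch(X, groups)
print(pcv.round(2).to_dict())
# {'A': 0.63, 'B': 0.37}
```

With the sample standard deviation (`ddof=1`) the values would be larger, so the choice of estimator matters when comparing PCVs computed by different tools.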

In the original publication, the PCV was computed for each group and used to compare different normalization strategies. Lower PCV values within technical replicates indicate reduced intra-group variability and may suggest improved normalization.

Parameters:

adata (AnnData) – Annotated data matrix
group_column (str) – Column in adata.obs that defines the sample groups to evaluate (e.g., biological replicates or batches)
min_valid (int (default: 3)) – Minimum number of valid samples required to compute the CV
layer (str | None (default: None)) – Layer for which the metric is computed
inplace (bool (default: True)) – If True, the results are added to adata.uns['metrics']['pcv']. The object is changed in place. If False, a DataFrame with the PCV values is returned
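How `min_valid` might interact with missing values can be sketched as follows. This is an assumption-laden illustration, not the library's code: it assumes that "valid" means non-NaN, that a feature with fewer than `min_valid` valid values in a group gets no CV for that group, and that such features are skipped when averaging. The helper name `group_cv_with_min_valid` is hypothetical.

```python
import numpy as np


def group_cv_with_min_valid(x, min_valid=3):
    """Per-feature CV for the rows of one group (hypothetical helper).

    Assumption: features with fewer than `min_valid` non-NaN values
    are reported as NaN instead of receiving a CV.
    """
    x = np.asarray(x, dtype=float)
    # Count non-missing values per feature (column)
    n_valid = np.sum(~np.isnan(x), axis=0)
    # NaN-aware population std and mean per feature
    cv = np.nanstd(x, axis=0) / np.nanmean(x, axis=0)
    return np.where(n_valid >= min_valid, cv, np.nan)


# One group with three samples; the second feature has a missing value
group = np.array([[1.0, 2.0], [5.0, np.nan], [6.0, 6.0]])

cv = group_cv_with_min_valid(group, min_valid=3)
# Average only over features that received a CV (assumed behaviour)
pcv = np.nanmean(cv)
print(cv, pcv)
```

Under these assumptions the second feature contributes nothing to the group's PCV, so the result equals the CV of the first feature alone.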

Return type:

None | DataFrame

Returns:

If inplace=True, modifies the input adata with PCV values stored in adata.uns["metrics"]["pcv"]. If inplace=False, returns a DataFrame containing PCV values per group

Examples

import numpy as np
import anndata as ad
import pandas as pd
import alphapepttools as apt

# Create example data
adata = ad.AnnData(
    X=np.array([[1, 2], [5, 1], [6, 6], [9, 3], [4, 8], [7, 4]]),
    obs=pd.DataFrame({"replicate_group": ["A", "A", "A", "B", "B", "B"]}),
    var=pd.DataFrame(index=["feature1", "feature2"]),
)

# Compute PCV for technical replicates
apt.metrics.pooled_coefficient_of_variation(adata, group_column="replicate_group")

# Access results
print(adata.uns["metrics"]["pcv"])
# {'A': 0.63, 'B': 0.37}
