alphapepttools.metrics.pooled_coefficient_of_variation
- alphapepttools.metrics.pooled_coefficient_of_variation(adata, group_key, *, min_valid=3, layer=None, inplace=True)
Compute pooled coefficient of variation within sample groups.
The pooled coefficient of variation quantifies the variability of features across samples within technically defined groups. It is particularly useful for assessing the variability of a method in technical replicates across batches.
For each group \(g\) in a set of sample groups \(G\), the pooled coefficient of variation (PCV) is calculated as the average coefficient of variation across all features \(f \in F\):
\[\text{PCV}_g = \frac{1}{|F|} \sum_{f \in F} \mathrm{CV}_g(f)\]
where \(\mathrm{CV}_g(f)\) is the coefficient of variation of feature \(f\) within group \(g\).
In the original publication (Chawade et al., 2014), the PCV was computed for each group and used to compare different normalization strategies. Lower PCV values within technical replicates indicate reduced intra-group variability and may suggest improved normalization.
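To make the formula concrete, the following sketch computes the PCV by hand with pandas, independently of alphapepttools. It assumes that \(\mathrm{CV}_g(f)\) is the population standard deviation (`ddof=0`) divided by the mean; that choice is an assumption, made only because it reproduces the values shown in the Examples section, and the library's actual implementation (including its handling of `min_valid`, layers, and missing values) may differ. The helper name `pooled_cv_sketch` is hypothetical.

```python
import pandas as pd


def pooled_cv_sketch(X, groups, ddof=0):
    """Hypothetical helper: average per-feature CV within each sample group.

    Assumes CV = std / mean with population std (ddof=0); this is an
    assumption, not a confirmed detail of alphapepttools.
    """
    df = pd.DataFrame(X, dtype=float)
    grouped = df.groupby(list(groups))
    # CV_g(f): one row per group g, one column per feature f
    cv = grouped.std(ddof=ddof) / grouped.mean()
    # PCV_g: mean of CV_g(f) over all features f
    return cv.mean(axis=1)


# Same toy data as in the Examples section: 6 samples, 2 features
X = [[1, 2], [5, 1], [6, 6], [9, 3], [4, 8], [7, 4]]
groups = ["A", "A", "A", "B", "B", "B"]

pcv = pooled_cv_sketch(X, groups)
print(pcv.round(2).to_dict())  # {'A': 0.63, 'B': 0.37}
```

With the sample standard deviation (`ddof=1`) the values come out larger, so the choice of estimator matters when comparing PCVs across tools.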
- Parameters:
  - adata (AnnData) – Annotated data matrix
  - group_key (str) – Column in adata.obs that defines the sample groups to evaluate (e.g., biological replicates or batches)
  - min_valid (int (default: 3)) – Minimal number of valid samples to compute CV
  - layer (str | None (default: None)) – Layer for which the metric is computed
  - inplace (bool (default: True)) – If True, the results are added to adata.uns['metrics']['pcv']. The object is changed in place. If False, a DataFrame with the PCV values is returned
- Return type:
  None | DataFrame
- Returns:
  If inplace=True, modifies the input adata with PCV values stored in adata.uns["metrics"]["pcv"]. If inplace=False, returns a DataFrame containing PCV values per group
Examples
import numpy as np
import anndata as ad
import pandas as pd
import alphapepttools as at

# Create example data
adata = ad.AnnData(
    X=np.array([[1, 2], [5, 1], [6, 6], [9, 3], [4, 8], [7, 4]]),
    obs=pd.DataFrame({"replicate_group": ["A", "A", "A", "B", "B", "B"]}),
    var=pd.DataFrame(index=["feature1", "feature2"]),
)

# Compute PCV for technical replicates
at.metrics.pooled_coefficient_of_variation(adata, group_key="replicate_group")

# Access results
print(adata.uns["metrics"]["pcv"])
# {'A': 0.63, 'B': 0.37}
References
Chawade, A., Alexandersson, E. & Levander, F. Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets. J. Proteome Res. 13, 3114-3120 (2014).