alphapepttools.pp.scanpy_pycombat

alphapepttools.pp.scanpy_pycombat(adata, batch, layer=None, *, copy=False)

Correct batch effects using the ComBat method.

Applies empirical Bayes batch correction to remove systematic non-biological variation associated with a batch variable in adata.obs. The input data must be free of NaN values. Remove features with missing values or run an appropriate imputation method before calling this function.
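The NaN requirement above can be met by dropping incomplete features before the call. The following is a minimal sketch using plain NumPy. The matrix and feature names are made up for illustration, and imputation may be the better choice when many features would otherwise be lost.

```python
import numpy as np

# Hypothetical 4 x 3 intensity matrix (rows: samples, columns: features).
X = np.array([
    [1.0, 2.0, np.nan],
    [1.5, 2.5, 3.0],
    [0.5, 1.8, 2.0],
    [1.2, 2.2, 2.8],
])
feature_names = np.array(["protein1", "protein2", "protein3"])

# A feature is kept only if none of its values is NaN.
complete = ~np.isnan(X).any(axis=0)

X_complete = X[:, complete]
kept = feature_names[complete]
print(kept)
```

With an AnnData object, the same boolean mask can be computed from adata.X (or the layer to be corrected) and applied as adata[:, complete].copy().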

Uses scanpy.pp.combat() under the hood.

Parameters:
  • adata (AnnData) – Annotated data matrix, where rows are cells and columns are features. The data matrix must not contain NaN values.

  • batch (str) – Name of the column in adata.obs that holds the batch labels. Variation associated with this column is corrected. Missing values in this column are grouped into a single “NA” batch.

  • layer (str | None (default: None)) – Name of the layer to batch-correct. If None (default), adata.X is used.

  • copy (bool (default: False)) – Whether to return a modified copy of the AnnData object (True). If False (default), the object is modified in place.
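Because samples without a batch label end up together in one “NA” batch, it can help to inspect and label them explicitly before correcting. The sketch below uses a hypothetical obs table and plain pandas. It mirrors the documented behaviour and is not the library's internal implementation.

```python
import numpy as np
import pandas as pd

# Hypothetical sample annotations with one missing batch label.
obs = pd.DataFrame(
    {"batch": ["B1", "B1", np.nan, "B2", "B2"]},
    index=[f"sample{i}" for i in range(1, 6)],
)

# Count the samples that lack a batch label.
n_missing = int(obs["batch"].isna().sum())

# Group all unlabeled samples into one explicit "NA" batch.
# astype(object) avoids errors if the column is categorical.
obs["batch"] = obs["batch"].astype(object).fillna("NA")
batch_sizes = obs["batch"].value_counts().to_dict()
print(n_missing, batch_sizes)
```

If unlabeled samples do not belong together, dropping them or assigning their true batch is likely a better option than correcting them as one group.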

Return type:

AnnData

Returns:

If copy=True, a copy of adata with batch correction applied to layer (or to adata.X when layer is None). If copy=False, the data is modified in place and None is returned.

Examples

import anndata as ad
import pandas as pd
import numpy as np
import alphapepttools as at

# Create example data with batch effects
np.random.seed(0)
batch1_data = np.random.randn(3, 4) + 1  # Batch 1 with offset
batch2_data = np.random.randn(3, 4) - 1  # Batch 2 with offset
X = np.vstack([batch1_data, batch2_data])

adata = ad.AnnData(
    X=X,
    obs=pd.DataFrame({"batch": ["B1", "B1", "B1", "B2", "B2", "B2"]}),
    var=pd.DataFrame(index=["protein1", "protein2", "protein3", "protein4"]),
)

# Apply batch correction
at.pp.scanpy_pycombat(adata, batch="batch")
print("Batch correction applied to adata.X")
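One simple way to check the effect of the correction is to compare per-batch feature means before and after the call. The sketch below rebuilds the toy data from the example and computes the "before" side only. The helper batch_means is hypothetical and not part of alphapepttools.

```python
import numpy as np
import pandas as pd

# Same toy data as in the example above.
np.random.seed(0)
batch1_data = np.random.randn(3, 4) + 1
batch2_data = np.random.randn(3, 4) - 1
X = np.vstack([batch1_data, batch2_data])
batch = ["B1", "B1", "B1", "B2", "B2", "B2"]
proteins = ["protein1", "protein2", "protein3", "protein4"]


def batch_means(X, batch, columns):
    """Return a (batch x feature) table of mean values (hypothetical helper)."""
    table = pd.DataFrame(np.asarray(X), columns=list(columns))
    # An ndarray key is treated as per-row group labels.
    return table.groupby(np.asarray(batch)).mean()


means_before = batch_means(X, batch, proteins)

# Absolute difference between the two batch means, per feature.
gap_before = (means_before.loc["B1"] - means_before.loc["B2"]).abs()
print(gap_before)
```

After running at.pp.scanpy_pycombat(adata, batch="batch"), the same helper can be called as batch_means(adata.X, adata.obs["batch"], adata.var_names). The gaps between batch means should then be noticeably smaller than gap_before.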

References