alphapepttools.pp.impute_median
- alphapepttools.pp.impute_median(adata, group_column=None, *, layer=None, copy=True)
Impute missing values using median imputation
Replace missing (NaN) values in the data matrix with the median of non-missing values for each feature. Can perform global imputation using all samples or group-wise imputation using subsets of samples defined by a categorical variable.
- Parameters:
adata (AnnData) – AnnData object.
layer (Optional[str] (default: None)) – Layer to use for imputation.
group_column (Optional[str] (default: None)) – Column name in adata.obs defining the groups for group-wise imputation. If None (default), the median is computed across all samples. If specified, the median is computed separately for each group, and missing values are imputed with the group-specific median. If group_column contains NaNs, the respective observations are ignored.
copy (bool (default: True)) – Whether to return a modified copy of the AnnData object. If True (default), a modified copy is returned. If False, the object is modified in place.
- Return type:
- Returns:
ad.AnnData – Copy of the AnnData object with the modified layer
- Raises:
Notes
Features that are fully missing will not be imputed. Appropriate filtering of features with at.pp.filter_data_completeness() is critical.
Example
Impute the values in the .X matrix:
    adata = at.pp.impute_median(adata)
    assert np.sum(np.isnan(adata.X)) == 0
Impute data in a specific layer:
    adata = at.pp.impute_median(adata, layer="layer2")
    assert np.sum(np.isnan(adata.layers["layer2"])) == 0
Impute groupwise based on a categorical column:
    adata = at.pp.impute_median(adata, group_column="cell_type")  # Imputes group-wise medians
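To make the difference between global and group-wise imputation concrete, the following is a minimal NumPy/pandas sketch of the behaviour described above. It is not the alphapepttools implementation: the helper name impute_median_sketch and the toy data are hypothetical, and leaving rows with a missing group label untouched is only one assumed reading of "ignored". The sketch also shows why a fully missing feature stays NaN, since it has no median to fall back on.

```python
import warnings

import numpy as np
import pandas as pd


def impute_median_sketch(values, groups=None):
    """Hypothetical stand-in illustrating per-feature median imputation."""
    out = np.array(values, dtype=float)  # work on a copy (samples x features)
    if groups is None:
        # Global imputation: a single mask selecting all samples
        masks = [np.ones(out.shape[0], dtype=bool)]
    else:
        labels = pd.Series(groups)
        # One mask per non-missing group label; rows whose label is NaN
        # match no mask and are left untouched (an assumption of this sketch)
        masks = [(labels == g).to_numpy() for g in labels.dropna().unique()]
    for mask in masks:
        block = out[mask]
        with warnings.catch_warnings():
            # np.nanmedian warns for all-NaN columns and returns NaN there
            warnings.simplefilter("ignore", category=RuntimeWarning)
            medians = np.nanmedian(block, axis=0)
        # Fill missing entries with the per-feature median of this subset
        out[mask] = np.where(np.isnan(block), medians, block)
    return out


# Toy matrix: 4 samples x 3 features; the last feature is fully missing
X = np.array(
    [
        [1.0, np.nan, np.nan],
        [3.0, 4.0, np.nan],
        [np.nan, 6.0, np.nan],
        [10.0, np.nan, np.nan],
    ]
)
cell_type = ["a", "a", "b", "b"]

global_result = impute_median_sketch(X)
grouped_result = impute_median_sketch(X, groups=cell_type)

print(global_result[2, 0])   # median of 1, 3, 10 across all samples
print(grouped_result[2, 0])  # median of group "b" only
```

In this toy case the missing value in the first feature is filled differently depending on the mode, and the third feature remains NaN in both results, which is why completeness filtering before imputation matters.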