alphapepttools.pp.filter_data_completeness

alphapepttools.pp.filter_data_completeness(adata, max_missing, group_column=None, groups=None, action='flag', var_colname='passed_threshold_missing_values')

Filter features based on missing values

Filters AnnData features (columns) based on their fraction of missing values. If group_column and groups are provided, missingness is evaluated only within the specified levels of that obs column rather than across all observations. This is especially useful for imbalanced classes, where filtering by global missingness may leave too many missing values in the smaller class.
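The imbalance problem described above can be sketched with plain numpy. This is an illustration of the motivation only, not the library's implementation; the toy matrix, the class labels and the helper missing_fraction are all assumptions made for the example.

```python
import numpy as np

# Toy data: 10 samples (rows) x 2 features (columns); NaN marks a missing value.
X = np.ones((10, 2))
groups = np.array(["A"] * 8 + ["B"] * 2)  # imbalanced classes: 8 x "A", 2 x "B"
X[8:, 0] = np.nan  # feature 0 is missing in every "B" sample


def missing_fraction(values):
    """Fraction of NaN entries per column (hypothetical helper)."""
    return np.isnan(values).mean(axis=0)


max_missing = 0.5
global_missing = missing_fraction(X)               # over all samples
missing_in_b = missing_fraction(X[groups == "B"])  # within the small class only

# A feature fails only if its missing fraction is strictly greater than max_missing.
passes_globally = ~(global_missing > max_missing)
passes_in_b = ~(missing_in_b > max_missing)

print(global_missing, passes_globally)
print(missing_in_b, passes_in_b)
```

Feature 0 passes when judged globally (20 % missing) although it is entirely absent in class "B", which is the situation that group_column and groups are meant to address.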

To filter rows (observations) instead, transpose the adata object before calling this function and transpose the result back afterwards.
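The transpose-filter-transpose pattern can be sketched on a bare numpy matrix. The helper keep_complete_columns is a hypothetical stand-in for a column filter and is not part of alphapepttools.

```python
import numpy as np


def keep_complete_columns(matrix, max_missing):
    """Hypothetical column filter: drop columns whose missing fraction
    is strictly greater than max_missing."""
    missing = np.isnan(matrix).mean(axis=0)
    return matrix[:, ~(missing > max_missing)]


# 3 samples (rows) x 4 features (columns); the last sample is mostly empty.
X = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [1.0, np.nan, 3.0, 4.0],
    [np.nan, np.nan, np.nan, 4.0],
])

# Transpose so that samples become columns, filter, then transpose back.
X_rows_filtered = keep_complete_columns(X.T, max_missing=0.5).T

print(X_rows_filtered.shape)
```

With an AnnData object the analogous steps would presumably be to take adata.T, call the function with action='drop', and transpose the returned object again. Note that group_column would then refer to a column of the transposed object's obs, i.e. the original var.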

Parameters:

adata (AnnData) – AnnData object whose features are to be filtered.
max_missing (float) – Maximum fraction of missing values allowed. A feature fails only if its fraction of missing values is strictly greater than max_missing, i.e. if max_missing is 0.6 and the fraction of missing values is 0.6, the feature is kept. The strict “greater than” comparison is used so that max_missing may be set to 0.0, which corresponds to filtering for 100 % data completeness.
group_column (str, optional) – Column in obs to determine groups for filtering.
groups (list[str], optional) – List of levels of the group_column to consider in filtering. E.g. if the column has the levels [‘A’, ‘B’, ‘C’], and groups = [‘A’, ‘B’], only missingness of features in these groups is considered. If None, all groups are considered.
action (str, optional) – Action to perform. Can be ‘flag’ (default) or ‘drop’. If ‘flag’, a boolean column is added to adata.var to indicate whether each feature passed the missingness threshold. If ‘drop’, features that do not pass the threshold are removed from the AnnData object.
var_colname (str, optional) – Name of the adata.var boolean column to add if action is ‘flag’. Default is ‘passed_threshold_missing_values’.
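The boundary behaviour of max_missing described above can be checked with a small numpy sketch. The matrix and the helper passes are assumptions made for illustration; only the comparison rule is taken from the parameter description.

```python
import numpy as np

# 5 samples x 4 features; NaN marks a missing value.
X = np.array([
    [1.0, 1.0, np.nan, np.nan],
    [1.0, 1.0, np.nan, np.nan],
    [1.0, 1.0, np.nan, np.nan],
    [1.0, np.nan, 1.0, np.nan],
    [1.0, 1.0, 1.0, np.nan],
])

# Per-feature missing fractions: 0.0, 0.2, 0.6 and 1.0.
missing = np.isnan(X).mean(axis=0)


def passes(missing_fraction, max_missing):
    """A feature fails only if its missing fraction is strictly greater
    than max_missing (hypothetical helper mirroring the documented rule)."""
    return ~(missing_fraction > max_missing)


# Feature 2 sits exactly on the threshold and is kept.
print(passes(missing, 0.6))

# max_missing = 0.0 keeps only fully observed features.
print(passes(missing, 0.0))
```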

Return type:

AnnData

Returns:

AnnData object with either a new boolean column added to adata.var (if action is ‘flag’) or the failing features removed (if action is ‘drop’).
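The difference between the two outcomes can be mimicked on a plain matrix. This is a sketch only: apply_action is a hypothetical helper, and the returned boolean mask merely plays the role of the adata.var column.

```python
import numpy as np


def apply_action(X, passed, action="flag"):
    """Mimic the two documented outcomes on a numpy matrix (not library code)."""
    if action == "flag":
        # Data is left unchanged; the mask stands in for the boolean var column.
        return X, passed
    if action == "drop":
        # Only columns that passed the threshold are kept.
        return X[:, passed], passed[passed]
    raise ValueError(f"action must be 'flag' or 'drop', got {action!r}")


# 3 samples x 2 features; the second feature is missing in 2 of 3 samples.
X = np.array([
    [1.0, np.nan],
    [2.0, np.nan],
    [3.0, 4.0],
])
passed = ~(np.isnan(X).mean(axis=0) > 0.5)

flagged_X, flag = apply_action(X, passed, action="flag")
dropped_X, _ = apply_action(X, passed, action="drop")

print(flagged_X.shape, flag)
print(dropped_X.shape)
```

Flagging keeps the data intact and defers the decision, which is convenient when several filters are to be combined before subsetting. Dropping subsets the features immediately.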
