API

Preprocessing

pp.add_metadata(adata, incoming_metadata, ...)

Add metadata to an AnnData object while checking for matching indices or shape

pp.filter_by_metadata(adata, filter_dict, axis)

Filter an AnnData object along the given axis based on metadata

pp.filter_data_completeness(adata, max_missing)

Filter features based on missing values.
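
The idea behind completeness filtering can be sketched in plain Python. This is an illustration only, not the alphapepttools implementation: the helper name `filter_features_by_missingness` is hypothetical, and treating `max_missing` as an inclusive upper bound on the fraction of NaN values per feature is an assumption.

```python
import math


def filter_features_by_missingness(rows, max_missing):
    """Keep feature columns whose fraction of NaN values is at most max_missing.

    rows: list of observations, each a list of floats (one value per feature).
    Returns the filtered rows and the indices of the retained columns.
    """
    n_obs = len(rows)
    n_features = len(rows[0])
    keep = []
    for j in range(n_features):
        n_missing = sum(math.isnan(row[j]) for row in rows)
        # Assumption: the threshold is inclusive
        if n_missing / n_obs <= max_missing:
            keep.append(j)
    filtered = [[row[j] for j in keep] for row in rows]
    return filtered, keep


nan = float("nan")
rows = [
    [1.0, nan, 3.0],
    [2.0, nan, nan],
    [4.0, 5.0, 6.0],
]
# The second feature is missing in 2 of 3 observations and is dropped
filtered, kept_columns = filter_features_by_missingness(rows, max_missing=0.5)
```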

pp.scale_and_center(adata[, scaler, layer, copy])

Scale and center data.

pp.nanlog(adata[, base, verbosity, layer, copy])

Logarithmize a data matrix.
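
A NaN-safe log transform can be sketched as follows. The helper `nan_safe_log` is hypothetical, and both the default base of 2 and the choice to map zeros, negatives, and non-finite values to NaN are assumptions about what "nanlog" implies, not confirmed behavior of `pp.nanlog`.

```python
import math


def nan_safe_log(values, base=2):
    """Log-transform values; anything without a finite real logarithm becomes NaN."""
    out = []
    for v in values:
        if math.isfinite(v) and v > 0:
            out.append(math.log(v, base))
        else:
            # Zero, negative, NaN, and infinite inputs are all mapped to NaN
            out.append(float("nan"))
    return out


logged = nan_safe_log([8.0, 0.0, -1.0, float("nan"), 1.0])
```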

pp.detect_special_values(data[, verbosity])

Detect special values such as NaN, zero, negative, and infinite values in the data.
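
As a rough illustration of the four categories named above, the sketch below counts them in a flat list of floats. The helper `count_special_values` and its return format are hypothetical; the real function may report its findings differently.

```python
import math


def count_special_values(values):
    """Count NaN, infinite, zero, and negative entries in a list of floats."""
    counts = {"nan": 0, "zero": 0, "negative": 0, "infinite": 0}
    for v in values:
        if math.isnan(v):
            counts["nan"] += 1
        elif math.isinf(v):
            # Both +inf and -inf are counted here, not under "negative"
            counts["infinite"] += 1
        elif v == 0:
            counts["zero"] += 1
        elif v < 0:
            counts["negative"] += 1
    return counts


counts = count_special_values(
    [1.0, 0.0, -2.0, float("nan"), float("inf"), float("-inf")]
)
```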

pp.normalize(adata[, layer, strategy, ...])

Normalize measured counts per sample

pp.impute_gaussian(adata[, group_column, ...])

Impute missing values in each column by random sampling from a Gaussian distribution.
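
A minimal sketch of Gaussian imputation for a single column follows. The helper `impute_column_gaussian` is hypothetical, and the down-shifted, narrowed distribution (`shift=1.8`, `scale=0.3`, a convention common in proteomics tools) is an assumption rather than the documented default of `pp.impute_gaussian`.

```python
import math
import random
import statistics


def impute_column_gaussian(column, shift=1.8, scale=0.3, seed=0):
    """Replace NaN entries with draws from a Gaussian fitted to the observed values.

    Assumed parameterization: mean = observed mean - shift * sd, sd = scale * sd.
    Requires at least two observed (non-NaN) values to estimate the sd.
    """
    rng = random.Random(seed)  # fixed seed so the imputation is reproducible
    observed = [v for v in column if not math.isnan(v)]
    mu = statistics.mean(observed)
    sd = statistics.stdev(observed)
    return [
        rng.gauss(mu - shift * sd, scale * sd) if math.isnan(v) else v
        for v in column
    ]


nan = float("nan")
column = [20.0, 22.0, nan, 21.0, nan]
imputed = impute_column_gaussian(column)
```

Observed values are left untouched; only the NaN positions receive sampled values.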

pp.impute_median(adata[, group_column, ...])

Impute missing values using median imputation
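
Median imputation for one feature, optionally within sample groups, might look like the sketch below. The helper `impute_median_by_group` is hypothetical, and the assumption is that `group_column` causes medians to be computed separately per group.

```python
import math
import statistics
from collections import defaultdict


def impute_median_by_group(values, groups):
    """Replace NaN entries with the median of the observed values in the same group."""
    observed_by_group = defaultdict(list)
    for v, g in zip(values, groups):
        if not math.isnan(v):
            observed_by_group[g].append(v)
    medians = {g: statistics.median(vs) for g, vs in observed_by_group.items()}
    # A group without any observed value has no median, so its NaN entries stay NaN
    return [
        medians.get(g, v) if math.isnan(v) else v
        for v, g in zip(values, groups)
    ]


nan = float("nan")
values = [1.0, nan, 3.0, 10.0, nan, 30.0]
groups = ["a", "a", "a", "b", "b", "b"]
imputed_median = impute_median_by_group(values, groups)
```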

pp.impute_knn(adata[, group_column, layer, ...])

Impute missing values using k-nearest neighbors (KNN) imputation

pp.impute_bpca(adata, *[, n_components, ...])

Impute missing values using Bayesian Principal Component Analysis (BPCA)

pp.scanpy_pycombat(adata, batch[, layer, copy])

Correct batch effects using the ComBat method.

pp.drop_singleton_batches(adata, batch)

Drop samples from batches that contain only a single sample.
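
The selection logic can be sketched with a counter over batch labels. The helper `keep_non_singleton_samples` is hypothetical and operates on plain lists instead of an AnnData object.

```python
from collections import Counter


def keep_non_singleton_samples(sample_ids, batches):
    """Return the sample IDs whose batch contains more than one sample."""
    batch_sizes = Counter(batches)
    return [s for s, b in zip(sample_ids, batches) if batch_sizes[b] > 1]


sample_ids = ["s1", "s2", "s3", "s4"]
batches = ["A", "A", "B", "A"]
# Batch "B" holds a single sample, so "s3" is dropped
kept_samples = keep_non_singleton_samples(sample_ids, batches)
```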

Tools

tl.get_id2gene_map(fasta_input[, source_type])

Reannotate protein groups with gene names from a FASTA input.

tl.map_genes_to_protein_groups(id2gene_map, ...)

Map gene names to protein groups using the provided id2gene_map mapping

tl.nan_safe_bh_correction(pvals)

Apply Benjamini-Hochberg correction with NaN-safe handling.
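
A sketch of Benjamini-Hochberg adjustment that tolerates NaN inputs is shown below. The helper `bh_adjust_nan_safe` is hypothetical; the assumption is that NaN p-values are excluded from the number of tests and returned as NaN, which may differ from how `tl.nan_safe_bh_correction` behaves.

```python
import math


def bh_adjust_nan_safe(pvals):
    """Benjamini-Hochberg adjusted p-values; NaN inputs stay NaN and are not counted."""
    valid = sorted((p, i) for i, p in enumerate(pvals) if not math.isnan(p))
    m = len(valid)  # number of tests, excluding NaN entries
    adjusted = [float("nan")] * len(pvals)
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of adjusted values
    for rank in range(m, 0, -1):
        p, i = valid[rank - 1]
        running_min = min(running_min, p * m / rank)
        adjusted[i] = running_min
    return adjusted


adjusted = bh_adjust_nan_safe([0.01, float("nan"), 0.04, 0.03])
```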

tl.nan_safe_ttest_ind(a, b[, min_valid_values])

NaN-safe wrapper around scipy.stats.ttest_ind.

tl.diff_exp_ttest(adata, between_column, ...)

Calculate ratios of features between two specific groups using a t-test.

tl.diff_exp_alphaquant(adata, report, ...[, ...])

Calculate differential expression using AlphaQuant.

tl.diff_exp_ebayes(adata, between_column, ...)

Run a limma eBayes moderated t-test for differential expression.

tl.pca(adata[, layer, dim_space, ...])

Principal component analysis.

tl.bpca(adata[, layer, dim_space, ...])

Bayesian Principal Component Analysis

tl.extract_pca_anndata(adata[, dim_space, ...])

Extract PCA/BPCA data required for plotting from an AnnData object.

tl.prepare_pca_1d_loadings_data_to_plot(...)

Prepare the gene loadings (1d) of a PC for plotting.

tl.prepare_pca_2d_loadings_data_to_plot(...)

Prepare a DataFrame with PCA feature loadings for 2D plotting.

tl.prepare_scree_data_to_plot(adata, n_pcs, ...)

Prepare scree plot data from AnnData object.

Metrics

metrics.coefficient_of_variation(adata, *[, ...])

Coefficient of variation
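
The coefficient of variation is the standard deviation divided by the mean. A minimal sketch for one feature follows; the helper name is hypothetical, and ignoring NaN values as well as using the sample standard deviation (n - 1 denominator) are assumptions.

```python
import math
import statistics


def coefficient_of_variation_1d(values):
    """Sample standard deviation divided by the mean, ignoring NaN values."""
    observed = [v for v in values if not math.isnan(v)]
    if len(observed) < 2:
        return float("nan")  # a standard deviation needs at least two values
    mean = statistics.mean(observed)
    if mean == 0:
        return float("nan")  # the ratio is undefined for a zero mean
    return statistics.stdev(observed) / mean


cv = coefficient_of_variation_1d([10.0, 12.0, float("nan"), 14.0])
```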

metrics.principal_component_regression(...)

Compute principal component regression (PCR) score.

metrics.pooled_coefficient_of_variation(...)

Compute pooled coefficient of variation within sample groups.

metrics.pooled_median_absolute_deviation(...)

Compute pooled median absolute deviation (PMAD) within sample groups.

metrics.calculate_qc_metrics(adata, *[, layer])

Calculate all QC metrics and add them to adata.obs.

metrics.fraction_complete(adata, *[, layer, ...])

Calculate the fraction of detected values per observation or per feature.

metrics.number_detected(adata, *[, layer, ...])

Count the number of detected features per observation or detected observations per feature.

metrics.total_intensity(adata, *[, layer, ...])

Calculate the sum of intensities per observation or per feature.
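
The three per-observation metrics above can be illustrated on a single row of intensities. The helper `qc_for_observation` is hypothetical, and treating "detected" as "not NaN" is an assumption.

```python
import math


def qc_for_observation(row):
    """Compute simple QC metrics for one observation (one row of intensities)."""
    detected = [v for v in row if not math.isnan(v)]
    return {
        "number_detected": len(detected),
        "fraction_complete": len(detected) / len(row),
        "total_intensity": sum(detected),
    }


qc = qc_for_observation([1.0, float("nan"), 3.0, 4.0])
```

Applying the same function to a column instead of a row would give the per-feature variants.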

Plotting

pl.Plots(*args, **kwargs)

Removed.

pl.add_lines(ax, intercepts[, linetype, ...])

Add vertical or horizontal reference lines to a plot

pl.label_plot(ax, data, x_column, y_column, ...)

Add labels to a 2D axes object

pl.BaseColormaps()

Continuous colormaps for alphapepttools plots

pl.BaseColors()

Default color palette for alphapepttools plots

pl.BasePalettes()

Discrete color palettes for alphapepttools plots

pl.MappedColormaps(cmap[, percentile])

Percentile-based colormap normalization for outlier-robust visualization
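
Percentile-based normalization rescales values to the range between a lower and an upper percentile and clips everything outside it, so a few extreme values do not compress the rest of the color scale. The sketch below shows the principle only; the helper names and the symmetric `lower`/`upper` bounds are assumptions and do not reflect the `pl.MappedColormaps` interface.

```python
import math


def percentile(values, q):
    """Linearly interpolated q-th percentile (0 <= q <= 100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def percentile_normalize(values, lower=5, upper=95):
    """Rescale values to [0, 1] between two percentiles, clipping outliers."""
    vmin, vmax = percentile(values, lower), percentile(values, upper)
    if vmax == vmin:
        # Constant input: map everything to the midpoint of the scale
        return [0.5 for _ in values]
    return [min(1.0, max(0.0, (v - vmin) / (vmax - vmin))) for v in values]


# One extreme outlier among otherwise evenly spread values
values = list(range(100)) + [10_000]
scaled = percentile_normalize(values)
```

The outlier is clipped to the top of the scale while the remaining values still span most of the range.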

pl.show_rgba_color_list(colors)

Display a horizontal bar of RGBA colors for visual inspection

pl.PlotConfig([data, _extra])

Base configuration for all plot types

pl.make_scatter_config(data, x_column, ...)

Create a config for scatter plots

pl.add_legend_to_axes(ax[, levels, legend, ...])

Flexibly add a legend to axes with automatic color assignment

pl.add_legend_to_axes_from_patches(ax, ...)

Add a legend with patches to an axes, using config defaults for font sizes

pl.create_figure([nrows, ncols, figsize, ...])

Create a figure with a specified number of rows and columns

pl.label_axes(ax[, xlabel, ylabel, title, ...])

Apply formatted labels and optional enumeration to a matplotlib axes object

pl.save_figure(fig, filename, output_dir[, ...])

Save a figure in a publication-friendly format

pl.get_color_mapping(values, palette)

Map categorical values to colors
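
A categorical color mapping can be sketched as a dictionary from each distinct level to a palette entry. The helper `map_values_to_colors` is hypothetical; recycling the palette when there are more levels than colors is an assumption, and the real function may handle that case differently.

```python
from itertools import cycle


def map_values_to_colors(values, palette):
    """Assign one palette color per distinct value, in order of first appearance."""
    levels = list(dict.fromkeys(values))  # unique values, insertion order preserved
    # Assumption: colors repeat if the palette is shorter than the number of levels
    return dict(zip(levels, cycle(palette)))


color_map = map_values_to_colors(
    ["ctrl", "treated", "ctrl", "ko"],
    ["#1f77b4", "#ff7f0e"],
)
```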

pl.layered_plot(ax, base_config[, layers, ...])

Plot multiple layers with defined hierarchy and without datapoint reuse.

pl.histogram(data, value_column[, ...])

Plot a histogram from a DataFrame or AnnData object

pl.scatter(data, x_column, y_column[, ...])

Plot a scatterplot from a DataFrame or AnnData object

pl.barplot(ax, data[, grouping_column, ...])

Plot a bar chart from a DataFrame or AnnData object

pl.boxplot(ax, data[, grouping_column, ...])

Plot a box plot from a DataFrame or AnnData object

pl.violinplot(ax, data[, grouping_column, ...])

Plot a violin plot from a DataFrame or AnnData object

pl.rank_median_plot(data, ax[, layer, ...])

Rank plot showing median intensities across samples.

pl.plot_pca(data[, x_column, y_column, ...])

PCA scatter plot showing principal component projections.

pl.scree_plot(adata, ax[, n_pcs, dim_space, ...])

Scree plot showing explained variance for each principal component.

pl.plot_pca_loadings(data, ax[, dim_space, ...])

1D loadings plot showing top features contributing to a principal component.

pl.plot_pca_loadings_2d(data, ax[, ...])

2D loadings plot showing top features contributing to two principal components.

pl.volcano(data[, x_column, y_column, ax, ...])

Create a volcano plot for differential expression visualization

IO

Reader functions

io.read_psm_table(file_paths, search_engine)

Read peptide-spectrum match (PSM) tables into the anndata.AnnData format

io.read_pg_table(path, search_engine, *[, ...])

Read a protein group table into the anndata.AnnData format

io.AnnDataFactory(psm_df, intensity_column, ...)

Factory class to convert AlphaBase PSM DataFrames to AnnData format.

io.list_available_reader([kind])

Get a list of all available readers, as provided by alphabase

Data

Example data that can be accessed with the package.

data.available_data()

List all available proteomics studies

data.get_data(study[, output_dir])

Download data from a specific study