alphapepttools.pl.boxplot


alphapepttools.pl.boxplot(ax, data, grouping_column=None, value_column=None, direct_columns=None, color=(0.21299500192233756, 0.5114186851211072, 0.730795847750865, 1.0), color_dict=None)

Plot a box plot from a DataFrame or AnnData object.

Creates a box plot showing the distribution of values for grouped data. Each box shows the median, quartiles, and outliers for values within a group. Boxes have semi-transparent fill with opaque black outlines, medians, whiskers, and caps.

Two modes of operation:

  1. Grouping mode: use grouping_column and value_column to group values by category.

  2. Direct mode: use direct_columns to compare multiple columns directly.
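Conceptually, both modes end up with the same thing: one list of values per box label. The sketch below illustrates that idea in plain Python. The helpers boxes_from_long and boxes_from_wide are hypothetical names chosen for illustration and are not part of alphapepttools; the library's internal implementation may differ.

```python
from collections import defaultdict


def boxes_from_long(groups, values):
    """Grouping mode (hypothetical helper): one box per distinct group label."""
    boxes = defaultdict(list)
    for label, value in zip(groups, values):
        boxes[label].append(value)
    return dict(boxes)


def boxes_from_wide(table, columns):
    """Direct mode (hypothetical helper): one box per selected column."""
    return {name: list(table[name]) for name in columns}


# Long format: a label column paired with a value column.
long_boxes = boxes_from_long(
    ["A", "A", "B", "B", "B", "C", "C"],
    [1, 2, 3, 4, 5, 6, 7],
)

# Wide format: each selected column already holds the values of one box.
wide_boxes = boxes_from_wide(
    {"protein1": [1, 2, 3], "protein2": [4, 5, 6], "protein3": [7, 8, 9]},
    ["protein1", "protein2"],
)

print(long_boxes)
print(wide_boxes)
```

In either case the result maps a label to the values drawn in that box, which is why the two modes can share the same styling options.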

Parameters:
  • ax (Axes) – Matplotlib axes object to plot on.

  • data (AnnData | DataFrame) – Data containing grouping and value columns or direct columns to plot.

  • grouping_column (list[str] | None (default: None)) – Column containing the groups to compare (categorical). Used with value_column for grouped comparisons. By default None.

  • value_column (list[str] | None (default: None)) – Column whose values should be plotted (numeric). Used with grouping_column for grouped comparisons. By default None.

  • direct_columns (list[str] | None (default: None)) – List of column names to compare directly. Each column becomes a separate box. Overrides grouping_column and value_column. By default None.

  • color (tuple (default: (0.21299500192233756, 0.5114186851211072, 0.730795847750865, 1.0))) – Default color for all boxes. By default BaseColors.get("blue").

  • color_dict (dict | None (default: None)) – Dictionary mapping group labels to specific colors. Overrides the color parameter for specified groups. By default None.

Return type:

None

Returns:

None
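The interplay between color and color_dict described in the parameter list can be sketched as a simple lookup with a fallback. This is only an illustration of the documented override rule: resolve_box_colors is a hypothetical helper, not a function exposed by alphapepttools, and "tab:blue" stands in for the library's actual default color.

```python
def resolve_box_colors(labels, color="tab:blue", color_dict=None):
    """Pick one color per box (hypothetical helper).

    Labels present in color_dict use their mapped color;
    every other label falls back to the single default color.
    """
    overrides = color_dict or {}
    return [overrides.get(label, color) for label in labels]


# Only group "A" is overridden; "B" and "C" keep the default.
colors = resolve_box_colors(["A", "B", "C"], color="tab:blue", color_dict={"A": "red"})
print(colors)  # ['red', 'tab:blue', 'tab:blue']
```

Under this reading, a partial color_dict is enough when only a few groups need highlighting.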

Examples

Grouped comparison (long format):

import pandas as pd
import anndata as ad
from alphapepttools.pl.figure import create_figure
import alphapepttools as apt

data = pd.DataFrame({"intensity": [1, 2, 3, 4, 5, 6, 7]})
obs = pd.DataFrame({"group": ["A", "A", "B", "B", "B", "C", "C"]})
adata = ad.AnnData(X=data.values, obs=obs, var=pd.DataFrame(index=data.columns))

fig, axm = create_figure(1, 1, figsize=(6, 4))
ax = axm.next()
apt.pl.boxplot(
    ax=ax,
    data=adata,
    grouping_column="group",
    value_column="intensity",
    color_dict={"A": "red", "B": "green", "C": "blue"},
)

Direct column comparison (wide format):

import pandas as pd
import anndata as ad
from alphapepttools.pl.figure import create_figure
import alphapepttools as apt

data = pd.DataFrame({"protein1": [1, 2, 3], "protein2": [4, 5, 6], "protein3": [7, 8, 9]})
adata = ad.AnnData(X=data.values, var=pd.DataFrame(index=data.columns))

fig, axm = create_figure(1, 1, figsize=(6, 4))
ax = axm.next()
apt.pl.boxplot(
    ax=ax,
    data=adata,
    direct_columns=["protein1", "protein2", "protein3"],
)

Notes

  • Boxes show median (center line), quartiles (box edges), and outliers (points)

  • Whiskers extend to the most extreme data point that lies within 1.5 * IQR of the box edges; points beyond that range are drawn as outliers

  • Boxes have 50% transparency with opaque black outlines

  • When using direct_columns, each column’s distribution is shown separately

  • Missing values (NaN) are excluded from the distribution calculations
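The statistics listed in these notes can be reproduced by hand. The sketch below uses only the Python standard library and assumes the common Tukey convention (whisker factor 1.5, linearly interpolated quartiles); box_stats is a hypothetical helper, and the exact quartile method used by the plotting backend is an assumption here.

```python
import math
import statistics


def box_stats(values, whis=1.5):
    """Summarise one box: quartiles, whisker ends, and outliers (hypothetical helper)."""
    # Drop missing values first, mirroring the NaN handling in the notes.
    clean = sorted(v for v in values if not math.isnan(v))
    # method="inclusive" interpolates linearly between data points.
    q1, median, q3 = statistics.quantiles(clean, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whis * iqr
    high_fence = q3 + whis * iqr
    # Whiskers stop at the most extreme points that are still inside the fences.
    inside = [v for v in clean if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in clean if v < low_fence or v > high_fence],
    }


stats = box_stats([1.0, 2.0, 3.0, 4.0, 5.0, float("nan"), 30.0])
print(stats)
```

For this input the NaN is ignored, the quartiles are 2.25, 3.5, and 4.75, the whiskers end at 1.0 and 5.0, and 30.0 is reported as the only outlier.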