alphapepttools.pl.volcano

alphapepttools.pl.volcano(data, x_column='log2fc', y_column='-log10(p_value)', ax=None, layers=None, color_dict=None, x_thresholds=(-1, 1), y_thresholds=np.float64(1.3010299956639813), label_layers=None, display_id_column=None, max_labels=None, x_label_anchors=None, y_display_start=1, y_padding_factor=4, xlims=None, ylims=None, scatter_kwargs=None, line_kwargs=None, label_kwargs=None, legend=None, legend_kwargs=None, default_color=(0.8274509803921568, 0.8274509803921568, 0.8274509803921568, 1.0), default_group='data')

Create a volcano plot for differential expression visualization.

Volcano plots visualize differential expression results by plotting fold change (x-axis) against statistical significance (y-axis). This function creates layered scatter plots with threshold lines and optional point labeling.

Parameters:

data (AnnData | DataFrame) – Data containing expression values and statistics
x_column (str (default: 'log2fc')) – Column name for x-axis values (typically log fold change)
y_column (str (default: '-log10(p_value)')) – Column name for y-axis values (typically -log10 p-value)
ax (Axes | None (default: None)) – Axes to plot on. If None, creates new figure
layers (list[tuple] | None (default: None)) – List of layer specifications for hierarchical plotting. Each tuple contains (column_name, value(s), color_key[, scatter_kwargs]). Points are plotted in reverse order (first layer on top). Example: [("gene_type", "housekeeping", "hk_color"), ("significance", "significant", "sig_color", {"s": 100})]
color_dict (dict[str, str | tuple] | None (default: None)) – Maps color keys from layers to actual colors. Example: {"hk_color": "blue", "sig_color": "red"}
x_thresholds (float | tuple (default: (-1, 1))) – X-axis values for vertical threshold lines. Default (-1, 1) for fold change cutoffs
y_thresholds (float | tuple (default: np.float64(1.3010299956639813))) – Y-axis values for horizontal threshold lines. Default is -log10(0.05) ≈ 1.301, corresponding to a p-value cutoff of 0.05
label_layers (list[str] | None (default: None)) – Color keys of layers to label. Only points in these layers will have text labels added
display_id_column (str | None (default: None)) – Column containing labels to display. If None, uses data index
max_labels (int | None (default: None)) – Maximum number of labels to show. Labels are prioritized by y-value
x_label_anchors (list[float] | None (default: None)) – X-positions to anchor labels to (for alignment). If None, labels appear at data point positions
y_display_start (float (default: 1)) – Starting y-position for stacked labels (1=top, 0=bottom). Default 1
y_padding_factor (float (default: 4)) – Vertical spacing multiplier between stacked labels. Default 4
xlims (tuple[float, float] | None (default: None)) – X-axis limits. If None, calculated from data with padding
ylims (tuple[float, float] | None (default: None)) – Y-axis limits. If None, calculated from data with padding
scatter_kwargs (dict | None (default: None)) – Additional arguments passed to scatter plot (e.g., {"s": 50, "alpha": 0.5})
line_kwargs (dict | None (default: None)) – Additional arguments for threshold lines (e.g., {"linewidth": 2, "linestyle": "--"})
label_kwargs (dict | None (default: None)) – Additional arguments for axis labels
legend (str | Legend | None (default: None)) – Legend specification. If "auto", creates legend from color_dict
legend_kwargs (dict | None (default: None)) – Additional arguments for legend
default_color (str | tuple (default: (0.8274509803921568, 0.8274509803921568, 0.8274509803921568, 1.0))) – Color for points not matching any layer. Default grey
default_group (str (default: 'data')) – Name for the default layer containing unassigned points. Default "data"
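
The default column names and thresholds above can be related to raw statistics as in the following minimal sketch. It assumes a hypothetical results table with fold changes in log2fc and raw p-values in a column named p_value; the boolean selection at the end is only an illustration of what the default threshold lines delimit, not something the function is documented to compute.

```python
import numpy as np
import pandas as pd

# Hypothetical results table (values invented for illustration only)
results = pd.DataFrame(
    {"log2fc": [-2.0, 0.3, 1.5], "p_value": [0.001, 0.5, 0.04]},
    index=["P1", "P2", "P3"],
)

# Derive the y-axis column under the documented default name
results["-log10(p_value)"] = -np.log10(results["p_value"])

# Documented defaults: x_thresholds=(-1, 1), y_thresholds=-log10(0.05)
x_low, x_high = (-1, 1)
y_threshold = -np.log10(0.05)

# Points lying above the horizontal line and outside the vertical lines
passes = (results["-log10(p_value)"] > y_threshold) & (
    (results["log2fc"] < x_low) | (results["log2fc"] > x_high)
)
```

A table prepared this way could be passed to volcano without setting x_column or y_column, since the column names match the defaults.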

Return type:

None

Returns:

None

See also

layered_plot: Core layering functionality
add_lines: Add threshold lines
label_plot: Add text labels

Notes

The layering system ensures each point appears in exactly one layer. Points are assigned to the first matching layer in the list. Unassigned points go to the default layer (plotted in background).
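
The first-match rule described above can be sketched with plain pandas. This is an illustrative reimplementation under stated assumptions, not the alphapepttools code: the helper name assign_layers and the example table are hypothetical, and only the tuple layout (column_name, value(s), color_key[, scatter_kwargs]) is taken from the parameter description.

```python
import pandas as pd


def assign_layers(df, layers, default_group="data"):
    """Assign each row to the first matching layer (illustrative only)."""
    assigned = pd.Series(default_group, index=df.index, dtype="object")
    unassigned = pd.Series(True, index=df.index)
    # The optional fourth tuple element (scatter_kwargs) is ignored here
    for column, values, color_key, *_ in layers:
        wanted = values if isinstance(values, (list, tuple, set)) else [values]
        # Only rows not claimed by an earlier layer can match
        hit = df[column].isin(wanted) & unassigned
        assigned[hit] = color_key
        unassigned &= ~hit
    return assigned


# Hypothetical example table
df = pd.DataFrame(
    {
        "id": ["P1", "P2", "P3", "P4"],
        "status": ["upregulated", "upregulated", "unchanged", "downregulated"],
    }
)
layers = [("id", ["P1"], "POI"), ("status", "upregulated", "up", {"s": 100})]

groups = assign_layers(df, layers)

# Drawing the default group first and the layers in reverse order
# puts the first layer on top
draw_order = ["data"] + [color_key for _, _, color_key, *_ in reversed(layers)]
```

In this sketch P1 matches both layers but is claimed by the first one, so it ends up in exactly one group, while P3 and P4 match no layer and fall into the default group.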

Examples

Create a volcano plot with differential expression data:

import numpy as np
import pandas as pd
import alphapepttools as apt
from alphapepttools.pl import BaseColors

# Generate example differential expression data
rng = np.random.default_rng(seed=42)
testx = rng.normal(0, 1, 300)
testy = -np.cos(testx) + rng.normal(0, 0.2, 300)
testp = 10 ** -(testy - min(testy))

data = pd.DataFrame(
    {
        "id": [f"P{10000 + i}" for i in range(300)],
        "gene": [f"gene_{i}" for i in range(300)],
        "log2fc": testx,
        "pval": testp,
        "neg_log10pval": -np.log10(testp),
    }
)
data.index = data["id"]

# Add differential expression status
data["diff_exp_status"] = data["log2fc"].apply(
    lambda x: "upregulated" if x > 1 else ("downregulated" if x < -1 else "unchanged")
)

# Mark first 10 genes as proteins of interest
data["label"] = "other"
data.loc[data.index[:10], "label"] = "POI"

# Define specific proteins to highlight
pois = ["P10291", "P10292", "P10293", "P10294", "P10295"]

# Define visualization layers (plotted in reverse order)
plot_layers = [
    ("id", pois, "POI_hypothesis"),  # Specific hypothesis proteins on top
    ("label", "POI", "POI"),  # General POI proteins
    ("diff_exp_status", "upregulated", "upregulated"),  # Upregulated
    ("diff_exp_status", "downregulated", "downregulated"),  # Downregulated
    ("diff_exp_status", "unchanged", "unchanged"),  # Background
]

# Define colors for each layer
color_dict = {
    "upregulated": BaseColors.get("orange"),
    "downregulated": BaseColors.get("blue"),
    "unchanged": BaseColors.get("grey"),
    "POI": "black",
    "POI_hypothesis": BaseColors.get("purple", lighten=0.7),
}

# Specify which layers to label
label_layers = ["POI", "POI_hypothesis"]

# Create volcano plot
apt.pl.volcano(
    data=data,
    x_column="log2fc",
    y_column="neg_log10pval",
    color_dict=color_dict,
    layers=plot_layers,
    label_layers=label_layers,
    x_label_anchors=[-3.5, 3.5],  # Anchor labels to left/right
    y_padding_factor=1.7,  # Vertical spacing between labels
    y_display_start=0.75,  # Start labels at 75% from bottom
    xlims=(-6, 6),
)

This creates a volcano plot where:

Background points (unchanged) appear in grey
Differentially expressed genes are colored orange (up) or blue (down)
Proteins of interest (POI) are highlighted in black
Specific hypothesis proteins are emphasized in purple on top
Only POI and hypothesis proteins receive text labels
