alphapepttools.pl.scatter
- alphapepttools.pl.scatter(data, x_column, y_column, color=None, color_map_column=None, color_column=None, ax=None, palette=None, color_dict=None, legend=None, scatter_kwargs=None, legend_kwargs=None, figure_kwargs=None, default_group='__data', xlim=None, ylim=None, order='color_frequency')
Plot a scatterplot from a DataFrame or AnnData object.
Coloring works in three ways, with the following order of precedence: 1. color_column, 2. color_map_column, 3. color. If a color_column is provided, its values are interpreted directly as colors, i.e. they have to be something matplotlib can understand (e.g. RGBA, hex, etc.). If a color_map_column is provided, its values are mapped to colors in combination with palette or color_dict (see color mapping logic below). If neither color_column nor color_map_column is provided, the color parameter is used to color all points the same (defaults to blue).
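The precedence rule can be sketched as a small standalone function. This is an illustration only, not the library's implementation: `resolve_color_source` is a hypothetical helper name, and only the parameter names and the blue default are taken from the description above.

```python
def resolve_color_source(color=None, color_map_column=None, color_column=None):
    """Return which coloring mode applies, following the documented precedence.

    Hypothetical helper for illustration; not part of alphapepttools.
    """
    if color_column is not None:
        # 1. Values in this column are used directly as matplotlib colors.
        return ("direct", color_column)
    if color_map_column is not None:
        # 2. Values are mapped to colors via palette or color_dict.
        return ("mapped", color_map_column)
    # 3. One color for all points; blue is the documented default.
    return ("single", color if color is not None else "blue")


# color is ignored as soon as a color_map_column is given.
print(resolve_color_source(color="red", color_map_column="category"))
# With no column at all, the single color applies.
print(resolve_color_source(color="red"))
```

Passing both `color_column` and `color_map_column` selects the direct mode, since `color_column` ranks highest.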
Color mapping logic
- color_map_column is non-numeric:
If color_dict is not None: Use color_dict to assign levels of color_map_column to colors (unmapped levels default to grey).
If color_dict is None, and palette is not None: Use palette to automatically assign colors to each level.
If color_dict is None and palette is None: Use a repeating default palette to assign colors to each level.
- color_map_column is numeric:
If palette is a matplotlib colormap: Quantitatively map values to colors using the colormap. This means that e.g. 1 and 3 will be closer in color than 1 and 10.
If palette is not a matplotlib colormap: Treat numeric values as categorical and color as described above.
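The mapping rules above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's code: `map_levels_to_colors`, `normalize`, `DEFAULT_PALETTE`, and `UNMAPPED_COLOR` are hypothetical names, assigning palette colors in order of first appearance is an assumption, and so is the linear min-max scaling used for the numeric case.

```python
from itertools import cycle

# Hypothetical default palette and fallback color, chosen for illustration only.
DEFAULT_PALETTE = ["tab:blue", "tab:orange", "tab:green"]
UNMAPPED_COLOR = "grey"


def map_levels_to_colors(values, palette=None, color_dict=None):
    """Assign one color per value for a non-numeric (categorical) column."""
    if color_dict is not None:
        # color_dict wins over palette; levels it does not cover fall back to grey.
        return [color_dict.get(v, UNMAPPED_COLOR) for v in values]
    # Otherwise walk through the given (or default) palette, one color per level.
    # cycle() makes the palette repeat when there are more levels than colors.
    colors = cycle(palette if palette is not None else DEFAULT_PALETTE)
    level_to_color = {}
    for v in values:
        if v not in level_to_color:
            level_to_color[v] = next(colors)
    return [level_to_color[v] for v in values]


def normalize(values):
    """Scale numeric values linearly to [0, 1], the input range of a colormap."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant columns
    return [(v - lo) / span for v in values]


print(map_levels_to_colors(["A", "B", "A", "C"], color_dict={"A": "red"}))
# ['red', 'grey', 'red', 'grey']
print(map_levels_to_colors(["A", "B", "A", "C", "D"]))
# ['tab:blue', 'tab:orange', 'tab:blue', 'tab:green', 'tab:blue']

positions = normalize([1, 3, 10])
# 1 and 3 land closer together on the colormap axis than 1 and 10.
print(positions[1] - positions[0] < positions[2] - positions[0])  # True
```

In the numeric case, a colormap turns each position in [0, 1] into a color, which is why values that are close numerically end up close in color.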
- type data:
AnnData|DataFrame
- param data:
Data to plot, must contain the x_column and y_column and optionally the color_column or color_map_column.
- type x_column:
- param x_column:
Column in data to plot on the x-axis. Must contain numeric data.
- type y_column:
- param y_column:
Column in data to plot on the y-axis. Must contain numeric data.
- type color:
- param color:
Color to use for the scatterplot. By default “blue”.
- type color_map_column:
- param color_map_column:
Column in data to use for color encoding. These values are mapped to the palette or the color_dict (see the color mapping logic above). Because the values cannot contain NaNs, color_map_column is coerced to string and missing values are replaced by a default filler string. Overrides the color parameter. By default None.
- type color_column:
- param color_column:
Column in data that holds the color of each point. It must contain actual color values that matplotlib understands (RGBA, hex, etc.). Overrides the color and color_map_column parameters. By default None.
- type ax:
Axes|None (default: None)
- param ax:
Matplotlib axes object to plot on, if None a new figure is created. By default None.
- type palette:
- param palette:
List of colors to use for color encoding, if None a default palette is used. Can be a matplotlib Colormap for continuous gradients. By default None.
- type color_dict:
- param color_dict:
Dictionary mapping levels to colors. Supersedes palette: if provided, palette is ignored. By default None.
- type legend:
- param legend:
Legend to add to the plot. If “auto”, a legend is created from the color_column. By default None.
- type scatter_kwargs:
- param scatter_kwargs:
Additional keyword arguments for the matplotlib scatter function (s, alpha, edgecolors, etc.). By default None.
- type legend_kwargs:
- param legend_kwargs:
Additional keyword arguments for the matplotlib legend function. By default None.
- type figure_kwargs:
- param figure_kwargs:
Additional keyword arguments for figure creation. By default None.
- type figure_kwargs:
dict | None, optional
- type xlim:
- param xlim:
Limits for the x-axis. By default None.
- type ylim:
- param ylim:
Limits for the y-axis. By default None.
- type order:
Literal['color_frequency','original'] (default: 'color_frequency')
- param order:
Ordering of plotted data points. If “color_frequency”, the rarest colors are plotted on top. This is the default and follows the assumption that rarer categories are more important to the plot’s message (e.g. 1000 grey points should not cover 100 green points, which should not cover 10 red points). If “original”, the order of the data is kept as is, which is useful for plotting ordered categorical data points.
- type order:
str
- rtype:
- returns:
None
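The "color_frequency" ordering described for the order parameter can be sketched as follows. This is a minimal illustration, not the library's implementation: `order_by_color_frequency` is a hypothetical helper, and keeping the original order among points of the same color is an assumption.

```python
from collections import Counter


def order_by_color_frequency(colors):
    """Return point indices so that the most frequent colors are drawn first.

    Points drawn later sit on top, so the rarest colors stay visible.
    """
    counts = Counter(colors)
    # sorted() is stable, so points sharing a color keep their original order.
    return sorted(range(len(colors)), key=lambda i: -counts[colors[i]])


colors = ["grey", "red", "grey", "green", "grey", "green"]
print(order_by_color_frequency(colors))  # [0, 2, 4, 3, 5, 1]
```

Here the three grey points are drawn first, the two green points next, and the single red point last, so it ends up on top.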
Examples
Simple scatter with single color:
    import pandas as pd
    from alphapepttools.pl.figure import create_figure
    import alphapepttools as apt

    df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [2, 4, 1, 3, 5]})
    fig, axm = create_figure(1, 1, figsize=(6, 4))
    ax = axm.next()
    apt.pl.scatter(data=df, x_column="x", y_column="y", color="red", ax=ax)
Categorical coloring with automatic palette:
    import pandas as pd
    from alphapepttools.pl.figure import create_figure
    import alphapepttools as apt

    df = pd.DataFrame(
        {
            "x": [1, 2, 3, 4, 5],
            "y": [2, 4, 1, 3, 5],
            "category": ["A", "B", "A", "C", "B"],
        }
    )
    fig, axm = create_figure(1, 1, figsize=(6, 4))
    ax = axm.next()
    apt.pl.scatter(
        data=df,
        x_column="x",
        y_column="y",
        color_map_column="category",
        legend="auto",
        ax=ax,
    )
Custom color dictionary:
    import pandas as pd
    from alphapepttools.pl.figure import create_figure
    import alphapepttools as apt

    df = pd.DataFrame(
        {
            "x": [1, 2, 3, 4, 5],
            "y": [2, 4, 1, 3, 5],
            "significance": ["significant", "not_significant", "significant", "not_significant", "significant"],
        }
    )
    fig, axm = create_figure(1, 1, figsize=(6, 4))
    ax = axm.next()
    apt.pl.scatter(
        data=df,
        x_column="x",
        y_column="y",
        color_map_column="significance",
        color_dict={"significant": "red", "not_significant": "gray"},
        legend="auto",
        scatter_kwargs={"s": 50, "alpha": 0.7},
        ax=ax,
    )
Quantitative gradient with numeric data:
    import pandas as pd
    from alphapepttools.pl.figure import create_figure
    import alphapepttools as apt
    from alphapepttools.pl.colors import BaseColormaps

    df = pd.DataFrame(
        {
            "x": [1, 2, 3, 4, 5],
            "y": [2, 4, 1, 3, 5],
            "intensity": [1.0, 5.0, 10.0, 15.0, 20.0],
        }
    )
    fig, axm = create_figure(1, 1, figsize=(6, 4))
    ax = axm.next()
    apt.pl.scatter(
        data=df,
        x_column="x",
        y_column="y",
        color_map_column="intensity",
        palette=BaseColormaps.get("sequential"),
        ax=ax,
    )
Direct color values from column:
    import pandas as pd
    from alphapepttools.pl.figure import create_figure
    import alphapepttools as apt

    df = pd.DataFrame(
        {
            "x": [1, 2, 3, 4, 5],
            "y": [2, 4, 1, 3, 5],
            "my_colors": ["#FF0000", "#00FF00", "#0000FF", "#FFFF00", "#FF00FF"],
        }
    )
    fig, axm = create_figure(1, 1, figsize=(6, 4))
    ax = axm.next()
    apt.pl.scatter(
        data=df,
        x_column="x",
        y_column="y",
        color_column="my_colors",
        ax=ax,
    )
Notes
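By default (order="color_frequency"), points are ordered by color frequency, with the most frequent colors in the back, for better visibility.
Levels of color_map_column that are missing from color_dict default to grey.
NaN values in color_map_column are handled as strings: the column is coerced to string and missing values are replaced by a default filler string.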
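The string coercion of missing values can be illustrated with a short sketch. It is not the library's code: `coerce_levels` is a hypothetical helper, and the filler `"NA"` is a placeholder, since the actual filler string is not stated here.

```python
import math

# Hypothetical filler string; the library's actual filler is not stated here.
FILLER = "NA"


def coerce_levels(values, filler=FILLER):
    """Convert values to strings, replacing missing entries (None or NaN) with a filler."""
    out = []
    for v in values:
        is_missing = v is None or (isinstance(v, float) and math.isnan(v))
        out.append(filler if is_missing else str(v))
    return out


print(coerce_levels(["A", float("nan"), "B", None]))  # ['A', 'NA', 'B', 'NA']
```

After coercion, the filler behaves like any other level, so it can be given its own color through palette or color_dict.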