processing#

images#

scportrait.processing.images.downsample_img_padding(img: ndarray, N: int = 2) → ndarray#

Downsample an image by a factor of N by taking every Nth pixel. Before downsampling, this function pads the image to ensure it is compatible with the selected kernel size.

Parameters:
  • img – image to be downsampled

  • N – factor by which the image should be downsampled

Returns:

downsampled image

Example:

>>> img = np.random.rand(11, 11)
>>> downsampled_img = downsample_img_padding(img)

scportrait.processing.images.percentile_normalization(im: ndarray, lower_percentile: float = 0.001, upper_percentile: float = 0.999, return_copy: bool = True) → ndarray#

Normalize an input image channel-wise based on defined percentiles.

The percentiles will be calculated, and the image will be normalized to [0, 1] based on the lower and upper percentile.

Parameters:
  • im – Numpy array of shape (height, width) or (channels, height, width).

  • lower_percentile – Lower percentile used for normalization; all values below it will be clipped to 0. Defaults to 0.001.

  • upper_percentile – Upper percentile used for normalization; all values above it will be clipped to 1. Defaults to 0.999.

Returns:

Normalized Numpy array with dtype == float

Return type:

im (np.array)

Example:

>>> img = np.random.rand(3, 4, 4)  # (channels, height, width)
>>> norm_img = percentile_normalization(img, 0.001, 0.999)

scportrait.processing.images.EDF(image)#

Calculate the Extended Depth of Field for the given input image Z-stack. Based on the implementation described here: https://mahotas.readthedocs.io/en/latest/edf.html#id3

Parameters:

image (np.array) – Input image array of shape (Z, X, Y)

Returns:

EDF selected image

Return type:

np.array
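
A minimal usage sketch (the Z-stack below is random data used only for illustration; it is assumed that EDF collapses the Z axis into a single in-focus 2D image, as described above):

>>> import numpy as np
>>> from scportrait.processing.images import EDF
>>> z_stack = np.random.rand(5, 64, 64)  # (Z, X, Y)
>>> in_focus = EDF(z_stack)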

scportrait.processing.images.maximum_intensity_projection(image)#

Calculate the maximum intensity projection for the given input image Z-stack.

Parameters:

image (np.array) – Input image array of shape (Z, X, Y)

Returns:

Maximum Intensity Projected Image.

Return type:

np.array
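
A minimal usage sketch (random data used only for illustration; the projection takes the per-pixel maximum across the Z axis):

>>> import numpy as np
>>> from scportrait.processing.images import maximum_intensity_projection
>>> z_stack = np.random.rand(5, 64, 64)  # (Z, X, Y)
>>> projection = maximum_intensity_projection(z_stack)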

masks#

class scportrait.processing.masks.MatchNucleusCytosolIds(filtering_threshold=0.5, downsampling_factor=None, erosion_dilation=True, smoothing_kernel_size=7, directory=None, *args, **kwargs)#

Filter class for matching nucleus IDs to their corresponding cytosol IDs and removing all classes from the given segmentation masks that do not fulfill the filtering criteria.

Masks pass filtering only if both a nucleus and a cytosol mask are present and their overlapping area is larger than the specified threshold. If the threshold is not specified, the default value is set to 0.5.
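
The matching criterion itself can be illustrated with a small NumPy sketch (a conceptual example, not the class internals; the toy masks below are made up for illustration):

>>> import numpy as np
>>> # Toy labelled masks: 0 is background, positive integers are object IDs
>>> nucleus_mask = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 0]])
>>> cytosol_mask = np.array([[2, 2, 2], [2, 2, 2], [0, 0, 0]])
>>> # Proportion of nucleus 1 that is covered by cytosol 2
>>> overlap = np.mean(cytosol_mask[nucleus_mask == 1] == 2)  # 1.0 in this toy case
>>> keep_pair = overlap >= 0.5  # passes the default filtering_threshold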

Parameters:
  • filtering_threshold (float, optional) – The threshold for filtering cytosol IDs based on the proportion of overlapping area with the nucleus. Default is 0.5.

  • downsampling_factor (int, optional) – The downsampling factor for the masks. Default is None.

  • erosion_dilation (bool, optional) – Flag indicating whether to perform erosion and dilation on the masks during upscaling. Default is True.

  • smoothing_kernel_size (int, optional) – The size of the smoothing kernel for upscaling. Default is 7.

  • *args – Additional positional arguments.

  • **kwargs – Additional keyword arguments.

filtering_threshold#

The threshold for filtering cytosol IDs based on the proportion of overlapping area with the nucleus.

Type:

float

downsample#

Flag indicating whether downsampling is enabled.

Type:

bool

downsampling_factor#

The downsampling factor for the masks.

Type:

int

erosion_dilation#

Flag indicating whether to perform erosion and dilation on the masks during upscaling.

Type:

bool

smoothing_kernel_size#

The size of the smoothing kernel for upscaling.

Type:

int

nucleus_mask#

The nucleus mask.

Type:

numpy.ndarray

cytosol_mask#

The cytosol mask.

Type:

numpy.ndarray

nuclei_discard_list#

A list of nucleus IDs to be discarded.

Type:

list

cytosol_discard_list#

A list of cytosol IDs to be discarded.

Type:

list

nucleus_lookup_dict#

A dictionary mapping nucleus IDs to matched cytosol IDs after filtering.

Type:

dict

load_masks(nucleus_mask, cytosol_mask)#

Load the nucleus and cytosol masks.

update_cytosol_mask(cytosol_mask)#

Update the cytosol mask based on the matched nucleus-cytosol pairs.

update_masks()#

Update the nucleus and cytosol masks after filtering.

match_nucleus_id(nucleus_id)#

Match the given nucleus ID to a cytosol ID based on the overlapping area.

initialize_lookup_table()#

Initialize the lookup table by matching all nucleus IDs to cytosol IDs.

count_cytosol_occurances()#

Count the occurrences of each cytosol ID in the lookup table.

check_for_unassigned_cytosols()#

Check for unassigned cytosol IDs in the cytosol mask.

identify_multinucleated_cells()#

Identify and discard multinucleated cells from the lookup table.

cleanup_filtering_lists()#

Cleanup the discard lists by removing duplicate entries.

cleanup_lookup_dictionary()#

Cleanup the lookup dictionary by removing discarded nucleus-cytosol pairs.

generate_lookup_table(nucleus_mask, cytosol_mask)#

Generate the lookup table by performing all necessary steps.

filter(nucleus_mask, cytosol_mask)#

Filter the nucleus and cytosol masks based on the matching results.

get_lookup_table(nucleus_mask, cytosol_mask)#

Generate the lookup table by performing all necessary steps.

Parameters:
  • nucleus_mask (numpy.ndarray) – The nucleus mask.

  • cytosol_mask (numpy.ndarray) – The cytosol mask.

Returns:

The lookup table mapping nucleus IDs to matched cytosol IDs.

Return type:

dict

filter(nucleus_mask, cytosol_mask)#

Filter the nucleus and cytosol masks based on the matching results and return the updated masks.

Parameters:
  • nucleus_mask (numpy.ndarray) – The nucleus mask.

  • cytosol_mask (numpy.ndarray) – The cytosol mask.

Returns:

A tuple containing the updated nucleus mask and cytosol mask after filtering.

Return type:

tuple
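
A usage sketch based on the methods documented above (nucleus_mask and cytosol_mask are assumed to be labelled segmentation masks of identical shape produced by an upstream segmentation step):

>>> from scportrait.processing.masks import MatchNucleusCytosolIds
>>> # Remove unmatched or multinucleated objects from both masks
>>> matcher = MatchNucleusCytosolIds(filtering_threshold=0.5)
>>> nucleus_filtered, cytosol_filtered = matcher.filter(nucleus_mask, cytosol_mask)
>>> # Alternatively, only retrieve the nucleus ID to cytosol ID mapping
>>> lookup = MatchNucleusCytosolIds().get_lookup_table(nucleus_mask, cytosol_mask)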

class scportrait.processing.masks.SizeFilter(filter_threshold=None, label='segmask', log=True, plot_qc=True, directory=None, confidence_interval=0.95, n_components=1, population_to_keep='largest', filter_lower=True, filter_upper=True, downsampling_factor=None, erosion_dilation=True, smoothing_kernel_size=7, *args, **kwargs)#

Filter class for removing objects from a mask based on their size.

This class provides methods to remove objects from a segmentation mask based on their size. If a threshold range is specified, the objects are filtered using the range passed by the user. Otherwise, the threshold range is calculated automatically.

To automatically calculate the threshold range, a Gaussian mixture model is fitted to the data. Per default, the number of components is set to 2, as it is assumed that the objects in the mask fall into two groups: small and large objects. The small objects constitute segmentation artefacts (partial masks that are frequently generated by segmentation models such as Cellpose), while the large objects represent the actual cell masks of interest. Using the fitted model, the filtering thresholds are calculated to remove all cells that fall outside the given confidence interval.
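
The automatic threshold calculation described above can be sketched with a standalone example (a conceptual illustration using scikit-learn and SciPy, not the SizeFilter internals; the object sizes and variable names are made up):

>>> import numpy as np
>>> from scipy.stats import norm
>>> from sklearn.mixture import GaussianMixture
>>> # Log-transformed object sizes (in pixels): small artefacts and large cells
>>> log_sizes = np.log([40, 45, 50, 900, 950, 1000, 1100]).reshape(-1, 1)
>>> gmm = GaussianMixture(n_components=2, random_state=0).fit(log_sizes)
>>> # Keep the component with the largest mean (the actual cell masks)
>>> idx = int(np.argmax(gmm.means_))
>>> mu = gmm.means_[idx, 0]
>>> sigma = np.sqrt(gmm.covariances_[idx, 0, 0])
>>> lower, upper = norm.interval(0.95, loc=mu, scale=sigma)
>>> size_thresholds = np.exp([lower, upper])  # back to pixel units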

Parameters:
  • filter_threshold (tuple of floats, optional) – The lower and upper thresholds for object size filtering. If not provided, it will be automatically calculated.

  • label (str, optional) – The label of the mask. Default is “segmask”.

  • log (bool, optional) – Whether to take the logarithm of the object sizes before fitting the normal distribution. Default is True. Enabling this option helps the filter distinguish better between small and large objects.

  • plot_qc (bool, optional) – Whether to plot quality control figures. Default is True.

  • directory (str, optional) – The directory to save the generated figures. If not provided, the current working directory will be used.

  • confidence_interval (float, optional) – The confidence interval for calculating the filtering threshold. Default is 0.95.

  • n_components (int, optional) – The number of components in the Gaussian mixture model. Default is 1.

  • population_to_keep (str, optional) – For multi-population models, this parameter determines which population should be kept. Options are “largest”, “smallest”, “mostcommon”, and “leastcommon”. Default is “largest”. If set to “largest” or “smallest”, the component with the largest or smallest mean is kept. If set to “mostcommon” or “leastcommon”, the component whose population is most or least common is kept.

  • filter_lower (bool, optional) – Whether to filter objects that are smaller than the lower threshold. Default is True.

  • filter_upper (bool, optional) – Whether to filter objects that are larger than the upper threshold. Default is True.

  • *args – Additional positional arguments.

  • **kwargs – Additional keyword arguments.

Examples

>>> # Create a SizeFilter object
>>> size_filter = SizeFilter(filter_threshold=(100, 200), label="my_mask")
>>> # Apply the filter to a mask
>>> filtered_mask = size_filter.filter(input_mask)
>>> # Get the object IDs to be removed
>>> ids_to_remove = size_filter.get_ids_to_remove(input_mask)
>>> # Update the mask by removing the identified object IDs
>>> updated_mask = size_filter.update_mask(input_mask, ids_to_remove)

load_mask(mask)#

Load the mask to be filtered.

Parameters:

mask (numpy.ndarray) – The mask to be filtered.

filter(input_mask)#

Filter the input mask based on the filtering threshold.

Parameters:

input_mask (ndarray) – The input mask to be filtered. Expected shape is (X, Y)

Returns:

filtered_mask – The filtered mask after setting the IDs that do not fulfill the filtering criteria to 0.

Return type:

ndarray