project

At the core of scPortrait is the concept of a Project. A Project is a Python class that orchestrates all scPortrait processing steps, serving as the central element for all operations. Each Project corresponds to a directory on the file system, which houses the input data for a specific scPortrait run along with the generated outputs. The choice of the appropriate Project class depends on the structure of the data to be processed.

For more details, refer to here.

class scportrait.pipeline.project.Project(project_location: str, config_path: str, segmentation_f=None, extraction_f=None, featurization_f=None, selection_f=None, overwrite: bool = False, debug: bool = False)

Base implementation for a scPortrait project.

This class is designed to handle single-timepoint, single-location data, such as whole-slide images.

Segmentation Methods should be based on Segmentation or ShardedSegmentation. Extraction Methods should be based on HDF5CellExtraction.

config

Dictionary containing the config file.

Type:

dict

nuc_seg_name

Name of the nucleus segmentation object.

Type:

str

cyto_seg_name

Name of the cytosol segmentation object.

Type:

str

sdata_path

Path to the spatialdata object.

Type:

str

filehandler

Filehandler for the spatialdata object which manages all calls or updates to the spatialdata object.

Type:

sdata_filehandler

DEFAULT_IMAGE_DTYPE

alias of uint16

DEFAULT_SEGMENTATION_DTYPE

alias of uint32

DEFAULT_SINGLE_CELL_IMAGE_DTYPE

alias of float16
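
To make the practical meaning of these defaults concrete, the following sketch inspects the value ranges of the three dtypes with NumPy. It uses only standard NumPy calls, the variable names are illustrative stand-ins for the class attributes, and it makes no claim about whether or how scPortrait casts data to these types.

```python
import numpy as np

# Stand-ins for the class attributes listed above (illustrative names only).
image_dtype = np.uint16          # DEFAULT_IMAGE_DTYPE
segmentation_dtype = np.uint32   # DEFAULT_SEGMENTATION_DTYPE
single_cell_dtype = np.float16   # DEFAULT_SINGLE_CELL_IMAGE_DTYPE

# Largest representable value for each dtype.
max_intensity = np.iinfo(image_dtype).max              # 65535
max_label = np.iinfo(segmentation_dtype).max           # 4294967295
max_float16 = float(np.finfo(single_cell_dtype).max)   # 65504.0

print(max_intensity, max_label, max_float16)
```

A uint32 label image can therefore hold more than four billion distinct ids, while float16 single-cell images trade precision for a smaller memory footprint.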

update_featurization_f(featurization_f)

Update the featurization method chosen for the project without reinitializing the entire project.

Parameters:

featurization_f – The featurization method that should be used for the project.

Returns:

The featurization method is updated in the project object.

Return type:

None

Examples

Update the featurization method for a project:

from scportrait.pipeline.featurization import CellFeaturizer

project.update_featurization_f(CellFeaturizer)

print_project_status()

Print the current project status.

view_sdata()

Start an interactive napari viewer to look at the sdata object associated with the project.

Note: This only works in sessions with a visual interface.

load_input_from_array(array: ndarray, channel_names: list[str] | None = None, overwrite: bool | None = None, remap: list[int] | None = None) -> None

Load input image from a numpy array.

In the array, the channels should be specified in the following order: nucleus, cytosol, other channels.

Parameters:
  • array (np.ndarray) – Input image as a numpy array.

  • channel_names – List of channel names. Default is ["channel_0", "channel_1", ...].

  • overwrite (bool, None, optional) – If set to None, will read the overwrite value from the associated project. Otherwise can be set to a boolean value to override project specific settings for image loading.

  • remap – List of integers that can be used to shuffle the order of the channels. For example [1, 0, 2] to invert the first two channels. Default is None in which case no reordering is performed. This transform is also applied to the channel names.

Returns:

Image is written to the project associated sdata object.

The input image can be accessed using the project object:

project.input_image

Return type:

None

Examples

Load an input image from a numpy array and attach it to an scportrait project:

import numpy as np

from scportrait.pipeline.project import Project

project = Project("path/to/project", config_path="path/to/config.yml", overwrite=True, debug=False)
array = np.random.rand(3, 1000, 1000)
channel_names = ["cytosol", "nucleus", "other_channel"]
project.load_input_from_array(array, channel_names=channel_names, remap=[1, 0, 2])
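
The effect of remap can be pictured with plain NumPy indexing. This is a minimal sketch that assumes remap is applied as an index list along the channel axis, so that new channel i is taken from old channel remap[i]. That reading is consistent with the example above but is not taken from the scPortrait source, and `apply_remap` is a hypothetical helper.

```python
import numpy as np


def apply_remap(array, channel_names, remap):
    """Hypothetical helper: reorder channels and their names with one index list."""
    remapped_array = array[remap]  # fancy indexing along axis 0 (channels)
    remapped_names = [channel_names[i] for i in remap]
    return remapped_array, remapped_names


# Three tiny 2x2 channels filled with 0, 1 and 2 so the order is easy to read.
array = np.stack([np.full((2, 2), value) for value in range(3)])
channel_names = ["cytosol", "nucleus", "other_channel"]

remapped_array, remapped_names = apply_remap(array, channel_names, remap=[1, 0, 2])

print(remapped_names)           # ['nucleus', 'cytosol', 'other_channel']
print(remapped_array[:, 0, 0])  # [1 0 2]
```

After the remap, the channels follow the expected order of nucleus, cytosol, other channels, and the names stay attached to the right image planes.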

load_input_from_tif_files(file_paths: list[str], channel_names: list[str] | None = None, crop: list[tuple[int, int]] | None = None, overwrite: bool | None = None, remap: list[int] | None = None, cache: str | None = None)

Load input image from a list of files. The channels need to be specified in the following order: nucleus, cytosol, other channels.

Parameters:
  • file_paths – List containing paths to each channel tiff file, like ["path1/img.tiff", "path2/img.tiff", "path3/img.tiff"]

  • channel_names – List of channel names. Default is ["channel_0", "channel_1", ...].

  • crop (None, List[Tuple], optional) – When set, it can be used to crop the input image. The first element refers to the first dimension of the image and so on. For example use [(0,1000),(0,2000)] to crop the image to 1000 px height and 2000 px width from the top left corner.

  • overwrite (bool, None, optional) – If set to None, will read the overwrite value from the associated project. Otherwise can be set to a boolean value to override project specific settings for image loading.

  • remap – List of integers that can be used to shuffle the order of the channels. For example [1, 0, 2] to invert the first two channels. Default is None in which case no reordering is performed. This transform is also applied to the channel names.

  • cache – Path to a directory where temporary files should be stored. Default is None, in which case the current working directory is used.

Returns:

Image is written to the project associated sdata object.

The input image can be accessed using the project object:

project.input_image

Return type:

None

Examples

Load input images from tif files and attach them to an scportrait project:

from scportrait.data._datasets import dataset_3
from scportrait.pipeline.project import Project

project = Project("path/to/project", config_path="path/to/config.yml", overwrite=True, debug=False)
path = dataset_3()
image_paths = [
    f"{path}/Ch2.tif",
    f"{path}/Ch1.tif",
    f"{path}/Ch3.tif",
]
channel_names = ["cytosol", "nucleus", "other_channel"]
project.load_input_from_tif_files(image_paths, channel_names=channel_names, remap=[1, 0, 2])
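
The crop argument can be read as one (start, stop) pair per image axis. The sketch below applies [(0, 1000), (0, 2000)] to a NumPy array, assuming a (channels, height, width) layout and assuming that the pairs address height and width rather than the channel axis. `crop_image` is a hypothetical helper, not part of scPortrait.

```python
import numpy as np


def crop_image(image, crop):
    """Hypothetical helper: crop a (C, H, W) image with one (start, stop) pair per spatial axis."""
    (y_start, y_stop), (x_start, x_stop) = crop
    return image[:, y_start:y_stop, x_start:x_stop]


# Placeholder image: 3 channels, 1500 px high, 2500 px wide.
image = np.zeros((3, 1500, 2500), dtype=np.uint16)

cropped = crop_image(image, crop=[(0, 1000), (0, 2000)])
print(cropped.shape)  # (3, 1000, 2000)
```

Starting both ranges at 0 anchors the crop at the top left corner, which gives the 1000 px by 2000 px region described for the crop parameter.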

load_input_from_omezarr(ome_zarr_path: str, overwrite: bool | None = None, channel_names: None | list[str] = None, remap: list[int] | None = None) -> None

Load input image from an ome-zarr file.

Parameters:
  • ome_zarr_path – Path to the ome-zarr file.

  • overwrite (bool, None, optional) – If set to None, will read the overwrite value from the associated project. Otherwise can be set to a boolean value to override project specific settings for image loading.

  • remap – List of integers that can be used to shuffle the order of the channels. For example [1, 0, 2] to invert the first two channels. Default is None in which case no reordering is performed. This transform is also applied to the channel names.

Returns:

Image is written to the project associated sdata object.

The input image can be accessed using the project object:

project.input_image

Return type:

None

Examples

Load input images from an ome-zarr file and attach them to an scportrait project:

from scportrait.pipeline.project import Project

project = Project("path/to/project", config_path="path/to/config.yml", overwrite=True, debug=False)
ome_zarr_path = "path/to/ome.zarr"
project.load_input_from_omezarr(ome_zarr_path, remap=[1, 0, 2])

load_input_from_sdata(sdata_path, input_image_name='input_image', nucleus_segmentation_name=None, cytosol_segmentation_name=None, overwrite=None)

Load input image from a spatialdata object.

complete_segmentation(overwrite: bool | None = None)

If a sharded Segmentation was run but individual tiles failed to segment properly, this method can be called to repeat the segmentation on the failed tiles only. Already calculated segmentation masks will not be recalculated.
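
The idea of repeating only the failed tiles can be sketched in plain Python. This illustrates the behaviour described above and is not scPortrait's implementation: `complete_failed_tiles`, `segment_tile`, and the dictionary of tile results are hypothetical names, with None standing in for a tile whose segmentation failed.

```python
def complete_failed_tiles(tile_results, segment_tile):
    """Re-run segmentation only for tiles without a result (hypothetical helper).

    tile_results maps a tile id to its mask, or to None if that tile failed.
    """
    rerun = []
    for tile_id, mask in tile_results.items():
        if mask is None:  # only failed tiles are recomputed
            tile_results[tile_id] = segment_tile(tile_id)
            rerun.append(tile_id)
    return rerun


# Toy stand-in for a real segmentation call.
def segment_tile(tile_id):
    return f"mask_{tile_id}"


tile_results = {0: "mask_0", 1: None, 2: "mask_2", 3: None}
rerun = complete_failed_tiles(tile_results, segment_tile)

print(rerun)         # [1, 3]
print(tile_results)  # {0: 'mask_0', 1: 'mask_1', 2: 'mask_2', 3: 'mask_3'}
```

Tiles 0 and 2 keep their existing masks untouched, which mirrors the guarantee that already calculated segmentation masks are not recalculated.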

select(cell_sets: list[dict], calibration_marker: ndarray | None = None, segmentation_name: str = 'seg_all_nucleus', name: str | None = None)

Select specified classes using the defined selection method.