Modules

parse

Contains functions to parse imaging data into usable formats for downstream pipelines.

sparcstools.parse.parse_phenix(phenix_dir, flatfield_exported=True, WGAbackground=False, export_meta=True, export_as_symlink=False)

Function to automatically rename TIFS exported from Harmony into a format where row and well ID as well as Tile position are indicated in the file name. Example of an exported file name: “Timepoint{#}_Row{#}_Well{#}_{channel}_zstack{#}_r{#}_c{#}.tif”

Parameters
  • phenix_dir – Path indicating the exported Harmony files to parse.

  • flatfield_exported (bool) – boolean indicating if the data was exported from Harmony with or without flatfield correction.

  • WGAbackground – export a second copy of the WGA stains for background correction to improve segmentation. If set to False, no copy is exported. Otherwise, enter the value of the channel that contains the WGA stain and should be copied.

  • export_meta – boolean value indicating if a metadata file containing tile positions, exact time of measurement, etc. should be written out.

  • export_as_symlink – boolean value indicating if the parsed files should be copied or symlinked. If set to True, this can lead to issues when accessing remote filesystems from different operating systems.
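
The naming scheme above can be read back with a regular expression. The following is a minimal sketch using only the Python standard library; it is not part of sparcstools. The helper name split_parsed_name, the digit padding in the example file name, and the assumption that channel names contain no underscore are all illustrative.

```python
import re

# Assumed layout of a parsed file name. The docs give the field order but not
# the digit padding, so any number of digits is accepted here.
PARSED_NAME = re.compile(
    r"Timepoint(?P<timepoint>\d+)_Row(?P<row>\d+)_Well(?P<well>\d+)_"
    r"(?P<channel>[^_]+)_zstack(?P<zstack>\d+)_r(?P<tile_row>\d+)_c(?P<tile_col>\d+)\.tif$"
)


def split_parsed_name(filename):
    """Return the fields encoded in a parsed file name, or None if it does not match."""
    match = PARSED_NAME.match(filename)
    if match is None:
        return None
    fields = match.groupdict()
    # Every field except the channel name is numeric.
    return {key: (value if key == "channel" else int(value)) for key, value in fields.items()}


# Hypothetical file name that follows the documented pattern.
example = split_parsed_name("Timepoint001_Row02_Well03_DAPI_zstack001_r004_c005.tif")
print(example)
```

Having the fields available as a dictionary makes it straightforward to filter a parsed folder by well, channel, or tile position before passing it on.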

sparcstools.parse.parse_phenix_40X_slide(phenix_dir, flatfield_exported=True, downsampled=False, WGAbackground=False, export_meta=True, export_as_symlink=False)

Function to automatically rename TIFS exported from Harmony into a format where row and well ID as well as Tile position are indicated in the file name. Example of an exported file name: “Timepoint{#}_Row{#}_Well{#}_{channel}_zstack{#}_r{#}_c{#}.tif”

Parameters
  • phenix_dir – Path indicating the exported Harmony files to parse.

  • flatfield_exported (bool) – boolean indicating if the data was exported from Harmony with or without flatfield correction.

  • WGAbackground – export a second copy of the WGA stains for background correction to improve segmentation. If set to False, no copy is exported. Otherwise, enter the value of the channel that contains the WGA stain and should be copied.

  • export_meta – boolean value indicating if a metadata file containing tile positions, exact time of measurement, etc. should be written out.

  • export_as_symlink – boolean value indicating if the parsed files should be copied or symlinked. If set to True, this can lead to issues when accessing remote filesystems from different operating systems.

sparcstools.parse.sort_timepoints(parsed_dir, use_symlink=False)

Additionally sort generated timecourse images according to well and tile position. The function generates a new folder called timecourse_sorted, which contains one folder for each unique tile position holding all imaging data (i.e. zstacks, timepoints, channels) of that tile. This function is meant for quick sorting of generated images for simple import of e.g. timecourse experiments into FIJI.

Parameters
  • parsed_dir – filepath to parsed images folder generated with the function parse_phenix.

  • use_symlink (bool) – boolean value indicating if the images should be copied as symlinks or as regular files. Symlinks can potentially cause issues if the data is used on different operating systems, but they are significantly faster and do not produce as much data overhead.
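
The sorting described above can be illustrated with a short standard-library sketch. It is not the sparcstools implementation: the function name sort_by_tile, the per-tile folder names, and the demo file names are assumptions made for illustration only.

```python
import os
import re
import shutil
import tempfile

# Pull row, well and tile position out of a parsed file name (assumed layout).
TILE = re.compile(r"Row(\d+)_Well(\d+)_.*_r(\d+)_c(\d+)\.tif$")


def sort_by_tile(parsed_dir, outdir, use_symlink=False):
    """Group files into one folder per (row, well, tile row, tile column)."""
    for name in sorted(os.listdir(parsed_dir)):
        match = TILE.search(name)
        if match is None:
            continue  # skip files that do not follow the assumed naming scheme
        row, well, tile_r, tile_c = match.groups()
        # Hypothetical folder name; the real function may name folders differently.
        target = os.path.join(outdir, f"Row{row}_Well{well}_r{tile_r}_c{tile_c}")
        os.makedirs(target, exist_ok=True)
        src = os.path.join(parsed_dir, name)
        dst = os.path.join(target, name)
        if use_symlink:
            os.symlink(os.path.abspath(src), dst)  # fast, no duplicated data
        else:
            shutil.copyfile(src, dst)  # portable, but doubles the storage used


# Demonstration on empty placeholder files: 2 timepoints x 2 tiles.
with tempfile.TemporaryDirectory() as tmp:
    parsed = os.path.join(tmp, "parsed_images")
    os.makedirs(parsed)
    for t in (1, 2):
        for c in (1, 2):
            name = f"Timepoint{t:03}_Row01_Well01_DAPI_zstack001_r001_c{c:03}.tif"
            open(os.path.join(parsed, name), "wb").close()
    sorted_dir = os.path.join(tmp, "timecourse_sorted")
    sort_by_tile(parsed, sorted_dir)
    folders = sorted(os.listdir(sorted_dir))
    files_per_folder = [len(os.listdir(os.path.join(sorted_dir, f))) for f in folders]

print(folders, files_per_folder)
```

Each tile folder ends up holding every timepoint of that tile, which is the layout an image-sequence import in FIJI expects.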

sparcstools.parse.sort_wells(parsed_dir, use_symlink=False, assign_random_id=False)

Sort acquired phenix images into unique folders for each well.

Parameters
  • parsed_dir – filepath to parsed images folder generated with the function parse_phenix.

  • use_symlink – boolean value indicating if the images should be copied as symlinks to their new destination

  • assign_random_id – boolean value indicating if the images in the sorted wells folder should be prepended with a random id.

stitch

Collection of functions to perform stitching of parsed image Tiffs.

sparcstools.stitch.generate_stitched(input_dir, slidename, pattern, outdir, overlap=0.1, max_shift=30, stitching_channel='Alexa488', crop={'bottom': 0, 'left': 0, 'right': 0, 'top': 0}, plot_QC=True, filetype=['.tif'], WGAchannel=None, do_intensity_rescale=True, rescale_range=(1, 99), no_rescale_channel=None, export_XML=True, return_tile_positions=True, channel_order=None, filter_sigma=0)

Function to generate a stitched image.

Parameters
  • input_dir (str) – Path to the folder containing exported TIF files named with the following naming convention: “Row{#}_Well{#}_{channel}_zstack{#}_r{#}_c{#}.tif”. These images can be generated for example by running the sparcstools.parse.parse_phenix() function.

  • slidename (str) – string indicating the slidename that is added to the stitched images generated

  • pattern (str) – Regex string to identify the naming pattern of the TIFs that should be stitched together. For example: “Row1_Well2_{channel}_zstack3_r{row:03}_c{col:03}.tif”. All values in {} indicate those which are matched by regex to find all matching tifs.

  • outdir (str) – path indicating where the stitched images should be written out

  • overlap (float between 0 and 1) – value between 0 and 1 indicating the degree of overlap that was used while recording data at the microscope.

  • max_shift (int) – value indicating the maximum threshold for tile shifts. Default value in ashlar is 15. In general this parameter does not need to be adjusted but it is provided to give more control.

  • stitching_channel (str) – string indicating the channel name on which the stitching should be calculated. The positions for each tile calculated in this channel will be passed to the other channels.

  • crop – dictionary of the form {‘top’:0, ‘bottom’:0, ‘left’:0, ‘right’:0} indicating how many pixels (based on a generated thumbnail, see sparcstools.stitch.generate_thumbnail) should be cropped from the final image in each indicated dimension. Leave this set to default if no cropping should be performed.

  • plot_QC (bool) – boolean value indicating if QC plots should be generated

  • filetype ([str]) – list containing any of [“.tif”, “.ome.zarr”, “.ome.tif”] defining to which type of file the stitched results should be written. If more than one element is present in the list, all export types will be generated in the same output directory.

  • WGAchannel (str) – string indicating the name of the WGA channel in case an illumination correction should be performed on this channel

  • do_intensity_rescale (bool | "partial" | "full_image") – boolean value indicating if the rescale_p1_P99 function should be applied to individual tiles before stitching or not. Alternatively, this parameter can also be set to “partial”, which applies the rescale function to all channels except those specified in no_rescale_channel. Finally, this parameter can also be set to “full_image”, which does not apply the rescaling tile-wise but instead to the completely assembled image after stitching, on a per-channel basis. This ensures that all channels are scaled to the same range.

  • rescale_range ((lower, upper) | dict({channel: (lower, upper)})) – tuple indicating the lower and upper percentile to use for percentile rescaling. Default is (1, 99) which means that the 1st and 99th percentile are used for rescaling. Alternatively a dictionary can be passed with custom values per channel.

  • no_rescale_channel (None | [str]) – either None or a list of channel strings on which no rescaling before stitching should be performed.

  • export_XML – boolean value. If true then an xml is exported when writing to .tif which allows for the import into BIAS.

  • return_tile_positions (bool | default = True) – boolean value. If true and return_type != “return_array” the tile positions are written out to csv.

  • channel_order (None | [str]) – if None do nothing, if list of channel names is supplied the channels are remapped into the specified order
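
To make the rescale_range parameter more concrete, here is a minimal sketch of percentile rescaling in plain Python. It is not the rescale_p1_P99 function from sparcstools, whose internals are not described here. The helper names, the linear interpolation of percentiles, and the mapping to the 0..1 range are assumptions.

```python
def percentile(values, q):
    """Linearly interpolated percentile of a flat list (q between 0 and 100)."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def rescale_tile(pixels, rescale_range=(1, 99)):
    """Clip a flat list of pixel values to the given percentiles and map them to 0..1."""
    low = percentile(pixels, rescale_range[0])
    high = percentile(pixels, rescale_range[1])
    if high == low:
        return [0.0 for _ in pixels]  # constant tile: nothing to stretch
    return [min(max((p - low) / (high - low), 0.0), 1.0) for p in pixels]


def range_for_channel(rescale_range, channel):
    """Resolve a rescale_range that is either one tuple or a per-channel dict."""
    if isinstance(rescale_range, dict):
        return rescale_range[channel]
    return rescale_range


tile = list(range(101))  # toy "tile" with intensities 0..100
scaled = rescale_tile(tile, range_for_channel({"DAPI": (10, 90)}, "DAPI"))
print(scaled[0], scaled[50], scaled[100])
```

Values below the lower percentile collapse to 0 and values above the upper percentile to 1, so a narrower range increases contrast at the cost of clipping more pixels.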

sparcstools.stitch.generate_thumbnail(input_dir, pattern, outdir, overlap, name, stitching_channel='DAPI', export_examples=False, do_intensity_rescale=True, rescale_range=(1, 99), scale=0.05)

Function to generate a scaled down thumbnail of stitched image. Can be used for example to get a low resolution overview of the scanned region to select areas for exporting high resolution stitched images.

Parameters
  • input_dir (str) – Path to the folder containing exported TIF files named with the following naming convention: “Row{#}_Well{#}_{channel}_zstack{#}_r{#}_c{#}.tif”. These images can be generated for example by running the sparcstools.parse.parse_phenix() function.

  • pattern (str) – Regex string to identify the naming pattern of the TIFs that should be stitched together. For example: “Row1_Well2_{channel}_zstack3_r{row:03}_c{col:03}.tif”. All values in {} indicate those which are matched by regex to find all matching tifs.

  • outdir – path indicating where the stitched images should be written out

  • overlap – value between 0 and 1 indicating the degree of overlap that was used while recording data at the microscope.

  • name – string indicating the slidename that is added to the stitched images generated

  • export_examples – boolean value indicating if individual example tiles should be exported in addition to performing thumbnail generation.

  • do_intensity_rescale – boolean value indicating if the rescale_p1_P99 function should be applied before stitching or not.

sparcstools.stitch.prepare_stitch_slurm_job(path, stitching_channel='mCherry', zstack_value=1, rescale_range=(0.1, 99.9), overlap=0.1, jobs_per_file=24)

Function to generate all required output to execute an arrayed batch job to stitch all wells contained within a Harmony directory.

Once the function has run, navigate to the generated folder (slurm_jobs/stitch_all/logs) in the main Harmony project directory and run “sbatch ../run.sh”.

The sbatch array automatically limits the maximum number of running jobs at a time to 20.

Parameters
  • path (str) – Folder containing the exported Harmony output generated with SPARCStools.

  • stitching_channel (str, optional) – String indicating which channel should be stitched on. Defaults to “mCherry”.

  • zstack_value (int, optional) – Integer indicating which zstack level stitching should be performed on. Defaults to 1.

  • rescale_range (tuple, optional) – Percentage range for rescaling images before stitching them. Defaults to (0.1, 99.9).

  • overlap (float, optional) – Tile overlap as fraction. Defaults to 0.1.

  • jobs_per_file (int, optional) – How many stitching executions should be executed per slurm job. Too few jobs will generate too much overhead and become inefficient. Defaults to 24.
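
The effect of jobs_per_file can be sketched as simple batching. This is an illustration only: the helper chunk_wells and the well labels are hypothetical, and although the “%20” suffix is standard SLURM syntax for capping concurrently running array tasks, it is an assumption that the generated run.sh uses exactly this form.

```python
def chunk_wells(wells, jobs_per_file=24):
    """Split the list of wells into consecutive batches, one batch per SLURM array task."""
    return [wells[i:i + jobs_per_file] for i in range(0, len(wells), jobs_per_file)]


# 60 hypothetical wells: 2 rows x 30 wells.
wells = [f"Row{r}_Well{w}" for r in range(1, 3) for w in range(1, 31)]
batches = chunk_wells(wells, jobs_per_file=24)

# One array index per batch; "%20" limits how many tasks run at the same time.
array_spec = f"--array=0-{len(batches) - 1}%20"
print([len(batch) for batch in batches], array_spec)
```

With larger batches, fewer array tasks are scheduled, which reduces per-job startup overhead but lengthens each individual task.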

image processing

Contains functions to perform standard image processing steps, e.g. downsampling.

sparcstools.image_processing.downsample_folder(folder_path, num_threads=20, file_ending=('.tif', '.tiff'), N=2)

Multi-threaded function to downsample all images in a folder, equivalent to NxN binning (2x2 with the default N=2). Overwrites the original images! Do not run multiple times. Output is saved as uint16.

Parameters
  • folder_path (str) – string indicating the folder containing all the image files that should be downsampled

  • num_threads (int) – number of threads for multithreading

  • file_ending (str | (str, str)) – string or tuple of strings indicating which file ending the script should filter for in the indicated folder

  • N (int) – number of pixels that should be binned together

sparcstools.image_processing.downsample_folder_copy_images(folder_path, outdir, num_threads=20, file_ending=('.tif', '.tiff'), N=2)

Multi-threaded function to downsample all images in a folder, equivalent to NxN binning (2x2 with the default N=2). Duplicates the images before downsampling! Do not run multiple times. Output is saved as uint16.

Parameters
  • folder_path (str) – string indicating the folder containing all the image files that should be downsampled

  • outdir (str) – string indicating the folder where the downsampled images should be generated

  • num_threads (int) – number of threads for multithreading

  • file_ending (str | (str, str)) – string or tuple of strings indicating which file ending the script should filter for in the indicated folder

  • N (int) – number of pixels that should be binned together

sparcstools.image_processing.downsample_img(img_path, N=2, copy=False, outdir=None)

Function to downsample a single image equivalent to NxN binning using the mean between pixels. Overwrites the original image(!), do not run multiple times on the same image.

Parameters
  • img_path (str) – string indicating the file path to the .tif file which should be downsampled.

  • N (int, default = 2) – number of pixels that should be binned together using mean between pixels
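
NxN mean binning, as referred to above, can be sketched in plain Python on a nested list. This is not the sparcstools code: the function name bin_mean is hypothetical, and dropping edge rows and columns that do not fill a complete block is an assumption, as is leaving out the conversion to uint16.

```python
def bin_mean(image, N=2):
    """Downsample a 2D list of pixel values by averaging each NxN block."""
    height = len(image) - len(image) % N      # drop rows that do not fill a block
    width = len(image[0]) - len(image[0]) % N  # drop columns that do not fill a block
    binned = []
    for y in range(0, height, N):
        row = []
        for x in range(0, width, N):
            block = [image[y + dy][x + dx] for dy in range(N) for dx in range(N)]
            row.append(sum(block) / (N * N))
        binned.append(row)
    return binned


image = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [10, 10, 20, 20],
    [10, 10, 20, 20],
]
small = bin_mean(image, N=2)
smaller = bin_mean(small, N=2)  # a second pass shrinks the image again
print(small, smaller)
```

The second call shows why the in-place functions should not be run twice on the same files: every pass reduces the resolution by another factor of N.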