parse

Contains functions to parse imaging data acquired on an OperaPhenix or Operetta into usable formats for downstream pipelines.

class scportrait.tools.parse._parse_phenix.PhenixParser(experiment_dir, flatfield_exported=True, export_symlinks=True, compress_rows=False, compress_cols=False)

A class to parse and manage image data from Phenix experiments.

Parameters:
  • experiment_dir (str) – The directory containing the experiment data.

  • export_symlinks (bool) – Whether to use symbolic links for exported images.

  • flatfield_status (bool) – Whether flatfield images were exported; corresponds to the flatfield_exported argument in the signature.

  • compress_rows (bool) – Whether to compress rows in the parsed images.

  • compress_cols (bool) – Whether to compress columns in the parsed images.

  • xml_path (str) – The path to the XML file containing metadata.

  • image_dir (str) – The directory containing the input images.

  • channel_lookup (pd.DataFrame) – A DataFrame containing channel metadata.

  • metadata (pd.DataFrame or None) – A DataFrame containing parsed image metadata.

  • black_image (np.ndarray) – A black image used to replace missing images.

  • missing_images (list) – A list of missing image filenames.

  • copyfunction (function) – The function used to copy or link files.

copy_files(metadata)

Copy files from the source directory to the output directory. The new file names are defined in the metadata.

Parameters:

metadata (pd.DataFrame) – Expected columns are: filename, new_file_name, source, dest

Return type:

None
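As a rough illustration of the metadata-driven copy described above, the sketch below uses only the standard library. It is not scPortrait's implementation: the helper copy_from_metadata, the list of dictionaries standing in for the DataFrame rows, the file names, and the choice between shutil.copyfile and os.symlink are all assumptions made for the example.

```python
import os
import shutil
import tempfile
from pathlib import Path


def copy_from_metadata(rows, use_symlinks=False):
    """Copy (or link) each listed file to its new name and return the new paths.

    rows is a list of dicts with the keys filename, new_file_name, source and
    dest, standing in for the rows of the metadata DataFrame (assumption).
    """
    # Hypothetical stand-in for the parser's copy function.
    copy_function = os.symlink if use_symlinks else shutil.copyfile
    copied = []
    for row in rows:
        src = Path(row["source"]) / row["filename"]
        dst = Path(row["dest"]) / row["new_file_name"]
        # Create the output directory on demand.
        dst.parent.mkdir(parents=True, exist_ok=True)
        copy_function(src, dst)
        copied.append(dst)
    return copied


# Small demonstration in a temporary directory; all names are made up.
workdir = Path(tempfile.mkdtemp())
source_dir = workdir / "Images"
source_dir.mkdir()
(source_dir / "r01c01f01.tiff").write_bytes(b"fake image data")

rows = [
    {
        "filename": "r01c01f01.tiff",
        "new_file_name": "Row01_Well01_Tile001.tif",
        "source": str(source_dir),
        "dest": str(workdir / "parsed_images"),
    }
]
copied_paths = copy_from_metadata(rows)
```

Passing use_symlinks=True in this sketch would create links instead of copies, which mirrors the intent of the export_symlinks option without claiming to reproduce its behavior.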

parse()

Runs the complete parsing of a Phenix experiment, including checking for missing images and replacing them.
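One way to picture the missing-image check is to compare the images that are present against the full grid of expected positions and fill the gaps with a black placeholder. The sketch below is an assumption about how such a check could work, not the parser's actual logic: the helpers find_missing and fill_missing, the (row, col, field, channel) key, and the nested list standing in for the np.ndarray black image are all hypothetical.

```python
from itertools import product


def find_missing(present, rows, cols, fields, channels):
    """Return the expected (row, col, field, channel) keys absent from present."""
    expected = product(rows, cols, fields, channels)
    return [key for key in expected if key not in present]


def fill_missing(images, missing, black_image):
    """Return a copy of images with every missing key mapped to black_image."""
    filled = dict(images)
    for key in missing:
        filled[key] = black_image
    return filled


# A 2x2 "black" placeholder; a nested list stands in for the np.ndarray.
black_image = [[0, 0], [0, 0]]

# Three of the four expected images are present (made-up pixel values).
images = {
    (1, 1, 1, 1): [[5, 6], [7, 8]],
    (1, 1, 2, 1): [[1, 2], [3, 4]],
    (1, 2, 1, 1): [[9, 9], [9, 9]],
}

missing_images = find_missing(
    set(images), rows=[1], cols=[1, 2], fields=[1, 2], channels=[1]
)
complete = fill_missing(images, missing_images, black_image)
```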

sort_wells(sort_tiles=False)

Sorts parsed images according to their well.

Generates a folder tree in which each well has its own folder containing all images from that well. If sort_tiles is True, an additional layer is added to the tree, in which all images obtained from the same field of view (FOV) are sorted into their own subfolder.

Parameters:

sort_tiles (bool, optional) – Whether the images should be sorted into individual directories by FOV in addition to well. Defaults to False.
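The folder layout described above can be sketched as a small path-building function. This is only an illustration of the nesting: the helper sorted_destination and the well, tile, and file names are hypothetical and do not reflect the naming scheme the parser actually uses.

```python
from pathlib import PurePosixPath


def sorted_destination(filename, well, tile, sort_tiles=False):
    """Return the relative path of an image inside the sorted folder tree."""
    parts = [well]
    if sort_tiles:
        # Extra layer: one subfolder per field of view inside each well.
        parts.append(tile)
    return PurePosixPath(*parts, filename)


# Hypothetical (filename, well, tile) records.
images = [
    ("img_a.tif", "Row01_Well01", "Tile001"),
    ("img_b.tif", "Row01_Well01", "Tile002"),
    ("img_c.tif", "Row01_Well02", "Tile001"),
]

by_well = [str(sorted_destination(*image)) for image in images]
by_well_and_tile = [
    str(sorted_destination(*image, sort_tiles=True)) for image in images
]
```

The same nesting idea would carry over to sort_timepoints, with the timepoint as the outer folder and the well as the optional inner layer.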

sort_timepoints(sort_wells=False)

Sorts parsed images according to their timepoint.

Generates a folder tree in which each timepoint has its own folder containing all images captured at that timepoint. If sort_wells is True, an additional layer is added to the tree, in which all images from the same well are sorted into their own subfolder within each timepoint folder.

Parameters:

sort_wells (bool, optional) – Whether the images should be sorted into individual directories by well in addition to timepoint. Defaults to False.

class scportrait.tools.parse._parse_phenix.CombinedPhenixParser(experiment_dir, flatfield_exported=True, export_symlinks=True, compress_rows=False, compress_cols=False)

Class to parse Phenix experiments where multiple experiments should be combined into one dataset. This class is typically used when individual tiles were not imaged during acquisition because of a focus failure. Instead of repeating the entire measurement, the missing tiles can be imaged in a separate experiment and then combined with the original dataset using this class.

This class inherits from the PhenixParser class and extends it by adding the functionality to combine multiple experiments into one dataset. These individual experiments need to be placed together in the following structure:

<experiment_name>/
└── experiments_to_combine/
    ├── experiment_1/
    ├── experiment_2/
    ├── experiment_3/
    └── ...

  • <experiment_name> can be chosen freely.

  • experiments_to_combine is a folder containing all experiments that should be combined. This folder should be placed in the main experiment directory.

  • experiment_n always refers to the complete folder as generated by Harmony when exporting Phenix data, without any further modifications.

The experiments are combined in the order of their creation date and time. If two experiments contain an image at the same position, the parser keeps the image from the first, that is, the earlier-created, experiment.
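The precedence rule above (order by creation time, first image per position wins) can be sketched with plain dictionaries. This is a minimal illustration under stated assumptions, not the class's implementation: the helper combine_experiments, the (row, col, field) position key, the timestamps, and the file paths are all hypothetical.

```python
from datetime import datetime


def combine_experiments(experiments):
    """Merge image tables, keeping the earliest experiment's image per position.

    experiments is a list of (created, images) tuples, where created is a
    sortable creation timestamp and images maps a position key to a file path.
    """
    combined = {}
    # Process experiments from oldest to newest.
    for _, images in sorted(experiments, key=lambda item: item[0]):
        for position, path in images.items():
            # setdefault leaves an entry from an earlier experiment untouched.
            combined.setdefault(position, path)
    return combined


# The original run is missing field f02; the repeat run contains both fields.
original = (
    datetime(2024, 1, 10, 9, 0),
    {("r01", "c01", "f01"): "experiment_1/f01.tiff"},
)
repeat = (
    datetime(2024, 1, 11, 14, 30),
    {
        ("r01", "c01", "f01"): "experiment_2/f01.tiff",
        ("r01", "c01", "f02"): "experiment_2/f02.tiff",
    },
)

# Input order does not matter because the function sorts by creation time.
combined = combine_experiments([repeat, original])
```

In this sketch the repeat experiment only contributes the tile that the original lacks, which matches the focus-failure use case described for this class.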

Parameters:
  • experiment_dir (str) – The directory containing the experiment data.

  • export_symlinks (bool) – Whether to use symbolic links for exported images.

  • flatfield_status (bool) – Whether flatfield images were exported; corresponds to the flatfield_exported argument in the signature.

  • compress_rows (bool) – Whether to compress rows in the parsed images.

  • compress_cols (bool) – Whether to compress columns in the parsed images.

  • xml_path (str) – The path to the XML file containing metadata.

  • image_dir (str) – The directory containing the input images.

  • channel_lookup (pd.DataFrame) – A DataFrame containing channel metadata.

  • metadata (pd.DataFrame or None) – A DataFrame containing parsed image metadata.

  • black_image (np.ndarray) – A black image used to replace missing images.

  • missing_images (list) – A list of missing image filenames.

  • copyfunction (function) – The function used to copy or link files.