io
memory-mapped file handling
- scportrait.io.daskmmap.dask_array_from_path(file_path: str, container_name: str = 'array') → Array
Create a Dask array from an HDF5 file, supporting both contiguous and chunked datasets.
- Parameters:
file_path – Path pointing to the HDF5 file
container_name – Name of the dataset in the HDF5 file
- Returns:
Dask array representing the dataset
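Example

A minimal usage sketch; the file path and dataset name below are illustrative, not part of the API.

>>> from scportrait.io.daskmmap import dask_array_from_path
>>> arr = dask_array_from_path("segmentation.h5", container_name="array")
>>> arr  # lazy Dask array; data is only read when computed
>>> first_plane = arr[0].compute()  # materialize one slice as a numpy array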
- scportrait.io.daskmmap.calculate_chunk_sizes(shape: tuple[int, ...], dtype: dtype | str, target_size_gb: int = 5) → tuple[int, ...]
Calculate chunk sizes that result in chunks of approximately the target size in GB.
- Parameters:
shape – Shape of the array
dtype – Data type of the array
target_size_gb – Target size of each chunk in gigabytes
- Returns:
Calculated chunk sizes for the Dask array
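Example

A hedged sketch with illustrative values: a (4, 40000, 40000) uint16 array occupies roughly 12.8 GB, so with the default 5 GB target the returned chunk sizes split it into roughly three chunks.

>>> from scportrait.io.daskmmap import calculate_chunk_sizes
>>> chunks = calculate_chunk_sizes((4, 40000, 40000), "uint16", target_size_gb=5)
>>> chunks  # per-axis chunk sizes to pass on to the Dask array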
- scportrait.io.daskmmap.calculate_chunk_sizes_chunks(shape: tuple[int, ...], dtype: dtype | str, HDF5_chunk_size: tuple[int, ...], target_size_gb: int = 5) → tuple[int, ...]
Calculate chunk sizes that result in chunks of approximately the target size in GB, taking the chunking of the existing HDF5 dataset into account.
- Parameters:
shape – Shape of the array
dtype – Data type of the array
HDF5_chunk_size – Chunk sizes of the existing HDF5 data container
target_size_gb – Target size of each chunk in gigabytes
- Returns:
Calculated chunk sizes for the Dask array
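Example

A sketch under the assumption that the returned Dask chunks are kept compatible with the dataset's on-disk HDF5 chunking; all shapes below are illustrative.

>>> from scportrait.io.daskmmap import calculate_chunk_sizes_chunks
>>> chunks = calculate_chunk_sizes_chunks(
...     shape=(4, 40000, 40000),
...     dtype="uint16",
...     HDF5_chunk_size=(1, 1000, 1000),  # chunking of the existing HDF5 dataset
...     target_size_gb=5,
... )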
- scportrait.io.daskmmap.mmap_dask_array_contigious(filename: str, shape: tuple[int, ...], dtype: dtype | str, offset: int = 0, chunks: tuple[int, ...] = (5,)) → Array
Create a Dask array from raw binary data in filename by memory mapping.
- Parameters:
filename – Path to the raw binary data file
shape – Shape of the array
dtype – Data type of the array
offset – Offset in bytes from the beginning of the file
chunks – Chunk sizes for the Dask array
- Returns:
Dask array that is memory-mapped to disk
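Example

A minimal sketch. For a contiguous HDF5 dataset the byte offset of the raw data block can be obtained through h5py's low-level API (ds.id.get_offset()); the file and dataset names are illustrative.

>>> import h5py
>>> from scportrait.io.daskmmap import mmap_dask_array_contigious
>>> with h5py.File("segmentation.h5", "r") as f:
...     ds = f["array"]
...     shape, dtype = ds.shape, ds.dtype
...     offset = ds.id.get_offset()  # byte offset of the contiguous data block
>>> arr = mmap_dask_array_contigious("segmentation.h5", shape, dtype, offset=offset, chunks=(1,))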
- scportrait.io.daskmmap.mmap_dask_array_chunked(filename: str, shape: tuple[int, ...], dtype: dtype | str, container_name: str, chunks: tuple[int, ...] = (5,)) → Array
Create a Dask array from a chunked dataset in an HDF5 file.
- Parameters:
filename – Path to the HDF5 file
shape – Shape of the array
dtype – Data type of the array
container_name – Name of the dataset in the HDF5 file
chunks – Chunk sizes for the Dask array
- Returns:
Dask array that is memory-mapped to disk
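Example

A sketch assuming the target dataset is stored chunked inside the HDF5 file; names and chunk sizes are illustrative.

>>> import h5py
>>> from scportrait.io.daskmmap import mmap_dask_array_chunked
>>> with h5py.File("segmentation.h5", "r") as f:
...     shape, dtype = f["array"].shape, f["array"].dtype
>>> arr = mmap_dask_array_chunked("segmentation.h5", shape, dtype, container_name="array", chunks=(1,))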
- scportrait.io.daskmmap.load_hdf5_contigious(filename: str, shape: tuple[int, ...], dtype: dtype | str, offset: int, slices: tuple[slice, ...]) → ndarray
Memory-map the given file with the overall shape and dtype and return the requested slice.
- Parameters:
filename – Path to the raw binary data file
shape – Shape of the array
dtype – Data type of the array
offset – Offset in bytes from the beginning of the file
slices – Tuple of slices specifying the chunk to load
- Returns:
The sliced chunk from the memory-mapped array
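Example

Presumably the per-chunk loader behind mmap_dask_array_contigious; a direct call looks like the sketch below, where the offset value is illustrative (in practice it comes from the HDF5 metadata, as shown above).

>>> import numpy as np
>>> from scportrait.io.daskmmap import load_hdf5_contigious
>>> block = load_hdf5_contigious(
...     "segmentation.h5",
...     shape=(4, 1000, 1000),
...     dtype=np.uint16,
...     offset=2048,  # illustrative byte offset of the dataset
...     slices=(slice(0, 1), slice(0, 500), slice(0, 500)),
... )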
- scportrait.io.daskmmap.load_hdf5_chunk(file_path: str, container_name: str, slices: tuple[slice, ...]) → ndarray
Load a chunk of data from a chunked HDF5 dataset.
- Parameters:
file_path – Path to the HDF5 file
container_name – Name of the dataset in the HDF5 file
slices – Tuple of slices specifying the chunk to load
- Returns:
The sliced chunk from the HDF5 dataset
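Example

A minimal sketch; file path, dataset name, and slice bounds are illustrative.

>>> from scportrait.io.daskmmap import load_hdf5_chunk
>>> block = load_hdf5_chunk(
...     "segmentation.h5",
...     container_name="array",
...     slices=(slice(0, 1), slice(0, 1000), slice(0, 1000)),
... )
>>> block.shape  # numpy array holding just the requested chunk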
file readers
- scportrait.io.read.read_ome_zarr(path: str, magnification: str = '0', array: ndarray[Any, dtype[_ScalarType_co]] | None = None) → ndarray[Any, dtype[_ScalarType_co]] | None
Read an OME-Zarr file from the given path.
- Parameters:
path – Path to the OME-Zarr file
magnification – Magnification level to be read
array – Optional numpy array to store the image data. If None, returns a new array
- Returns:
The image data as a numpy array if array is None, otherwise None after updating the provided array
Example
>>> image = read_ome_zarr("path/to/file.zarr")
>>> # Or with existing array:
>>> existing_array = np.zeros((100, 100))
>>> read_ome_zarr("path/to/file.zarr", array=existing_array)