3.2. Generate cutting XML from results exported from QuPath

The stitched images were loaded into QuPath and specific regions were annotated by hand. Three calibration points were also selected and labelled calib1, calib2 and calib3.

QuPath manual shape annotation

The annotated shapes were then exported to a GeoJSON file.

GeoJSON export

3.2.1. Import libraries and define helper functions

[1]:

# import required libraries
import json
import geojson
import geopandas
import pandas as pd
import numpy as np

from lmd.lib import Collection, Shape

[2]:

#define helper functions

def get_calib_points(list_of_calibpoint_names, df):
    # look up the point geometry for each calibration name, in the order given
    pointlist = []
    for point_name in list_of_calibpoint_names:
        pointlist.append(df.loc[df['name'] == point_name, 'geometry'].values[0])

    # convert the points to an (n, 2) array of [x, y] coordinates
    nparray = np.array([[point.x, point.y] for point in pointlist])

    return nparray

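# returns a dataframe containing only the polygon objects
def remove_non_polygons(df):
    from shapely.geometry import Polygon

    is_polygon = [isinstance(x, Polygon) for x in df.geometry]
    # return a copy so that adding columns later does not raise a SettingWithCopyWarning
    df_filtered = df.loc[is_polygon].copy()
    return df_filtered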

# creates a new column holding each polygon's exterior coordinates as a list of lists
# assumes the dataframe contains only polygons
def replace_coords(df):
    df['coordinates_shape_exterior'] = np.nan
    df['coordinates_shape_exterior'] = df['coordinates_shape_exterior'].astype('object')

    for i in df.index:
        # get geometry object for row i
        geom = df.at[i, 'geometry']
        # list the coordinate points as tuples
        tmp = list(geom.exterior.coords)
        # transform list of tuples to list of lists and save to dataframe
        df.at[i, 'coordinates_shape_exterior'] = [list(point) for point in tmp]

    return df
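To make the target format of replace_coords concrete, the following minimal sketch builds the same list-of-lists directly from a raw GeoJSON Polygon geometry, without shapely. It relies only on the GeoJSON convention that the first linear ring of a Polygon is its exterior ring. The function name exterior_coords and the sample square are hypothetical and are not part of the notebook or of pylmd.

```python
def exterior_coords(geometry):
    """Return the exterior ring of a GeoJSON Polygon as a list of [x, y] lists."""
    if geometry["type"] != "Polygon":
        raise ValueError(f"expected a Polygon, got {geometry['type']}")
    # In GeoJSON, coordinates[0] is the exterior ring; any further rings are holes.
    return [[float(point[0]), float(point[1])] for point in geometry["coordinates"][0]]


# Hypothetical square used only for illustration.
# The ring is closed, so the first point is repeated at the end.
sample = {
    "type": "Polygon",
    "coordinates": [[[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]]],
}

coords = exterior_coords(sample)
print(coords)
```

Note that the closing point of the ring is kept, which is also what listing geom.exterior.coords yields for a shapely polygon.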

3.2.2. Import GeoJSON regions

The GeoJSON dataset is loaded into a dataframe with the following structure:

column	description
id	unique shape id
objectType	type of shape (e.g. annotation)
classification	all annotation information from QuPath
name	shape name, if given
geometry	contains information relevant for shape

Besides the individual segmented shapes, the GeoJSON file should contain three points annotated as calib1, calib2 and calib3. These are used as the calibration points when generating the XML.
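Before loading the file, it can help to confirm that all three calibration points are actually present. The sketch below does this with the standard json module only. It assumes that QuPath stores the annotation name under properties.name of each feature, which is consistent with the name column shown in the dataframe but is not confirmed here. The function find_calibration_points and the miniature FeatureCollection are hypothetical.

```python
import json

REQUIRED_CALIB_NAMES = ("calib1", "calib2", "calib3")


def find_calibration_points(feature_collection, names=REQUIRED_CALIB_NAMES):
    """Map each calibration name to its (x, y) point; raise if one is missing."""
    found = {}
    for feature in feature_collection["features"]:
        # Assumption: the annotation name lives in properties["name"].
        name = (feature.get("properties") or {}).get("name")
        geometry = feature.get("geometry") or {}
        if name in names and geometry.get("type") == "Point":
            x, y = geometry["coordinates"][:2]
            found[name] = (x, y)
    missing = [name for name in names if name not in found]
    if missing:
        raise ValueError(f"missing calibration points: {missing}")
    return found


# Hypothetical miniature export containing only the three calibration points.
example = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "properties": {"name": "calib1"},
     "geometry": {"type": "Point", "coordinates": [343.24, 368.53]}},
    {"type": "Feature", "properties": {"name": "calib3"},
     "geometry": {"type": "Point", "coordinates": [361.78, 2301.51]}},
    {"type": "Feature", "properties": {"name": "calib2"},
     "geometry": {"type": "Point", "coordinates": [1353.77, 1165.83]}}
  ]
}
""")

calib = find_calibration_points(example)
print(calib)
```

For a real export, the same function could be applied to the result of json.load on the exported file.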

[3]:

df = geopandas.read_file("test_data/cellculture_example/annotated_regions_Qupath.geojson")
df

Skipping field color: unsupported OGR type: 1

[3]:

	id	objectType	name	geometry
0	9287277d-e46f-47e9-aa3f-c540b3318b5e	annotation	region2	POLYGON ((507 1524, 506.6 1539.01, 505.39 1553...
1	ab63cfd3-17d0-4dd3-b8e6-95172dda64de	annotation	Region1	POLYGON ((1789 990, 1730 1153, 1744 1467, 1944...
2	b60d2cf7-963e-42ae-8a2a-69ff289d29db	annotation	calib1	POINT (343.24 368.53)
3	56bdd076-ac9a-4950-97f3-b8c2268ee090	annotation	calib3	POINT (361.78 2301.51)
4	03d9bb6e-16b1-4cd9-9181-86da90fb98bf	annotation	calib2	POINT (1353.77 1165.83)

3.2.3. Calibration points

[4]:

#assumes user will always label their calibration points like this
caliblist = get_calib_points(['calib1','calib2','calib3'],df)
print(caliblist)

[[ 343.24  368.53]
 [1353.77 1165.83]
 [ 361.78 2301.51]]
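Three points only pin down a position, rotation and scale in the plane if they do not lie on a single line. The following is a minimal geometric sanity check for the values printed above; it is a suggestion, not something pylmd is known to perform. The function name are_collinear and the tolerance are hypothetical.

```python
def are_collinear(p1, p2, p3, tolerance=1e-9):
    """Return True if the three 2D points lie (almost) on one straight line."""
    # Twice the signed area of the triangle p1, p2, p3 (2D cross product).
    doubled_area = (p2[0] - p1[0]) * (p3[1] - p1[1]) - (p3[0] - p1[0]) * (p2[1] - p1[1])
    return abs(doubled_area) <= tolerance


# Calibration points as printed by the cell above.
calib_points = [[343.24, 368.53], [1353.77, 1165.83], [361.78, 2301.51]]

usable = not are_collinear(*calib_points)
print("calibration points span a triangle:", usable)
```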

3.2.4. Clean up dataframe

Remove non-polygon shapes and extract the polygon coordinates for easier use.

[5]:

df_poly = remove_non_polygons(df)
df_poly = replace_coords(df_poly)
df_poly

[5]:

	id	objectType	name	geometry	coordinates_shape_exterior
0	9287277d-e46f-47e9-aa3f-c540b3318b5e	annotation	region2	POLYGON ((507 1524, 506.6 1539.01, 505.39 1553...	[[507.0, 1524.0], [506.6, 1539.01], [505.39, 1...
1	ab63cfd3-17d0-4dd3-b8e6-95172dda64de	annotation	Region1	POLYGON ((1789 990, 1730 1153, 1744 1467, 1944...	[[1789.0, 990.0], [1730.0, 1153.0], [1744.0, 1...

3.2.5. Generate shape collection

[6]:

shape_collection = Collection(calibration_points=caliblist)
# this matrix keeps x and negates y, i.e. it mirrors the shapes along the x-axis
shape_collection.orientation_transform = np.array([[1, 0], [0, -1]])

[7]:

for i in df_poly.index:
    # the well argument is optional; leave it out if no well should be assigned
    shape_collection.new_shape(df_poly.loc[i, 'coordinates_shape_exterior'], well="well1")
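The loop above sends every shape to the same well. If each annotated region should go to its own well, one option is to look the well up by annotation name first. The sketch below shows only that lookup in plain Python. The well labels "A1" and "A2", the mapping well_by_name and the helper assign_wells are hypothetical placeholders, and the valid well names depend on the collection device in use.

```python
# Hypothetical mapping from QuPath annotation name to a well label.
# Names are matched exactly, so "Region1" and "region1" are different keys.
well_by_name = {"Region1": "A1", "region2": "A2"}


def assign_wells(names, mapping, default=None):
    """Pair each annotation name with its well, falling back to a default."""
    return [(name, mapping.get(name, default)) for name in names]


# Annotation names in the order they appear in the polygon dataframe.
polygon_names = ["region2", "Region1"]

assigned = assign_wells(polygon_names, well_by_name)
for name, well in assigned:
    # In the notebook, this is where new_shape(..., well=well) would be called.
    print(name, "->", well)
```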

[8]:

#print some statistics on the shapes included in the collection and visualize results
shape_collection.stats()
shape_collection.plot(calibration=True)

===== Collection Stats =====
Number of shapes: 2
Number of vertices: 117
============================
Mean vertices: 58
Min vertices: 16
5% percentile vertices: 20
Median vertices: 58
95% percentile vertices: 97
Max vertices: 101

[Figure: plot of the shape collection with the calibration points]
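The printed statistics can be checked by hand. With two shapes, a minimum of 16 and a maximum of 101 vertices, the per-shape vertex counts must be 16 and 101. The sketch below recomputes the summary from those two values. Linear interpolation for the percentiles and rounding to whole numbers for display are assumptions about how the figures were produced; they happen to be consistent with the output above. The helper percentile is hypothetical.

```python
import statistics


def percentile(values, q):
    """Linearly interpolated percentile (q between 0 and 100) of a non-empty list."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


# Vertex counts inferred from the min and max of the two shapes.
vertex_counts = [16, 101]

summary = {
    "Number of vertices": sum(vertex_counts),
    "Mean vertices": statistics.mean(vertex_counts),
    "5% percentile vertices": percentile(vertex_counts, 5),
    "Median vertices": statistics.median(vertex_counts),
    "95% percentile vertices": percentile(vertex_counts, 95),
}

for label, value in summary.items():
    # Display without decimals, as in the collection stats.
    print(f"{label}: {value:.0f}")
```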

3.2.6. Write to XML

[9]:

shape_collection.save("./test_data/cellculture_example/shapes_2.xml")

[ 34324. -36853.]
[ 135377. -116583.]
[  36178. -230151.]
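The three rows printed on saving look like the calibration points after the orientation transform, multiplied by 100. The following sketch reproduces them in plain Python. The scale factor of 100 is inferred from the printed values only and is an assumption about what save does internally; the helper transform_point is hypothetical.

```python
ORIENTATION = [[1, 0], [0, -1]]  # orientation_transform set on the collection
SCALE = 100                      # assumption, inferred from the printed output


def transform_point(point, matrix=ORIENTATION, scale=SCALE):
    """Apply a 2x2 matrix to an [x, y] point, scale it and round to integers."""
    x, y = point
    tx = matrix[0][0] * x + matrix[0][1] * y
    ty = matrix[1][0] * x + matrix[1][1] * y
    return [round(tx * scale), round(ty * scale)]


# Calibration points in image coordinates, as extracted earlier.
calibration = [[343.24, 368.53], [1353.77, 1165.83], [361.78, 2301.51]]

transformed = [transform_point(point) for point in calibration]
for row in transformed:
    print(row)
```

Because the matrix is diagonal, the result does not depend on whether it is applied from the left or from the right.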