
How to make a gene expression matrix from a segmented image and the XY coordinates of identified genes in spatial transcriptomics data?

I am new to image analysis using python and am stuck with the following problem.

I have segmented images like the following: [image]

Using cv2.connectedComponentsWithStats I extracted the stats of the objects with the following code:

nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(thresh_image, connectivity=4)

I have another table with xy coordinates and intensity values identifying point objects on the image from different channels. The table looks like the following:

Spot x y target_id intensity quality target_name
0 111.0 49.0 1 422.68780 0.519918 Act1
1 50.0 132.0 2 532.04517 1.630690 Tub2
2 427.0 141.0 3 512.33620 1.317758 Ser2
3 380.0 171.0 4 377.43110 0.911759 Pol2
4 134.0 190.0 1 480.68900 0.884888 Act1
... ... ... ... ... ... ...

As output I need a sparse expression matrix that assigns each spot in the table to the segmented object it falls in, like the following (the table is only representative, not accurate):

Gene Cell1 Cell2 Cell3 Cell4 ...
Act1 NAN NAN 422.68780 480.68900 ...
Tub2 532.04517 NAN NAN NAN ...
Pol2 NAN 377.43110 NAN NAN ...
Ser2 NAN 377.43110 NAN NAN ...
... ... ... ... ... ...

Any help/guidance will be very helpful.

I'm not very familiar with the cv2 package, but if I am reading your question correctly, I think you can do this with skimage.measure (https://scikit-image.org/docs/dev/api/skimage.measure.html).

Assuming that your spot coordinates are in a pandas data frame called df_spots, I think something along these lines should work:

from skimage.measure import label, regionprops_table
import numpy as np
import pandas as pd

#make a labeled image from the binary segmented image. Each individual object
#in the output image will have a unique label
labeled_img = label(thresh_image) # labeled image from segmented image
rp_table = regionprops_table(labeled_img, properties=['area', 'centroid']) # you can get whatever other stats for the segmented image that you may be interested in.

# from there you can iterate over the spot coordinates and determine the corresponding labeled object

# make arrays from the spot coordinates
xcoords = np.array(df_spots.x).astype('int') # need to change from float to int to use as array indices
ycoords = np.array(df_spots.y).astype('int')

# iterate over the spots and determine which labeled object they are located in
labeled_obj_lst = [] # list to hold associated segmented object index for each spot
for spot_iter in range(len(xcoords)):
    x = xcoords[spot_iter]
    y = ycoords[spot_iter]
    labeled_obj_lst.append(labeled_img[y, x])

# add the labeled_obj column to df_spots (0 means the spot falls on background)
df_spots.loc[:, 'labeled_obj'] = labeled_obj_lst
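
The loop above can also be written as a single NumPy indexing operation. The following is only a minimal sketch under a few assumptions: the toy labeled_img and df_spots are made up for illustration, spots are rounded to the nearest pixel, and spots outside the image are given label 0 (background). The same lookup should also work on the output array returned by cv2.connectedComponentsWithStats, since that array is a label image too.

```python
import numpy as np
import pandas as pd

# Toy labeled image standing in for labeled_img (0 = background).
labeled_img = np.zeros((6, 8), dtype=int)
labeled_img[0:3, 0:3] = 1   # object 1: rows 0-2, columns 0-2
labeled_img[3:6, 4:8] = 2   # object 2: rows 3-5, columns 4-7

# Toy spot table using the x / y column names from the question.
df_spots = pd.DataFrame({
    "x": [1.0, 5.0, 7.0, 20.0],
    "y": [1.0, 4.0, 0.0, 2.0],
    "target_name": ["Act1", "Tub2", "Ser2", "Pol2"],
})

# Round to the nearest pixel (an assumption; astype('int') alone truncates).
xcoords = np.rint(df_spots["x"].to_numpy()).astype(int)
ycoords = np.rint(df_spots["y"].to_numpy()).astype(int)

# Guard against spots that lie outside the image bounds.
height, width = labeled_img.shape
inside = (xcoords >= 0) & (xcoords < width) & (ycoords >= 0) & (ycoords < height)

# Images are indexed [row, column], i.e. [y, x].
labels = np.zeros(len(df_spots), dtype=int)
labels[inside] = labeled_img[ycoords[inside], xcoords[inside]]
df_spots["labeled_obj"] = labels
```

Here the last spot has x = 20, which is outside the 8-pixel-wide toy image, so it is treated as background rather than raising an IndexError.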

From there it takes a bit of data cleanup to get to the sparse matrix you are after, but hopefully this gives you the data you need.
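
One possible way to do that cleanup is a pandas pivot table. This is a sketch, not a definitive implementation: the toy df_spots below is invented for illustration, the CellN column names are just the label numbers with a prefix, background spots (label 0) are dropped, and intensities of spots of the same gene in the same cell are summed, which may or may not be the aggregation you want.

```python
import pandas as pd

# Toy spot table that already carries the labeled_obj column.
df_spots = pd.DataFrame({
    "target_name": ["Act1", "Tub2", "Ser2", "Pol2", "Act1", "Act1"],
    "intensity":   [422.6878, 532.04517, 512.3362, 377.4311, 480.689, 100.0],
    "labeled_obj": [3, 1, 0, 2, 4, 3],
})

# Keep only spots that fall inside a segmented object (label 0 = background).
in_cells = df_spots[df_spots["labeled_obj"] > 0]

# Genes as rows, objects as columns; gene/object pairs with no spot become NaN.
expr = in_cells.pivot_table(
    index="target_name",
    columns="labeled_obj",
    values="intensity",
    aggfunc="sum",   # assumption: sum intensities of repeated spots
)
expr.columns = [f"Cell{label}" for label in expr.columns]
expr.index.name = "Gene"
print(expr)
```

If counts per gene and cell are more useful than summed intensities, aggfunc="count" could be used instead. For a genuinely sparse container, pandas sparse dtypes or scipy.sparse are options worth looking at.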
