How to make a gene expression matrix from a segmented image and the XY coordinates of identified genes in spatial transcriptomics data?

Question

I am new to image analysis in Python and am stuck on the following problem.

I have segmented images.

Using cv2.connectedComponentsWithStats I extracted the stats of the objects with the following code:

nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(thresh_image, connectivity=4)

I have another table with the xy coordinates and intensity values of point objects (spots) identified on the image in different channels. The table looks like the following:

Spot	x	y	target_id	intensity	quality	target_name
0	111.0	49.0	1	422.68780	0.519918	Act1
1	50.0	132.0	2	532.04517	1.630690	Tub2
2	427.0	141.0	3	512.33620	1.317758	Ser2
3	380.0	171.0	4	377.43110	0.911759	Pol2
4	134.0	190.0	1	480.68900	0.884888	Act1
...	...	...	...	...	...	...

As output I need a sparse expression matrix that assigns each spot in the table to the segmented object it falls in, like the following (the table is only representative and not accurate):

Gene	Cell1	Cell2	Cell3	Cell4	...
Act1	NAN	NAN	422.68780	480.68900	...
Tub2	532.04517	NAN	NAN	NAN	...
Pol2	NAN	377.43110	NAN	NAN	...
Ser2	NAN	377.43110	NAN	NAN	...
...	...	...	...	...	...

Any help or guidance would be much appreciated.

Answer 1

I'm not as familiar with the cv2 package, but if I am reading your question correctly, you can do this with skimage.measure (https://scikit-image.org/docs/dev/api/skimage.measure.html).

Assuming that your spot coordinates are in a pandas data frame called df_spots, something along these lines should work:

from skimage.measure import label, regionprops_table
import numpy as np
import pandas as pd

# Make a labeled image from the binary segmented image. Each individual object
# in the output image gets a unique integer label; background pixels are 0.
labeled_img = label(thresh_image)
# Per-object statistics; add any other properties you are interested in.
rp_table = regionprops_table(labeled_img, properties=['area', 'centroid'])

# from there you can iterate over the spot coordinates and determine the corresponding labeled object

# make arrays from the spot coordinates
xcoords = np.array(df_spots.x).astype('int') # need to change from float to int
ycoords = np.array(df_spots.y).astype('int')

# iterate over the spots and determine which labeled object they are located in
labeled_obj_lst = [] # list to hold associated segmented object index for each spot
for spot_iter in range(len(xcoords)):
    x = xcoords[spot_iter]
    y = ycoords[spot_iter]
    labeled_obj_lst.append(labeled_img[y, x])

# add the object label of each spot to df_spots as a new column (0 means background)
df_spots.loc[:, 'labeled_obj'] = labeled_obj_lst
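
The loop above can also be written as a single NumPy indexing operation. Here is a minimal sketch on toy data; the small label image and spot table are made up for illustration, and the bounds check is an added precaution rather than part of the original code. The same indexing should apply to the `output` array returned by cv2.connectedComponentsWithStats in the question, since that is also an integer label image with 0 as background.

```python
import numpy as np
import pandas as pd

# Toy label image (hypothetical): 0 is background, 1 and 2 are two objects.
labeled_img = np.zeros((6, 8), dtype=int)
labeled_img[1:3, 1:4] = 1
labeled_img[3:6, 5:8] = 2

# Toy spot table using the column names from the question.
df_spots = pd.DataFrame({
    "x": [2.0, 6.0, 0.0, 7.0],
    "y": [1.0, 4.0, 5.0, 0.0],
    "target_name": ["Act1", "Tub2", "Ser2", "Pol2"],
    "intensity": [422.7, 532.0, 512.3, 377.4],
})

# Image arrays are indexed [row, column], i.e. [y, x].
cols = df_spots["x"].round().astype(int).to_numpy()
rows = df_spots["y"].round().astype(int).to_numpy()

# Spots that fall outside the image keep label 0 (treated as background).
inside = (
    (rows >= 0) & (rows < labeled_img.shape[0])
    & (cols >= 0) & (cols < labeled_img.shape[1])
)
labels = np.zeros(len(df_spots), dtype=int)
labels[inside] = labeled_img[rows[inside], cols[inside]]

df_spots["labeled_obj"] = labels
```

Spots with `labeled_obj == 0` lie on the background and do not belong to any segmented object.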

From there it is a bit of data cleanup to get to the sparse matrix you are after, but this should give you the spot-to-object assignment you need.
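
One possible way to do that cleanup is a pandas pivot table. The sketch below uses a made-up df_spots that already has the labeled_obj column, drops background spots, and builds a gene-by-cell table. Summing intensities when a gene has several spots in one cell is an assumption here; counting spots per gene and cell (for example with aggfunc="count") may suit your analysis better. The "Cell" column prefix is only chosen to mirror the table in the question.

```python
import pandas as pd

# Hypothetical spot table after the object assignment step.
df_spots = pd.DataFrame({
    "target_name": ["Act1", "Tub2", "Ser2", "Pol2", "Act1", "Act1"],
    "intensity":   [422.0, 532.0, 512.0, 377.0, 480.0, 100.0],
    "labeled_obj": [3, 1, 2, 2, 4, 0],
})

# Label 0 is background, so those spots belong to no cell.
assigned = df_spots[df_spots["labeled_obj"] > 0]

# Rows are genes, columns are cells; gene/cell pairs with no spot become NaN.
expr = assigned.pivot_table(
    index="target_name",
    columns="labeled_obj",
    values="intensity",
    aggfunc="sum",  # assumption: sum intensities of repeated spots
)

# Rename the axes to match the layout shown in the question.
expr.columns = [f"Cell{c}" for c in expr.columns]
expr.index.name = "Gene"

print(expr)
```

Cells that contain no spots at all do not appear as columns in this table; if you need them, you can reindex the columns against the full list of object labels.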
