Count column values that fall within the range of two other column values
I have two data frames (Droplets and Nucleus) containing data from thousands of objects in an image, as shown below:
head(Droplets)
class_name object_id centroid_y centroid_x
<chr> <dbl> <dbl> <dbl>
1 Droplet 1 47 621
2 Droplet 2 173 106
3 Droplet 3 158 949
4 Droplet 4 176 627
5 Droplet 5 619 154
6 Droplet 6 631 1361
head(Nucleus)
class_name object_id area bbox_y_start bbox_x_start bbox_y_end bbox_x_end
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Nucleus 1 8973 0 95 102 213
2 Nucleus 2 1592 0 189 36 257
3 Nucleus 3 2980 0 256 43 348
4 Nucleus 4 4664 0 404 93 490
5 Nucleus 5 3973 0 486 79 560
6 Nucleus 6 737 0 564 16 635
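The droplets are points inside the nuclei. Every droplet is inside a nucleus, but a nucleus can also have 0 droplets. I am trying to find a way to count how many droplets are inside each Nucleus based on their positions. I believe a Droplet is a point and a Nucleus can be a polygon. I have read about point.in.polygon. I also tried to check whether centroid_y and centroid_x both fall within the range of the bbox. But I am not an R ninja and I don't know how to start. The desired output would be something like this: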
class_name object_id Droplets_count
1 Nucleus 1 1
2 Nucleus 2 2
3 Nucleus 3 3
4 Nucleus 4 0
5 Nucleus 5 0
6 Nucleus 6 1
Is there a simple way to do this? Thanks!
You can assign each droplet to a specific nucleus by checking its centroid against the bounding box limits:
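Droplets$Nucleus <- unlist(mapply(function(x, y) {
  # Row indices of all nuclei whose bounding box contains the point (x, y)
  result <- which(Nucleus$bbox_x_end >= x &
                  Nucleus$bbox_x_start <= x &
                  Nucleus$bbox_y_end >= y &
                  Nucleus$bbox_y_start <= y)
  # 0 marks a droplet that lies in no bounding box
  if (length(result) == 0) return(0)
  # If several boxes overlap at this point, keep only the first match
  return(result[1])
},
x = Droplets$centroid_x, y = Droplets$centroid_y, SIMPLIFY = TRUE))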
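To make the inner test concrete, here is a minimal sketch of the same bounding-box check for a single point. The `boxes` values are copied from the first two Nucleus rows; the point `(x, y)` and the names `boxes`, `hits` and `first_hit` are made up for illustration.

```r
# Bounding boxes of the first two nuclei (values from the Nucleus table)
boxes <- data.frame(
  bbox_x_start = c(95, 189),
  bbox_x_end   = c(213, 257),
  bbox_y_start = c(0, 0),
  bbox_y_end   = c(102, 36)
)

# A single hypothetical centroid, chosen to lie where the two boxes overlap
x <- 200
y <- 20

# Rows of `boxes` whose bounding box contains the point
hits <- which(boxes$bbox_x_start <= x & x <= boxes$bbox_x_end &
              boxes$bbox_y_start <= y & y <= boxes$bbox_y_end)

# Same rule as the code above: 0 if nothing matches, otherwise the first match
first_hit <- if (length(hits) == 0) 0 else hits[1]
```

Because bounding boxes can overlap, a point may match more than one nucleus. Taking the first match is a simple tie-break, but it is not guaranteed to pick the nucleus the droplet really belongs to.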
Then you can count the droplets inside each nucleus and store the result as a column of the Nucleus data frame, as follows:
Nucleus$Droplets <- sapply(seq_len(nrow(Nucleus)), function(i) {
  # Number of droplets assigned to nucleus i
  length(which(Droplets$Nucleus == i))
})
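As a possible shortcut for this counting step, base R's `tabulate()` counts how often each index from 1 to `nbins` occurs and ignores values such as the 0 used for unmatched droplets. The sketch below uses a made-up assignment vector (`assigned`) and a made-up number of nuclei (`n_nuclei`) purely for illustration.

```r
# Hypothetical assignment vector: nucleus index per droplet, 0 = no match
assigned <- c(1, 1, 3, 0, 3, 2)

# Hypothetical number of nuclei
n_nuclei <- 4

# tabulate() counts occurrences of 1..nbins; the 0 entry is ignored
counts <- tabulate(assigned, nbins = n_nuclei)
counts
# 2 1 2 0
```

With the data frames from this answer, the equivalent call would be `tabulate(Droplets$Nucleus, nbins = nrow(Nucleus))`.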
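Unfortunately, in the sample data you gave us, none of the Droplets shown falls inside any of the Nucleus bounding boxes. I have therefore modified the data frames slightly to demonstrate the code in action: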
Droplets
#> class_name object_id centroid_y centroid_x
#> 1 Droplet 1 21 152
#> 2 Droplet 2 6 126
#> 3 Droplet 3 36 301
#> 4 Droplet 4 66 426
#> 5 Droplet 5 8 599
#> 6 Droplet 6 12 602
Nucleus
#> class_name object_id area bbox_y_start bbox_x_start bbox_y_end bbox_x_end
#> 1 Nucleus 1 8973 0 95 102 213
#> 2 Nucleus 2 1592 0 189 36 257
#> 3 Nucleus 3 2980 0 256 43 348
#> 4 Nucleus 4 4664 0 404 93 490
#> 5 Nucleus 5 3973 0 486 79 560
#> 6 Nucleus 6 737 0 564 16 635
When we run the code above on these two data frames, they become:
Droplets
#> class_name object_id centroid_y centroid_x Nucleus
#> 1 Droplet 1 21 152 1
#> 2 Droplet 2 6 126 1
#> 3 Droplet 3 36 301 3
#> 4 Droplet 4 66 426 4
#> 5 Droplet 5 8 599 6
#> 6 Droplet 6 12 602 6
Nucleus
#> class_name object_id area bbox_y_start bbox_x_start bbox_y_end bbox_x_end Droplets
#> 1 Nucleus 1 8973 0 95 102 213 2
#> 2 Nucleus 2 1592 0 189 36 257 0
#> 3 Nucleus 3 2980 0 256 43 348 1
#> 4 Nucleus 4 4664 0 404 93 490 1
#> 5 Nucleus 5 3973 0 486 79 560 0
#> 6 Nucleus 6 737 0 564 16 635 2
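The counts in the Droplets column can be cross-checked with a vectorised sketch that builds a droplet-by-nucleus logical matrix with `outer()`. The helper name `count_in_bbox` is hypothetical, and the data frames are re-created here (without the columns the check does not need) so the snippet runs on its own. Note one difference from the code above: a droplet lying in several overlapping boxes is counted once for each box rather than only for the first.

```r
# Modified sample data from this answer (only the columns needed here)
Droplets <- data.frame(
  object_id  = 1:6,
  centroid_y = c(21, 6, 36, 66, 8, 12),
  centroid_x = c(152, 126, 301, 426, 599, 602)
)
Nucleus <- data.frame(
  object_id    = 1:6,
  bbox_y_start = c(0, 0, 0, 0, 0, 0),
  bbox_x_start = c(95, 189, 256, 404, 486, 564),
  bbox_y_end   = c(102, 36, 43, 93, 79, 16),
  bbox_x_end   = c(213, 257, 348, 490, 560, 635)
)

# Hypothetical helper: number of points inside each bounding box
count_in_bbox <- function(points, boxes) {
  # inside[i, j] is TRUE when point i lies in box j
  inside <- outer(points$centroid_x, boxes$bbox_x_start, ">=") &
    outer(points$centroid_x, boxes$bbox_x_end, "<=") &
    outer(points$centroid_y, boxes$bbox_y_start, ">=") &
    outer(points$centroid_y, boxes$bbox_y_end, "<=")
  colSums(inside)
}

count_in_bbox(Droplets, Nucleus)
# 2 0 1 1 0 2

# The six droplets originally posted in the question
original <- data.frame(
  centroid_y = c(47, 173, 158, 176, 619, 631),
  centroid_x = c(621, 106, 949, 627, 154, 1361)
)
count_in_bbox(original, Nucleus)
# 0 0 0 0 0 0
```

The second call agrees with the remark above that none of the originally posted droplets falls inside a bounding box. The matrix has one cell per droplet-nucleus pair, so memory use grows with the product of the two row counts.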
Data used
Droplets <- structure(list(class_name = c("Droplet", "Droplet", "Droplet",
"Droplet", "Droplet", "Droplet"),
object_id = 1:6,
centroid_y = c(21L, 6L, 36L, 66L, 8L, 12L),
centroid_x = c(152L, 126L, 301L, 426L, 599L, 602L)),
class = "data.frame", row.names = c(NA, -6L))
Nucleus <- structure(list(class_name = c("Nucleus", "Nucleus", "Nucleus",
"Nucleus", "Nucleus", "Nucleus"),
object_id = 1:6,
area = c(8973L, 1592L, 2980L, 4664L, 3973L, 737L),
bbox_y_start = c(0L, 0L, 0L, 0L, 0L, 0L),
bbox_x_start = c(95L, 189L, 256L, 404L, 486L, 564L),
bbox_y_end = c(102L, 36L, 43L, 93L, 79L, 16L),
bbox_x_end = c(213L, 257L, 348L, 490L, 560L, 635L)),
class = "data.frame", row.names = c(NA, -6L))
data.table approach
library(data.table)
# convert to data.table format using
# setDT(Droplets); setDT(Nucleus)
# Perform non-equi left join
ans <- Droplets[Nucleus, on = .(centroid_y >= bbox_y_start,
centroid_y <= bbox_y_end,
centroid_x >= bbox_x_start,
centroid_x <= bbox_x_end)][]
# summarise
ans[, .(Droplets_count = uniqueN(object_id, na.rm = TRUE)),
by = .(Nucleus_id = i.object_id)]
Nucleus_id Droplets_count
1: 1 2
2: 2 0
3: 3 1
4: 4 1
5: 5 0
6: 6 2
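For readers without data.table, the result of the non-equi join can be approximated in base R as a cross join followed by a filter. This is only a sketch of an equivalent computation on the sample data, not a description of what data.table does internally. The names `pairs`, `inside` and `result` and the `_droplet` / `_nucleus` suffixes are chosen here for illustration.

```r
# Sample data (only the columns needed for the check)
Droplets <- data.frame(
  object_id  = 1:6,
  centroid_y = c(21, 6, 36, 66, 8, 12),
  centroid_x = c(152, 126, 301, 426, 599, 602)
)
Nucleus <- data.frame(
  object_id    = 1:6,
  bbox_y_start = c(0, 0, 0, 0, 0, 0),
  bbox_x_start = c(95, 189, 256, 404, 486, 564),
  bbox_y_end   = c(102, 36, 43, 93, 79, 16),
  bbox_x_end   = c(213, 257, 348, 490, 560, 635)
)

# by = NULL gives the Cartesian product: every droplet paired with every nucleus
pairs <- merge(Droplets, Nucleus, by = NULL,
               suffixes = c("_droplet", "_nucleus"))

# Keep only the pairs where the centroid lies inside the bounding box
inside <- subset(pairs,
                 centroid_y >= bbox_y_start & centroid_y <= bbox_y_end &
                 centroid_x >= bbox_x_start & centroid_x <= bbox_x_end)

# Count matches per nucleus; the factor levels keep nuclei with no match as 0
result <- data.frame(
  Nucleus_id     = Nucleus$object_id,
  Droplets_count = as.vector(table(factor(inside$object_id_nucleus,
                                          levels = Nucleus$object_id)))
)
result
```

The cross join materialises every droplet-nucleus pair before filtering, so with thousands of objects on each side the data.table join above is likely the better choice.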
Sample data used
library(data.table)
Droplets <- fread("class_name object_id centroid_y centroid_x
Droplet 1 21 152
Droplet 2 6 126
Droplet 3 36 301
Droplet 4 66 426
Droplet 5 8 599
Droplet 6 12 602")
Nucleus <- fread("class_name object_id area bbox_y_start bbox_x_start bbox_y_end bbox_x_end
Nucleus 1 8973 0 95 102 213
Nucleus 2 1592 0 189 36 257
Nucleus 3 2980 0 256 43 348
Nucleus 4 4664 0 404 93 490
Nucleus 5 3973 0 486 79 560
Nucleus 6 737 0 564 16 635")