
Count column values that fall within the range of two other column values

I have two data frames (Droplets and Nucleus) containing data for thousands of objects in an image, like this:

head(Droplets)
  class_name object_id centroid_y centroid_x
  <chr>          <dbl>      <dbl>      <dbl>
1 Droplet            1         47        621
2 Droplet            2        173        106
3 Droplet            3        158        949
4 Droplet            4        176        627
5 Droplet            5        619        154
6 Droplet            6        631       1361


 head(Nucleus)
  class_name object_id  area bbox_y_start bbox_x_start bbox_y_end bbox_x_end
  <chr>          <dbl> <dbl>        <dbl>        <dbl>      <dbl>      <dbl>
1 Nucleus            1  8973            0           95        102        213
2 Nucleus            2  1592            0          189         36        257
3 Nucleus            3  2980            0          256         43        348
4 Nucleus            4  4664            0          404         93        490
5 Nucleus            5  3973            0          486         79        560
6 Nucleus            6   737            0          564         16        635

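Droplets are dots inside the nuclei. Every droplet is inside a nucleus, but a nucleus can also have 0 droplets. I am trying to figure out a way to count how many droplets are inside each Nucleus based on their location. I believe a Droplet is a point and a Nucleus could be a polygon. I have read about point.in.polygon. I also tried checking whether centroid_y and centroid_x both fall within the range of the bbox. But I am no R ninja and I have no idea how to start. The desired output would be something like this: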

  class_name object_id Droplets_count
1    Nucleus         1              1
2    Nucleus         2              2
3    Nucleus         3              3
4    Nucleus         4              0
5    Nucleus         5              0
6    Nucleus         6              1

Is there a simple way to do this? Thanks!

You can assign each droplet to a specific nucleus by checking its centroid against the bounding-box limits:

Droplets$Nucleus <- unlist(mapply(function(x, y) {
        result <- which(Nucleus$bbox_x_end >= x & 
                        Nucleus$bbox_x_start <= x & 
                        Nucleus$bbox_y_end >= y & 
                        Nucleus$bbox_y_start <= y)
        if(length(result) == 0) return(0)
        return(result[1])
       }, 
       x = Droplets$centroid_x, y = Droplets$centroid_y, SIMPLIFY = TRUE))

You can then count the droplets inside each nucleus and store the result as a column in the Nucleus data frame, like this:

Nucleus$Droplets <- sapply(seq(nrow(Nucleus)), function(i) {
  length(which(Droplets$Nucleus == i))})
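The counting step can also be sketched with base R's tabulate(), which counts how often each index from 1 to nbins occurs and ignores the 0 used for unmatched droplets. The vector droplet_nucleus and the value n_nuclei below are illustrative stand-ins (not names from the answer) for Droplets$Nucleus and nrow(Nucleus):

```r
# Assumed input: the nucleus index assigned to each droplet (0 = no match),
# i.e. what the mapply() step yields for the modified sample data.
droplet_nucleus <- c(1L, 1L, 3L, 4L, 6L, 6L)
n_nuclei <- 6L

# tabulate() counts occurrences of 1..nbins; values of 0 are ignored
droplets_count <- tabulate(droplet_nucleus, nbins = n_nuclei)
print(droplets_count)
```

This should give the same counts as the sapply() version while avoiding one pass over the droplets per nucleus, which may matter with thousands of objects.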

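Unfortunately, in the sample data you provided, none of the droplets shown in Droplets fall inside any of the bounding boxes in Nucleus. I therefore modified the data frames slightly to demonstrate the code in action: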

Droplets
#>   class_name object_id centroid_y centroid_x
#> 1    Droplet         1         21        152
#> 2    Droplet         2          6        126
#> 3    Droplet         3         36        301
#> 4    Droplet         4         66        426
#> 5    Droplet         5          8        599
#> 6    Droplet         6         12        602

Nucleus
#>   class_name object_id area bbox_y_start bbox_x_start bbox_y_end bbox_x_end
#> 1    Nucleus         1 8973            0           95        102        213
#> 2    Nucleus         2 1592            0          189         36        257
#> 3    Nucleus         3 2980            0          256         43        348
#> 4    Nucleus         4 4664            0          404         93        490
#> 5    Nucleus         5 3973            0          486         79        560
#> 6    Nucleus         6  737            0          564         16        635

When we run the code above on these two data frames, they become:

Droplets
#>   class_name object_id centroid_y centroid_x Nucleus
#> 1    Droplet         1         21        152       1
#> 2    Droplet         2          6        126       1
#> 3    Droplet         3         36        301       3
#> 4    Droplet         4         66        426       4
#> 5    Droplet         5          8        599       6
#> 6    Droplet         6         12        602       6

Nucleus
#>   class_name object_id area bbox_y_start bbox_x_start bbox_y_end bbox_x_end Droplets
#> 1    Nucleus         1 8973            0           95        102        213        2
#> 2    Nucleus         2 1592            0          189         36        257        0
#> 3    Nucleus         3 2980            0          256         43        348        1
#> 4    Nucleus         4 4664            0          404         93        490        1
#> 5    Nucleus         5 3973            0          486         79        560        0
#> 6    Nucleus         6  737            0          564         16        635        2

Data used

Droplets <- structure(list(class_name = c("Droplet", "Droplet", "Droplet", 
                                          "Droplet", "Droplet", "Droplet"), 
                           object_id = 1:6, 
                           centroid_y = c(21L, 6L, 36L, 66L, 8L, 12L), 
                           centroid_x = c(152L, 126L, 301L, 426L, 599L, 602L)), 
                      class = "data.frame", row.names = c(NA, -6L))

Nucleus <- structure(list(class_name = c("Nucleus", "Nucleus", "Nucleus", 
                              "Nucleus", "Nucleus", "Nucleus"), 
                          object_id = 1:6, 
                          area = c(8973L, 1592L, 2980L, 4664L, 3973L, 737L), 
                          bbox_y_start = c(0L, 0L, 0L, 0L, 0L, 0L), 
                          bbox_x_start = c(95L, 189L, 256L, 404L, 486L, 564L), 
                          bbox_y_end = c(102L, 36L, 43L, 93L, 79L, 16L), 
                          bbox_x_end = c(213L, 257L, 348L, 490L, 560L, 635L)), 
                     class = "data.frame", row.names = c(NA, -6L))
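Bounding boxes can overlap (in the data above, Nucleus 1 and Nucleus 2 share the x range 189 to 213), and the mapply() approach keeps only the first matching nucleus. As a rough sketch, assuming the same modified sample data, a droplet-by-nucleus membership matrix built with outer() makes such cases visible. The lowercase names droplets, nucleus, inside, ambiguous and unmatched are introduced here for illustration only:

```r
# Same values as the modified sample data, reduced to the columns needed
droplets <- data.frame(object_id  = 1:6,
                       centroid_y = c(21, 6, 36, 66, 8, 12),
                       centroid_x = c(152, 126, 301, 426, 599, 602))
nucleus <- data.frame(object_id    = 1:6,
                      bbox_y_start = rep(0, 6),
                      bbox_x_start = c(95, 189, 256, 404, 486, 564),
                      bbox_y_end   = c(102, 36, 43, 93, 79, 16),
                      bbox_x_end   = c(213, 257, 348, 490, 560, 635))

# inside[i, j] is TRUE when droplet j lies in the bounding box of nucleus i
inside <- outer(nucleus$bbox_x_start, droplets$centroid_x, "<=") &
          outer(nucleus$bbox_x_end,   droplets$centroid_x, ">=") &
          outer(nucleus$bbox_y_start, droplets$centroid_y, "<=") &
          outer(nucleus$bbox_y_end,   droplets$centroid_y, ">=")

# Droplets per nucleus (a droplet is counted in every box that contains it)
nucleus$Droplets_count <- rowSums(inside)

# Droplets that fall in more than one box, or in none
ambiguous <- droplets$object_id[colSums(inside) > 1]
unmatched <- droplets$object_id[colSums(inside) == 0]
```

Note that this counts a droplet once for every box containing it, which differs from the first-match rule above whenever ambiguous is non-empty. The matrix holds nrow(nucleus) * nrow(droplets) logical values, so for very large inputs a join-based approach is likely the better fit.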

A data.table approach

library(data.table)
# if the inputs are plain data frames, first convert them by reference:
#   setDT(Droplets); setDT(Nucleus)

# non-equi join that keeps every Nucleus row (unmatched nuclei get NA droplet columns)
ans <- Droplets[Nucleus, on = .(centroid_y >= bbox_y_start,
                         centroid_y <= bbox_y_end,
                         centroid_x >= bbox_x_start,
                         centroid_x <= bbox_x_end)][]
# count distinct droplets per nucleus; na.rm = TRUE makes unmatched nuclei count as 0
ans[, .(Droplets_count = uniqueN(object_id, na.rm = TRUE)), 
        by = .(Nucleus_id = i.object_id)]

   Nucleus_id Droplets_count
1:          1              2
2:          2              0
3:          3              1
4:          4              1
5:          5              0
6:          6              2

Sample data used

library(data.table)
Droplets <- fread("class_name object_id centroid_y centroid_x
    Droplet         1         21        152
    Droplet         2          6        126
    Droplet         3         36        301
    Droplet         4         66        426
    Droplet         5          8        599
    Droplet         6         12        602")

Nucleus <- fread("class_name object_id area bbox_y_start bbox_x_start bbox_y_end bbox_x_end
    Nucleus         1 8973            0           95        102        213
    Nucleus         2 1592            0          189         36        257
    Nucleus         3 2980            0          256         43        348
    Nucleus         4 4664            0          404         93        490
    Nucleus         5 3973            0          486         79        560
    Nucleus         6  737            0          564         16        635")

