
Finding the overlap values in R

d1<-data.frame(ID=c(1:6),
               Locx1=c(100,121,146,194,162,182),
               Locx2=c(148,170,184,236,196,190),
               Locy1=c(119,173,104,164,188,142),
               Locy2=c(168,180,120,210,190,213))

In the data above, Locx1 and Locx2 are the start and end points of x, and Locy1 and Locy2 are the start and end points of y. I want to find the rows in which 50% or more of the y interval (Locy1 to Locy2) lies between Locx1 and Locx2. How can I do that in R?

For instance, the 1st row fits this condition: the starting point of y (119) is between Locx1 and Locx2, and (148-119)/(168-119) is greater than 50%.

Thanks

Divide the lengths of the intersections of x and y by the full y lengths.

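## start/end columns of the x and y intervals
x <- d1[c("Locx1", "Locx2")]
y <- d1[c("Locy1", "Locy2")]

## helper FUNs
intl <- function(i) length(i[[1]]:i[[2]])  ## interval length (number of integers)
seq1 <- function(i) i[[1]]:i[[2]]  ## integer sequence from start to end via `:`

## shared integers per row, divided by the number of integers in y
res <- lengths(Map(intersect, apply(y, 1, seq1), apply(x, 1, seq1))) / apply(y, 1, intl)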
# [1] 0.6000000 0.0000000 0.0000000 0.3617021 1.0000000 0.1250000
res > .5
# [1]  TRUE FALSE FALSE FALSE  TRUE FALSE

Data:

d1 <- structure(list(ID = 1:6, Locx1 = c(100, 121, 146, 194, 162, 182
), Locy1 = c(119, 173, 104, 164, 188, 142), Locx2 = c(148, 170, 
184, 236, 196, 190), Locy2 = c(168, 180, 120, 210, 190, 213)), class = "data.frame", row.names = c(NA, 
-6L))
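
For whole-number endpoints, the same ratios can be obtained without building the sequences at all. The following is a minimal sketch, not part of the original answer: the names `ov` and `frac` are made up for illustration, and it assumes every endpoint is an integer with start <= end.

```r
d1 <- data.frame(ID = 1:6,
                 Locx1 = c(100, 121, 146, 194, 162, 182),
                 Locx2 = c(148, 170, 184, 236, 196, 190),
                 Locy1 = c(119, 173, 104, 164, 188, 142),
                 Locy2 = c(168, 180, 120, 210, 190, 213))

## number of integers shared by [Locx1, Locx2] and [Locy1, Locy2];
## clamped at 0 when the two intervals do not overlap
ov <- with(d1, pmax(0, pmin(Locx2, Locy2) - pmax(Locx1, Locy1) + 1))

## share of the integers in the y interval that also lie in the x interval
frac <- ov / with(d1, Locy2 - Locy1 + 1)

frac > 0.5
```

On the sample data this should flag rows 1 and 5, in line with the `res > .5` output shown earlier. Because everything is vectorised with `pmin`/`pmax`, no per-row sequences are allocated, which may matter for long intervals.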

If I understand you correctly:

library(dplyr)

## keep rows where the overlap covers at least half of the x interval
d1 %>% 
  filter((pmin(Locy2, Locx2) - pmax(Locy1, Locx1)) / (Locx2 - Locx1) >= 0.5)

  ID Locx1 Locx2 Locy1 Locy2
1  1   100   148   119   168
2  6   182   190   142   213
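
The filter above uses the length of the x interval as the denominator. The question can also be read as asking for the share of the y interval, which changes which rows qualify. Below is a base R sketch of both readings, treating the intervals as continuous; `overlap`, `share_of_x` and `share_of_y` are names chosen here for illustration, not taken from the answer.

```r
d1 <- data.frame(ID = 1:6,
                 Locx1 = c(100, 121, 146, 194, 162, 182),
                 Locx2 = c(148, 170, 184, 236, 196, 190),
                 Locy1 = c(119, 173, 104, 164, 188, 142),
                 Locy2 = c(168, 180, 120, 210, 190, 213))

## overlap length, clamped at 0 for disjoint intervals
overlap <- with(d1, pmax(0, pmin(Locx2, Locy2) - pmax(Locx1, Locy1)))

share_of_x <- overlap / with(d1, Locx2 - Locx1)  ## denominator used in the filter above
share_of_y <- overlap / with(d1, Locy2 - Locy1)  ## denominator described in the question

d1$ID[share_of_x >= 0.5]  ## IDs qualifying relative to x
d1$ID[share_of_y >= 0.5]  ## IDs qualifying relative to y
```

With the sample data, the x-based share should select IDs 1 and 6, while the y-based share should select IDs 1 and 5, so the choice of denominator is worth settling before filtering.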

A somewhat more long-winded, but hopefully more transparent, solution that yields the same result as @jay.sf's:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

d1<-data.frame(ID=c(1:6),
               Locx1=c(100,121,146,194,162,182),
               Locx2=c(148,170,184,236,196,190),
               Locy1=c(119,173,104,164,188,142),
               Locy2=c(168,180,120,210,190,213))

d2 <- d1 %>% mutate(seqx = purrr::map2(Locx1, Locx2, .f = ~seq(.x, .y, 1)),
                    seqy = purrr::map2(Locy1, Locy2, .f = ~seq(.x, .y, 1)),
                    intersection = purrr::map2(seqx, seqy, .f = ~intersect(.x, .y)),
                    overlap = purrr::map2_dbl(intersection, seqy, .f = ~length(.x)/length(.y)),
                    my_condition = overlap >= 0.5
                    ) 
d2 %>% select(-contains('seq'), -intersection)
#>   ID Locx1 Locx2 Locy1 Locy2   overlap my_condition
#> 1  1   100   148   119   168 0.6000000         TRUE
#> 2  2   121   170   173   180 0.0000000        FALSE
#> 3  3   146   184   104   120 0.0000000        FALSE
#> 4  4   194   236   164   210 0.3617021        FALSE
#> 5  5   162   196   188   190 1.0000000         TRUE
#> 6  6   182   190   142   213 0.1250000        FALSE

Created on 2020-09-03 by the reprex package (v0.3.0)
