Keep rows of a dataframe based on 2 columns that satisfy different ranges of criteria; there are 27 rows of ranges in the dataframe

Question

I have a data frame dat from which I need to extract entire rows based on two columns, ut and ctz. Both columns must simultaneously fall within the ranges given in the corresponding rows of the data frame range_criteria. The ranges for ut and ctz are different, and each column must satisfy its own range. If either ut or ctz is out of range, the entire row is discarded.

I have been racking my brain over this for 12 hours. I must make sure that each row of ut and ctz in dat is checked against every row of the respective range_criteria. I know I have to loop, but I am not sure how... please help!

dat <- data.frame(name = c('Tom', "Harry", "David", "Daniel", "Harri", "Davidi", "Daniely"),
             ut = c(2.4, 3.2, 3.5,9.5,5.2,6.0,45),
             ctz = c(1, 6.0, 3.5, 5.1, 51.5, 6.6, 7))

range_criteria <- data.frame(ut_min = c(0.0, 0.5, 1.0, 2.0, 7.2, 9.0, 21.0),
                             ut_max = c(5, 10, 15, 25, 30, 35, 50),
                             ctz_min = c(0, 1, 2, 3.2, 4.3, 6.3, 6.9),
                             ctz_max = c(5, 5.5, 6.1, 6.2, 6.4, 6.5, 7.8))

The expected outcome should be:

interest <- data.frame(name = c("Tom", "David", "Daniely"),
                       ut = c(2.4, 3.5, 45),
                       ctz = c(1, 3.5, 7))

Thank you so much!

Answer 1

Based on your description, it sounds like you want the i-th row of dat to satisfy both ranges specified in the i-th row of range_criteria. Is that correct?

If so, there's no need to loop (explicitly). R's vectorized comparisons make this fairly easy:

dat <- data.frame(name = c('Tom', "Harry", "David", "Daniel", "Harri", "Davidi", "Daniely"),
                  ut = c(2.4, 3.2, 3.5,9.5,5.2,6.0,45),
                  ctz = c(1, 6.0, 3.5, 5.1, 51.5, 6.6, 7))

rc <- data.frame(ut_min = c(0.0, 0.5, 1.0, 2.0, 7.2, 9.0, 21.0),
                 ut_max = c(5, 10, 15, 25, 30, 35, 50),
                 ctz_min = c(0, 1, 2, 3.2, 4.3, 6.3, 6.9),
                 ctz_max = c(5, 5.5, 6.1, 6.2, 6.4, 6.5, 7.8))

# Keep row i of dat only when both values fall inside the ranges in row i of rc
dat[dat$ut >= rc$ut_min & dat$ut <= rc$ut_max &
    dat$ctz >= rc$ctz_min & dat$ctz <= rc$ctz_max, ]

This also returns "Daniel" in addition to the other three names you mentioned, but looking at the data I think that's correct: Daniel's ut of 9.5 lies within [2.0, 25] and his ctz of 5.1 lies within [3.2, 6.2], the ranges in the fourth row.
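To make the row-by-row check easier to inspect, the one-line filter can be split into named logical vectors. This is only a sketch: the names ut_ok, ctz_ok and keep are made up for illustration, and the stopifnot() guard is an added assumption, namely that dat and the range table have the same number of rows (the title mentions 27 range rows in the real data). Without matching lengths, R recycles the shorter vector and the comparison no longer pairs row i with row i.

```r
dat <- data.frame(name = c("Tom", "Harry", "David", "Daniel", "Harri", "Davidi", "Daniely"),
                  ut = c(2.4, 3.2, 3.5, 9.5, 5.2, 6.0, 45),
                  ctz = c(1, 6.0, 3.5, 5.1, 51.5, 6.6, 7))

rc <- data.frame(ut_min = c(0.0, 0.5, 1.0, 2.0, 7.2, 9.0, 21.0),
                 ut_max = c(5, 10, 15, 25, 30, 35, 50),
                 ctz_min = c(0, 1, 2, 3.2, 4.3, 6.3, 6.9),
                 ctz_max = c(5, 5.5, 6.1, 6.2, 6.4, 6.5, 7.8))

# Assumption: one range row per data row; stop early instead of recycling silently
stopifnot(nrow(dat) == nrow(rc))

# Element i of each vector compares row i of dat with row i of rc
ut_ok  <- dat$ut  >= rc$ut_min  & dat$ut  <= rc$ut_max
ctz_ok <- dat$ctz >= rc$ctz_min & dat$ctz <= rc$ctz_max

# A row survives only if both columns are inside their own range
keep <- ut_ok & ctz_ok

interest <- dat[keep, ]
print(interest)
```

Printing ut_ok and ctz_ok separately shows which of the two conditions rejected a given row.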

Alternatively, you could use a package designed for data manipulation, such as dplyr or data.table, to do the same thing a bit more smoothly.

library(data.table)

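# Put each data row next to its range row, then convert to a data.table in place
both <- cbind(dat, rc)
setDT(both)
# between() is inclusive on both ends by default
interest <- both[between(ut, ut_min, ut_max) & between(ctz, ctz_min, ctz_max)]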

or

library(dplyr)

both <- bind_cols(dat, rc)

interest <- both %>%
  filter(ut >= ut_min & ut <= ut_max & ctz >= ctz_min & ctz <= ctz_max)
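If the row-by-row reading is not what was meant, the question's wording ("checked by every row") could also be read as: keep a row of dat when both values fit within some single row of range_criteria. The following base R sketch covers that alternative reading only; fits_any is a hypothetical name, and nothing in the question confirms this interpretation.

```r
dat <- data.frame(name = c("Tom", "Harry", "David", "Daniel", "Harri", "Davidi", "Daniely"),
                  ut = c(2.4, 3.2, 3.5, 9.5, 5.2, 6.0, 45),
                  ctz = c(1, 6.0, 3.5, 5.1, 51.5, 6.6, 7))

rc <- data.frame(ut_min = c(0.0, 0.5, 1.0, 2.0, 7.2, 9.0, 21.0),
                 ut_max = c(5, 10, 15, 25, 30, 35, 50),
                 ctz_min = c(0, 1, 2, 3.2, 4.3, 6.3, 6.9),
                 ctz_max = c(5, 5.5, 6.1, 6.2, 6.4, 6.5, 7.8))

# For each row i of dat, compare its two values against ALL rows of rc at once
# and record whether at least one range row contains both of them
fits_any <- vapply(seq_len(nrow(dat)), function(i) {
  any(dat$ut[i]  >= rc$ut_min  & dat$ut[i]  <= rc$ut_max &
      dat$ctz[i] >= rc$ctz_min & dat$ctz[i] <= rc$ctz_max)
}, logical(1))

interest_any <- dat[fits_any, ]
print(interest_any)
```

On the sample data this keeps Tom, Harry, David, Daniel and Daniely, so it does not reproduce the expected output from the question either. Unlike the row-by-row filter, it does not require dat and the range table to have the same number of rows.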

Solution 1
0 2022-09-09 19:41:35
