
Comparing two data frames and filtering rows based on their values in R

I want to compare two data frames in R that have the same column names (df1 and df2). Based on the values in each column of one of them (df2), I want to filter the other one (df1). I need to eliminate the rows of df1 in which any value is greater than or equal to the value in df2 for the same column name. In other words, I need to produce res1 below:

df1 <- data.frame( v1 = c(1,2,3,4), v2 = c(2, 10, 5, 11), v3=c(20, 25, 23, 2), v4=c(1,2,1,3) )  

> df1
  v1 v2 v3 v4
1  1  2 20  1
2  2 10 25  2
3  3  5 23  1
4  4 11  2  3

df2 <- data.frame(v1 = 4, v2 = 10, v3 =30, v4 = 3)

> df2
  v1 v2 v3 v4
1  4 10 30  3

So, the desired output res1 is generated by comparing each row of df1 with df2 by column name and eliminating the rows of df1 in which any value is greater than or equal to the column-specific threshold defined in df2:

> res1
  v1 v2 v3 v4
1  1  2 20  1
2  3  5 23  1 

We can use mapply with the < operator to compare the two data frames column by column, and use rowSums to build the index for subsetting, i.e.

df1[rowSums(mapply(`<`, df1, df2)) == ncol(df1),]
#  v1 v2 v3 v4
#1  1  2 20  1
#3  3  5 23  1
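
To see why this works, it can help to look at the intermediate results. The sketch below rebuilds the example data from the question and runs each step separately; the names `below` and `keep` are introduced here for illustration only.

```r
df1 <- data.frame(v1 = c(1, 2, 3, 4), v2 = c(2, 10, 5, 11),
                  v3 = c(20, 25, 23, 2), v4 = c(1, 2, 1, 3))
df2 <- data.frame(v1 = 4, v2 = 10, v3 = 30, v4 = 3)

# Compare column by column: the single threshold in each df2 column is
# recycled over the matching df1 column. The result is a logical matrix
# with one column per variable.
below <- mapply(`<`, df1, df2)

# Number of columns in which each row stays below its threshold
passed <- rowSums(below)

# A row is kept only if it passes in every column
keep <- passed == ncol(df1)
df1[keep, ]
```

Here rows 1 and 3 pass all four comparisons, row 2 fails on v2 (10 is not less than 10), and row 4 fails on v1, v2 and v4.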

Additionally, a fully vectorized translation of the above can be (courtesy of @RonakShah):

df1[rowSums(df1 < df2[rep(1, nrow(df1)), ]) == ncol(df1), ]
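
The line above works by repeating the single row of df2 until it has as many rows as df1, so the two data frames can be compared element by element. As an alternative sketch (not part of the original answer), base R's `sweep` can apply the thresholds across columns without building the repeated data frame. It assumes every column is numeric, since df1 is converted to a matrix first; `m`, `below` and `res` are illustrative names.

```r
df1 <- data.frame(v1 = c(1, 2, 3, 4), v2 = c(2, 10, 5, 11),
                  v3 = c(20, 25, 23, 2), v4 = c(1, 2, 1, 3))
df2 <- data.frame(v1 = 4, v2 = 10, v3 = 30, v4 = 3)

# Assumption: all columns are numeric, so a matrix conversion is safe
m <- as.matrix(df1)

# Compare each column of m (MARGIN = 2) with its threshold from df2
below <- sweep(m, 2, unlist(df2), FUN = `<`)

# Keep the rows that are below the threshold in every column
res <- df1[rowSums(below) == ncol(df1), ]
res
```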

We can use apply row-wise and check whether all the elements in the row are less than the corresponding ones in the other data frame

df1[apply(df1, 1, function(x) all(x < df2[1, ])), ]

#  v1 v2 v3 v4
#1  1  2 20  1
#3  3  5 23  1

Here is another option using Reduce with Map

df1[Reduce(`&`, Map(`<`, df1, df2)),]
#   v1 v2 v3 v4
#1  1  2 20  1
#3  3  5 23  1

Or using the tidyverse

library(dplyr)
library(purrr)
map2(df1, df2, `<`) %>% 
       reduce(`&`) %>% 
       df1[.,]
#   v1 v2 v3 v4
#1  1  2 20  1
#3  3  5 23  1
