Filtering rows based on partial matching between a data frame and a vector
I have a data frame and want to filter it based on partial matches between the names in its first column and the names in a vector.
nam <- c('mmu_mir-1-3p','mmu_mir-1-5p','mmu-mir-3-5p','mir-4','mmu-mir-6-3p') #factor
aa <- c('12854','36','5489','54485','2563') #numeric
df <- data.frame(nam,aa)
vector <- c('mir-1','mir-3','mir-6')
I need the new data frame to contain the rows where the names in df$nam partially match the names in vector. So new_df should look like this:
new_nam <- c('mmu_mir-1-3p','mmu_mir-1-5p','mmu-mir-3-5p','mmu-mir-6-3p') #factor
new_aa <- c('12854','36','5489','2563') #numeric
new_df <- data.frame(new_nam,new_aa)
We can paste the elements of 'vector' into a single string collapsed by | and use that as the pattern in grepl or str_detect to filter the rows:
library(dplyr)
library(stringr)
df %>%
  filter(str_detect(nam, str_c(vector, collapse="|")))
#           nam    aa
#1 mmu_mir-1-3p 12854
#2 mmu_mir-1-5p    36
#3 mmu-mir-3-5p  5489
#4 mmu-mir-6-3p  2563
In base R, this can be done with subset/grepl:
subset(df, grepl(paste(vector, collapse= "|"), nam))
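One caveat: the collapsed string is treated as a regular expression, so 'mir-1' would also match a name such as 'mmu-mir-10-5p'. If that is not wanted, a stricter pattern could be used. The following is only a sketch under assumptions: the data frame df2, the extra 'mmu-mir-10-5p' row and its value are hypothetical and not part of the question, and the "not followed by a digit" rule is one possible choice of boundary.

```r
# Hypothetical data: 'mmu-mir-10-5p' is added only to show the over-match.
df2 <- data.frame(
  nam = c('mmu_mir-1-3p', 'mmu-mir-10-5p', 'mmu-mir-3-5p', 'mir-4'),
  aa  = c('12854', '77', '5489', '54485')
)
vector <- c('mir-1', 'mir-3', 'mir-6')

# Plain alternation, as in the answer: 'mir-1' also matches inside 'mir-10'.
loose <- paste(vector, collapse = "|")
loose_hits <- as.character(subset(df2, grepl(loose, nam))$nam)

# Stricter variant (an assumption about the desired behaviour):
# the matched name must be followed by a non-digit or the end of the string.
strict <- paste0("(", loose, ")([^0-9]|$)")
strict_hits <- as.character(subset(df2, grepl(strict, nam))$nam)

print(loose_hits)
print(strict_hits)
```

Here loose_hits keeps the 'mmu-mir-10-5p' row while strict_hits drops it. If the names in vector could contain regex metacharacters such as '.', they would also need to be escaped before being pasted together.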