
How to use str_which to select rows which contain a string from a vector

I have a table like this:

name    <- c("Goku","Vegeta","Jiren","Gohan","Piccolo","Kurinin","Trunks","Buu","Frieza","Cell","Muten","Gotens")
surname <- c("San","San","San","San","San","San","San","Majin","Evil","San","Roshi","San")
email   <- c("goku@gmail.com","vegeta@gmail.com","jiren@patrol.ch","gohan@gmail.com","piccolo@gmail.com","kurinin@gmail.com","Trunks@gmail.com","buu@babidi.com","frieza@rampage.usa","cell@rampage.usa","muten@gmail.com","gotens@gmail.com")

table <- data.frame(name, surname, email, stringsAsFactors = FALSE)

I also have a vector of different email address endings. I want to find all rows whose email address has one of these endings:

searchvector = c("@patrol.ch", "@babidi.com", "@rampage.usa")
searchvector = as.character(searchvector)

I tried two ways to search for the rows containing an entry of searchvector:

A. Using str_detect:

library(stringr)
table[str_detect(table$email, "@patrol.ch|@babidi.com|@rampage.usa"), ]

This gives me the correct result:

     name surname              email
3   Jiren     San    jiren@patrol.ch
8     Buu   Majin     buu@babidi.com
9  Frieza    Evil frieza@rampage.usa
10   Cell     San   cell@rampage.usa

B. Using str_which, where I only ever get two rows:

table[str_which(table$email, searchvector), ]
table[str_which(table$email, c("@patrol.ch", "@babidi.com", "@rampage.usa")), ]

I get this result in both cases:

    name surname              email
8    Buu   Majin     buu@babidi.com
9 Frieza    Evil frieza@rampage.usa

Why is that? And how can I use str_which to accomplish what I want?

According to ?str_which, it is a wrapper function:

str_which() is a wrapper around which(str_detect(x, pattern)), and is equivalent to grep(pattern, x).

In order to get the same output, we need a single string in pattern. It can be created with paste by setting the collapse argument to "|":

table[str_which(table$email, paste(searchvector, collapse="|")), ]
#     name surname              email
#3   Jiren     San    jiren@patrol.ch
#8     Buu   Majin     buu@babidi.com
#9  Frieza    Evil frieza@rampage.usa
#10   Cell     San   cell@rampage.usa

This is the same pattern that the OP wrote by hand for str_detect.
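As a quick check of what paste() builds, here is a minimal sketch in base R only. It uses grep(), which the help text quoted above describes as equivalent to str_which, so no package is needed. The email vector is copied from the question. The names pattern and hits are just illustrative.

```r
# Data copied from the question
email <- c("goku@gmail.com", "vegeta@gmail.com", "jiren@patrol.ch",
           "gohan@gmail.com", "piccolo@gmail.com", "kurinin@gmail.com",
           "Trunks@gmail.com", "buu@babidi.com", "frieza@rampage.usa",
           "cell@rampage.usa", "muten@gmail.com", "gotens@gmail.com")
searchvector <- c("@patrol.ch", "@babidi.com", "@rampage.usa")

# Collapse the three endings into ONE regular expression with alternation
pattern <- paste(searchvector, collapse = "|")
print(pattern)
# "@patrol.ch|@babidi.com|@rampage.usa"

# grep() returns the indices of all elements matching the single pattern
hits <- grep(pattern, email)
print(hits)
# 3 8 9 10
```

Note that the "." in each ending is still a regex wildcard here. That does not change the result for this data, but it could matter for other inputs.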

If we instead pass the whole vector as pattern to str_detect:

table[str_detect(table$email, searchvector),]
#   name surname              email
#8    Buu   Majin     buu@babidi.com
#9 Frieza    Evil frieza@rampage.usa

it returns the same two rows as str_which did with the OP's code.

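As for vectorization: str_detect is vectorized over both string and pattern, but here 'email' has length 12 and 'searchvector' has length 3, so the pattern is recycled. Each email is compared with only one pattern (the 1st email with the 1st pattern, the 2nd with the 2nd, the 3rd with the 3rd, the 4th with the 1st again, and so on). Only rows 8 and 9 happen to line up with the pattern that matches them, which is why rows 3 and 10 are missing.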
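The recycling can be made explicit with a small base R sketch. It does not call stringr; rep_len() and mapply() are used here only to imitate the element-by-element pairing, and the names recycled, pairwise and any_match are illustrative. Newer stringr releases (1.5.0 and later, as far as I know) apply stricter recycling rules and may raise an error for mismatched lengths instead of recycling silently, so the two-row result may not reproduce there.

```r
# Data copied from the question
email <- c("goku@gmail.com", "vegeta@gmail.com", "jiren@patrol.ch",
           "gohan@gmail.com", "piccolo@gmail.com", "kurinin@gmail.com",
           "Trunks@gmail.com", "buu@babidi.com", "frieza@rampage.usa",
           "cell@rampage.usa", "muten@gmail.com", "gotens@gmail.com")
searchvector <- c("@patrol.ch", "@babidi.com", "@rampage.usa")

# Repeat the 3 patterns until they cover all 12 emails: p1, p2, p3, p1, p2, ...
recycled <- rep_len(searchvector, length(email))

# Compare email i with recycled pattern i only (literal match, no regex)
pairwise <- mapply(function(p, s) grepl(p, s, fixed = TRUE),
                   recycled, email, USE.NAMES = FALSE)
print(which(pairwise))
# 8 9

# Compare every email with every pattern, then combine the results with OR
any_match <- Reduce(`|`, lapply(searchvector,
                                function(p) grepl(p, email, fixed = TRUE)))
print(which(any_match))
# 3 8 9 10
```

The second half is an alternative to the collapsed regex: because fixed = TRUE treats each ending literally, the "." needs no escaping.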

