Find rows in a data frame where the text in one column can be found in another column in R

Question

I want to identify rows in a data frame where the text in one column can be found in another column. For example, in the data frame below, I would like to identify the rows in which the model column contains the text in the gear column (in this case, rows 1, 2, 7, 8, and 32).

mydf <- data.frame(model = rownames(mtcars), gear = as.character(mtcars$gear), stringsAsFactors = FALSE)
mydf

                 model gear
1            Mazda RX4    4
2        Mazda RX4 Wag    4
3           Datsun 710    4
4       Hornet 4 Drive    3
5    Hornet Sportabout    3
6              Valiant    3
7           Duster 360    3
8            Merc 240D    4
9             Merc 230    4
10            Merc 280    4
11           Merc 280C    4
12          Merc 450SE    3
13          Merc 450SL    3
14         Merc 450SLC    3
15  Cadillac Fleetwood    3
16 Lincoln Continental    3
17   Chrysler Imperial    3
18            Fiat 128    4
19         Honda Civic    4
20      Toyota Corolla    4
21       Toyota Corona    3
22    Dodge Challenger    3
23         AMC Javelin    3
24          Camaro Z28    3
25    Pontiac Firebird    3
26           Fiat X1-9    4
27       Porsche 914-2    5
28        Lotus Europa    5
29      Ford Pantera L    5
30        Ferrari Dino    5
31       Maserati Bora    5
32          Volvo 142E    4

It seems like I should be able to use something like grep or match in combination with something like apply or map, or even ifelse, but I can't quite figure it out. (I could of course write a for loop, but I have several million rows of data and would prefer not to.)

Answer 1

Try this:

mydf$flag <- apply(mydf, 1, function(x) grepl(x["gear"], x["model"]))

This gives:

> head(mydf,20)
                 model gear  flag
1            Mazda RX4    4  TRUE
2        Mazda RX4 Wag    4  TRUE
3           Datsun 710    4 FALSE
4       Hornet 4 Drive    3 FALSE
5    Hornet Sportabout    3 FALSE
6              Valiant    3 FALSE
7           Duster 360    3  TRUE
8            Merc 240D    4  TRUE
9             Merc 230    4 FALSE
10            Merc 280    4 FALSE
11           Merc 280C    4 FALSE
12          Merc 450SE    3 FALSE
13          Merc 450SL    3 FALSE
14         Merc 450SLC    3 FALSE
15  Cadillac Fleetwood    3 FALSE
16 Lincoln Continental    3 FALSE
17   Chrysler Imperial    3 FALSE
18            Fiat 128    4 FALSE
19         Honda Civic    4 FALSE
20      Toyota Corolla    4 FALSE
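
One caveat with the approach above: grepl treats the gear value as a regular expression, and apply converts the whole data frame to a character matrix before iterating over it. A minimal sketch of a variant, assuming literal substring matching is what is wanted, pairs the two columns with mapply and passes fixed = TRUE. The choice of mapply here is an illustration, not something taken from the answer itself.

```r
# Rebuild the example data from the question
mydf <- data.frame(model = rownames(mtcars),
                   gear = as.character(mtcars$gear),
                   stringsAsFactors = FALSE)

# Pair each gear value with the model on the same row.
# fixed = TRUE matches gear as a literal substring rather than a regex.
mydf$flag <- mapply(grepl, mydf$gear, mydf$model,
                    MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)

# Row numbers where model contains gear
which(mydf$flag)
# [1]  1  2  7  8 32
```

For this data the result is the same as with apply, because the gear values are plain digits; the difference would only show up if the lookup column contained characters such as "." or "(".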

Answer 2

stringr, part of the tidyverse, provides str_detect, a counterpart to grepl that is vectorized over both the string and the pattern:

library(tidyverse)
mydf %>% mutate(flag = str_detect(model, gear)) %>% head()
#               model gear  flag
# 1         Mazda RX4    4  TRUE
# 2     Mazda RX4 Wag    4  TRUE
# 3        Datsun 710    4 FALSE
# 4    Hornet 4 Drive    3 FALSE
# 5 Hornet Sportabout    3 FALSE
# 6           Valiant    3 FALSE
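
By default str_detect interprets the pattern as a regular expression. If the lookup column might contain regex metacharacters, wrapping it in stringr's fixed() requests literal matching instead. The following is a minimal sketch under that assumption; it loads only stringr and uses base assignment so it does not depend on the rest of the tidyverse.

```r
library(stringr)

# Rebuild the example data from the question
mydf <- data.frame(model = rownames(mtcars),
                   gear = as.character(mtcars$gear),
                   stringsAsFactors = FALSE)

# str_detect is vectorized over string and pattern, so row i of model
# is tested against row i of gear; fixed() makes the match literal
mydf$flag <- str_detect(mydf$model, fixed(mydf$gear))

# Keep only the matching rows
mydf[mydf$flag, ]
```

With the mtcars data this selects rows 1, 2, 7, 8, and 32, the rows named in the question.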
