
Find rows in a data frame where the text in one column can be found in another column, in R

I want to identify rows in a data frame where the text in one column can be found in another column. For example, in the data frame below, I would like to identify the rows in which the model column contains the text in the gear column (in this case, rows 1, 2, 7, 8, and 32).

mydf <- cbind.data.frame(model = rownames(mtcars), gear = as.character(mtcars$gear), stringsAsFactors = FALSE)
mydf

                 model gear
1            Mazda RX4    4
2        Mazda RX4 Wag    4
3           Datsun 710    4
4       Hornet 4 Drive    3
5    Hornet Sportabout    3
6              Valiant    3
7           Duster 360    3
8            Merc 240D    4
9             Merc 230    4
10            Merc 280    4
11           Merc 280C    4
12          Merc 450SE    3
13          Merc 450SL    3
14         Merc 450SLC    3
15  Cadillac Fleetwood    3
16 Lincoln Continental    3
17   Chrysler Imperial    3
18            Fiat 128    4
19         Honda Civic    4
20      Toyota Corolla    4
21       Toyota Corona    3
22    Dodge Challenger    3
23         AMC Javelin    3
24          Camaro Z28    3
25    Pontiac Firebird    3
26           Fiat X1-9    4
27       Porsche 914-2    5
28        Lotus Europa    5
29      Ford Pantera L    5
30        Ferrari Dino    5
31       Maserati Bora    5
32          Volvo 142E    4

It seems like I should be able to use something like grep or match in combination with something like apply or map, or even ifelse, but I can't quite figure it out. (I could of course write a for loop, but I have several million rows of data and would prefer not to.)

Try this:

mydf$flag <- apply(mydf, 1, function(x) grepl(x["gear"], x["model"]))

This gives:

> head(mydf,20)
                 model gear  flag
1            Mazda RX4    4  TRUE
2        Mazda RX4 Wag    4  TRUE
3           Datsun 710    4 FALSE
4       Hornet 4 Drive    3 FALSE
5    Hornet Sportabout    3 FALSE
6              Valiant    3 FALSE
7           Duster 360    3  TRUE
8            Merc 240D    4  TRUE
9             Merc 230    4 FALSE
10            Merc 280    4 FALSE
11           Merc 280C    4 FALSE
12          Merc 450SE    3 FALSE
13          Merc 450SL    3 FALSE
14         Merc 450SLC    3 FALSE
15  Cadillac Fleetwood    3 FALSE
16 Lincoln Continental    3 FALSE
17   Chrysler Imperial    3 FALSE
18            Fiat 128    4 FALSE
19         Honda Civic    4 FALSE
20      Toyota Corolla    4 FALSE
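
A base R variation on the same idea, offered only as a sketch: mapply walks the gear and model columns in parallel, so the data frame does not need to be converted to a matrix first. It still calls grepl once per row, so it is not truly vectorized. The fixed = TRUE argument is an addition that is not in the answer above; it treats the gear value as literal text instead of a regular expression. The result is the same here because the gear values are plain digits, but it would matter if the pattern column could contain characters such as "." or "(".

```r
# Rebuild the example data from the question
mydf <- data.frame(model = rownames(mtcars),
                   gear = as.character(mtcars$gear),
                   stringsAsFactors = FALSE)

# Pair each gear value (the pattern) with the model on the same row;
# fixed = TRUE (an assumption, not in the original answer) matches
# the pattern as literal text rather than as a regex
mydf$flag <- unname(mapply(grepl, mydf$gear, mydf$model,
                           MoreArgs = list(fixed = TRUE)))

# Row numbers where model contains gear
which(mydf$flag)
# [1]  1  2  7  8 32
```

unname() drops the names that mapply would otherwise copy from the gear column.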

stringr, part of the tidyverse, provides str_detect, which works like grepl but is vectorized over both the string and the pattern:

library(tidyverse)
mydf %>% mutate(flag = str_detect(model, gear)) %>% head()
#               model gear  flag
# 1         Mazda RX4    4  TRUE
# 2     Mazda RX4 Wag    4  TRUE
# 3        Datsun 710    4 FALSE
# 4    Hornet 4 Drive    3 FALSE
# 5 Hornet Sportabout    3 FALSE
# 6           Valiant    3 FALSE
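
To get from the flag column to the rows the question asks for, something like the following sketch may help. It assumes stringr is installed and loads it on its own rather than the whole tidyverse. The variable name hits is made up for illustration, and wrapping the pattern in fixed() is an optional addition that requests a literal comparison instead of a regex match.

```r
library(stringr)

# Rebuild the example data from the question
mydf <- data.frame(model = rownames(mtcars),
                   gear = as.character(mtcars$gear),
                   stringsAsFactors = FALSE)

# Compare model and gear element by element;
# fixed() treats each gear value as literal text, not a regex
hits <- str_detect(mydf$model, fixed(mydf$gear))

# Row numbers of the matches
which(hits)
# [1]  1  2  7  8 32

# The matching rows themselves
mydf[hits, ]
```

Because the comparison runs over the whole column in one call, this should scale to several million rows far better than a row-by-row apply, though actual timings will depend on the data.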


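Notice: The technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please cite this site's URL or the original source. For any questions, contact: yoyou2525@163.com.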
