Reference column in another data frame by row entry

I have DF1 like this:

ID      Name      Team
222717  Bob       Badgers
321817  James     Tigers
521917  Eric      Possums

And DF2 like this:

Badgers    Tigers    Possums
222717     438283    521917
789423     978748    251233

I want to check whether the ID in DF1 appears under the corresponding team name in DF2. For example, in the first row, Bob's ID does appear under his team name, "Badgers," in DF2. James' ID does not appear under his team name, "Tigers," in DF2. I was thinking of adding a column that marks whether it appears or not, but I can't figure out how to reference the column in DF2. Here's what I tried.

test <- mutate(DF1,validID=ifelse(ID%in%DF2$DF1$Team,"Yes",NA))

The DF2$DF1$Team part is where I'm stuck. How do I reference the column in DF2 that corresponds to the team listed in DF1? I'm also open to alternative suggestions on how to manipulate the data to achieve this task.

The %in% function is a compact way to access the match function. mapply is the canonical method to supply multiple columns for evaluation of their corresponding values in sequence.

DF1$right2 <- mapply( function(a,b) {a %in% DF2[[b]]}, a=DF1$ID, b=as.character(DF1$Team) )
#============
> DF1
      ID  Name    Team right2
1 222717   Bob Badgers   TRUE
2 321817 James  Tigers  FALSE
3 521917  Eric Possums   TRUE
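For anyone who wants to run this from scratch, here is a minimal self-contained sketch. The data frames are rebuilt from the tables in the question (the `data.frame()` calls are my reconstruction, not the asker's code), and the argument names `id` and `team` are just illustrative. It also checks the `%in%` / `match` relationship mentioned above.

```r
# Sample data rebuilt from the tables in the question (an assumption about
# how the asker's objects were created)
DF1 <- data.frame(
  ID   = c(222717, 321817, 521917),
  Name = c("Bob", "James", "Eric"),
  Team = c("Badgers", "Tigers", "Possums"),
  stringsAsFactors = FALSE
)
DF2 <- data.frame(
  Badgers = c(222717, 789423),
  Tigers  = c(438283, 978748),
  Possums = c(521917, 251233)
)

# For each row of DF1, pull the DF2 column named by Team and test membership.
# DF2[[team]] selects a column by its name held in a character value.
DF1$right2 <- mapply(
  function(id, team) id %in% DF2[[team]],
  DF1$ID, DF1$Team
)
# DF1$right2 is TRUE, FALSE, TRUE for Bob, James, Eric

# x %in% table is shorthand for match(x, table, nomatch = 0L) > 0L
same_as_match <- identical(
  DF1$ID %in% DF2$Badgers,
  match(DF1$ID, DF2$Badgers, nomatch = 0L) > 0L
)
```

The key point is `DF2[[team]]`: double brackets accept a column name stored in a variable, which is what `DF2$DF1$Team` in the question was trying to express.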

Honestly, I find mapply hard to conceptualise, and in any case 42's answer seems to return FALSE for Eric, when it ought to return TRUE. Most likely a typo, but for future reference it's helpful to give your sample data in a format that lets people just copy the code and create the right objects!
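As a small sketch of what a copy-and-paste-friendly format can look like: base R's `dput()` prints an object as R code that recreates it. The `DF1` below is my reconstruction of the question's table, so treat it as an assumption.

```r
# Reconstruction of the question's DF1 (assumed, since the original objects
# were only shown as printed tables)
DF1 <- data.frame(
  ID   = c(222717, 321817, 521917),
  Name = c("Bob", "James", "Eric"),
  Team = c("Badgers", "Tigers", "Possums"),
  stringsAsFactors = FALSE
)

# dput() writes R code that rebuilds the object; paste that text into a question
dput(DF1)

# Round trip: capture the printed code and evaluate it to get a copy back
DF1_copy <- eval(parse(text = capture.output(dput(DF1))))
```

Posting the `dput()` text (or the `data.frame()` call itself) lets others recreate the exact object, including column types.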

This is a quick way of doing it that avoids map or apply functions, using only tidyverse tools (plus a magrittr alias, which you can swap out). Here I split "finding the right column" and "checking if the ID is there" into two steps, but you could combine them if you wanted.

library(tidyverse)
library(magrittr)
df1 <- tibble(ID = c(222717, 321817, 521917),
              Name = c("Bob", "James", "Eric"),
              Team = c("Badgers", "Tigers", "Possums")
              )
df2 <- tibble(Badgers = c(222717, 789423),
              Tigers = c(438283, 978748),
              Possums = c(521917, 251233)
              )
df1 %>%
  mutate(team_col = colnames(df2) %>% equals(Team) %>% which()) %>%
  mutate(id_exists_for_team = ID %in% as_vector(df2[team_col]))
#> # A tibble: 3 x 5
#>       ID  Name    Team team_col id_exists_for_team
#>    <dbl> <chr>   <chr>    <int>              <lgl>
#> 1 222717   Bob Badgers        1               TRUE
#> 2 321817 James  Tigers        2              FALSE
#> 3 521917  Eric Possums        3               TRUE
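One thing to watch in the pipeline above: `equals(Team)` compares `colnames(df2)` with the whole `Team` column position by position, and `ID %in% as_vector(df2[team_col])` pools every selected column, so the result appears to depend on df1's rows being in the same order as df2's columns. A sketch that does not rely on row order, assuming the same `df1` and `df2`, is to reshape `df2` to long format and join on both `ID` and `Team`. The names `roster` and `result` are mine.

```r
library(dplyr)
library(tidyr)

df1 <- tibble(
  ID   = c(222717, 321817, 521917),
  Name = c("Bob", "James", "Eric"),
  Team = c("Badgers", "Tigers", "Possums")
)
df2 <- tibble(
  Badgers = c(222717, 789423),
  Tigers  = c(438283, 978748),
  Possums = c(521917, 251233)
)

# One row per (Team, ID) pair listed in df2, flagged as present
roster <- df2 %>%
  pivot_longer(everything(), names_to = "Team", values_to = "ID") %>%
  distinct() %>%
  mutate(id_exists_for_team = TRUE)

# Rows of df1 with no matching (ID, Team) pair get NA from the join,
# which coalesce() turns into FALSE
result <- df1 %>%
  left_join(roster, by = c("ID", "Team")) %>%
  mutate(id_exists_for_team = coalesce(id_exists_for_team, FALSE))
```

With the sample data this gives TRUE, FALSE, TRUE for Bob, James and Eric, and it keeps giving the right answer per row if df1 is sorted differently or has more rows than df2 has columns.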
