
Creating a column whose values are dependent on multiple other columns

I'm trying to create a new column ("newcol") in a data frame ("data") whose values will be determined by the contents of up to two other columns in the data frame ("B_stance" and "C_stance"). The values within B_stance are either "L", "R", "U" or "N". Within C_stance they are either "L" or "R".

Please excuse the semi-logical language, but I need R code which will achieve this for the contents of newcol:

if (data$B_stance = "L" AND data$C_stance = "L") then (data$newcol = "N")
if (data$B_stance = "L" AND data$C_stance = "R") then (data$newcol = "Y")
if (data$B_stance = "R" AND data$C_stance = "R") then (data$newcol = "N")
if (data$B_stance = "R" AND data$C_stance = "L") then (data$newcol = "Y")
if (data$B_stance = "U") then (data$newcol = "N")
if (data$B_stance = "N") then (data$newcol = "N")

I've tried to see if/how "ifelse" could achieve this, but cannot find an example of how to draw on multiple column values when determining the new value.

It may be easier to create a key/value lookup dataset and then do a join:

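library(dplyr)

# Lookup table: one row per (B_stance, C_stance) pair that has an explicit rule
keydat <- data.frame(B_stance = c('L', 'L', 'R', 'R'),
                     C_stance = c('L', 'R', 'R', 'L'),
                     newcol   = c('N', 'Y', 'N', 'Y'),
                     stringsAsFactors = FALSE)

# Rows with no match in keydat (B_stance "U" or "N") get NA, then default to 'N'
left_join(data, keydat, by = c('B_stance', 'C_stance')) %>%
  mutate(newcol = replace(newcol, is.na(newcol), 'N'))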
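For readers without dplyr, the same lookup-and-join idea can be sketched in base R with merge. This is only an illustration: the sample rows in data are made up for the example, and note that merge does not preserve the original row order.

```r
# Hypothetical sample data (not from the original question)
data <- data.frame(
  B_stance = c("L", "L", "U", "R", "R", "N"),
  C_stance = c("R", "L", "L", "L", "R", "R"),
  stringsAsFactors = FALSE
)

# Same lookup table as in the answer above
keydat <- data.frame(
  B_stance = c("L", "L", "R", "R"),
  C_stance = c("L", "R", "R", "L"),
  newcol   = c("N", "Y", "N", "Y"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of data (a left join);
# rows without a match in keydat get NA in newcol
merged <- merge(data, keydat, by = c("B_stance", "C_stance"), all.x = TRUE)

# Default the unmatched rows (B_stance "U" or "N") to "N"
merged$newcol[is.na(merged$newcol)] <- "N"

merged
```

The lookup table keeps the rules in data rather than in code, so adding a rule means adding a row to keydat.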

In base R, the ifelse function handles conditions like these. The dplyr library also provides a stricter if_else function and a case_when function. ifelse returns its second argument where the test (the first argument) is TRUE and its third argument where the test is FALSE.

data <- read.table(text="
B_stance C_stance
L R
L L
U X
R L
R R
N X
X X
", header= TRUE)


data$newcol <- ifelse(data$B_stance == "L" & data$C_stance == "L", "N",
                     ifelse(data$B_stance == "L" & data$C_stance == "R", "Y",
                            ifelse(data$B_stance == "R" & data$C_stance == "R", "N",
                                   ifelse(data$B_stance == "R" & data$C_stance == "L", "Y",
                                          ifelse(data$B_stance == "U", "N",
                                                 ifelse(data$B_stance == "N", "N",
                                                        NA))))))

data

#   B_stance C_stance newcol
# 1        L        R      Y
# 2        L        L      N
# 3        U        X      N
# 4        R        L      Y
# 5        R        R      N
# 6        N        X      N
# 7        X        X   <NA>
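The rules in the question reduce to one observation: newcol is "Y" only when B_stance is "L" or "R" and differs from C_stance. Under that reading, a single ifelse may be enough. This is a sketch with made-up sample rows; unlike the nested version above, it returns "N" rather than NA for unexpected B_stance values.

```r
# Hypothetical sample data (not from the original question)
data <- data.frame(
  B_stance = c("L", "L", "U", "R", "R", "N"),
  C_stance = c("R", "L", "L", "L", "R", "R"),
  stringsAsFactors = FALSE
)

# "Y" only when B_stance is L/R and the two stances disagree; otherwise "N"
data$newcol <- ifelse(
  data$B_stance %in% c("L", "R") & data$B_stance != data$C_stance,
  "Y", "N"
)

data
```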

With dplyr you can use case_when. It's a little cleaner than nested if_else calls if you have numerous conditions.

library(dplyr)

df <- data.frame(
  B_stance = c('L', 'L', 'R', 'R'),
  C_stance = c('L', 'R', 'R', 'L'),
  stringsAsFactors = FALSE
)

df %>% mutate(
  newcol = case_when(
    B_stance == 'U'                   ~ 'N',
    B_stance == 'N'                   ~ 'N',
    B_stance == 'L' & C_stance == 'L' ~ 'N',
    B_stance == 'L' & C_stance == 'R' ~ 'Y',
    B_stance == 'R' & C_stance == 'L' ~ 'Y',
    B_stance == 'R' & C_stance == 'R' ~ 'N',
    TRUE                              ~ B_stance
  )
)

#   B_stance C_stance newcol
# 1        L        L      N
# 2        L        R      Y
# 3        R        R      N
# 4        R        L      Y

Note that case_when checks its conditions in order; for each row, the first condition that is TRUE determines the value. The final TRUE provides a fallback in case no earlier condition matches (here it returns B_stance unchanged).
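A small sketch of that ordering, using a made-up numeric vector x: the value 10 satisfies both of the first two conditions, but only the first one applies.

```r
library(dplyr)

# Hypothetical values chosen to hit each branch
x <- c(10, 2, -1)

labels <- case_when(
  x > 3 ~ "big",    # checked first; 10 matches here
  x > 0 ~ "small",  # 10 would also match, but was already assigned
  TRUE  ~ "other"   # fallback for anything not matched above
)

labels
# "big" "small" "other"
```

Because of this, more specific conditions generally belong before more general ones.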
