R: add a column where cell values are based on values in a different row

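I have a data.frame in which each row indicates whether an animal was found at a particular location.

I want to create a new column labeled "prey" in this example data.frame. The value will be 1 or 0, depending on whether a predator's prey was found at the same location (each location has a unique ID).

The problem is that each animal has its own row, so the information about whether the prey is present is not in the same row as the predator. The two predators are lion and cheetah.

For this example, the lion's prey are antelope and zebra, so:

  • For ID 1, the lion row should get a 1 in the prey column, because an antelope and a lion were both found at that location.
  • For ID 2, no antelope or zebra was found, so the lion row gets a 0 in the prey column.

The cheetah's prey are antelope, gazelles, and impala.

Below is the example data.frame. The solution I came up with is very inefficient, and I am looking for something faster/tidier.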

 df <- data.frame(ID=c(1,1,1,1,1,1, 2, 2, 2, 2, 2, 2),
             species=c("lion", "antelope", "zebra", "cheetah", "impala", "gazelles", "lion", "antelope", "zebra", "cheetah", "impala", "gazelles"),
             present=c(1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1),
             stringsAsFactors=FALSE)

k=list(list())
for (i in 1:2) { ### for loop over the 2 unique IDs
  # take the i-th unique ID (unique() is applied before indexing)
  k[[i]]=df[which(df$ID == unique(df$ID)[i]),]
  k[[i]]$antelope=0
  k[[i]]$zebra=0
  k[[i]]$impala=0
  k[[i]]$gazelle=0
  k[[i]]$lionprey=0
  k[[i]]$cheetahprey=0

  # lion is row 1 of each block; its prey are rows 2 (antelope) and 3 (zebra)
  k[[i]]$antelope[[1]]=ifelse(k[[i]]$present[[2]]==1, 1, 0)
  k[[i]]$zebra[[1]]=ifelse(k[[i]]$present[[3]]==1, 1, 0)
  k[[i]]$lionprey[[1]]=ifelse(k[[i]]$antelope[[1]] == 1 ||
                              k[[i]]$zebra[[1]] == 1, 1, 0)

  # cheetah is row 4; its prey are rows 2 (antelope), 6 (gazelles) and 5 (impala)
  k[[i]]$antelope[[4]]=ifelse(k[[i]]$present[[2]]==1, 1, 0)
  k[[i]]$gazelle[[4]]=ifelse(k[[i]]$present[[6]]==1, 1, 0)
  k[[i]]$impala[[4]]=ifelse(k[[i]]$present[[5]]==1, 1, 0)
  k[[i]]$cheetahprey[[4]]=ifelse(k[[i]]$antelope[[4]] == 1 ||
                                 k[[i]]$gazelle[[4]] == 1 || k[[i]]$impala[[4]]==1, 1, 0)

}

k=do.call("rbind", k)
k$antelope=NULL
k$zebra=NULL
k$impala=NULL
k$gazelle=NULL
k$prey=k$lionprey+k$cheetahprey
k$lionprey=NULL
k$cheetahprey=NULL

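Consider using tidyr::spread to simplify the structure of the data frame first.

library(dplyr)
library(tidyr)

# print the wide form without overwriting df; the later pipeline starts from the long df
df %>% spread(species, present)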

#>   ID antelope cheetah gazelles impala lion zebra
#>1   1        1       1        0      1    1     0
#>2   2        0       1        1      1    1     0

Then continue with dplyr:

df %>% 
  spread(species, present) %>%
  mutate(lion_prey = case_when(antelope == 1 | zebra == 1 ~ 1,
                               TRUE ~ 0),
         cheetah_prey = case_when(antelope == 1 | gazelles == 1  | impala == 1 ~ 1,
                               TRUE ~ 0)) %>%
  gather(species, present, -ID, -lion_prey, -cheetah_prey) %>%
  mutate(prey = case_when(species == "lion" ~ lion_prey,
                          species == "cheetah" ~ cheetah_prey,
                          TRUE ~ 0)) %>%
  select(-lion_prey, -cheetah_prey)

#>       ID  species present prey
#>    1   1 antelope       1    0
#>    2   2 antelope       0    0
#>    3   1  cheetah       1    1
#>    4   2  cheetah       1    1
#>    5   1 gazelles       0    0
#>    6   2 gazelles       1    0
#>    7   1   impala       1    0
#>    8   2   impala       1    0
#>    9   1     lion       1    1
#>    10  2     lion       1    0
#>    11  1    zebra       0    0
#>    12  2    zebra       0    0
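
In tidyr 1.0.0 and later, spread() and gather() are superseded by pivot_wider() and pivot_longer(). As a rough sketch, assuming a recent tidyr and dplyr are installed, the same pipeline could be written as follows. The name `result` is only illustrative, and `as.numeric()` on the logical test stands in for the two-branch case_when() calls.

```r
library(dplyr)
library(tidyr)

df <- data.frame(ID = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
                 species = c("lion", "antelope", "zebra", "cheetah", "impala", "gazelles",
                             "lion", "antelope", "zebra", "cheetah", "impala", "gazelles"),
                 present = c(1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1),
                 stringsAsFactors = FALSE)

result <- df %>%
  # one row per ID, one column per species
  pivot_wider(names_from = species, values_from = present) %>%
  # 1 if any prey species of the predator is present at that ID
  mutate(lion_prey = as.numeric(antelope == 1 | zebra == 1),
         cheetah_prey = as.numeric(antelope == 1 | gazelles == 1 | impala == 1)) %>%
  # back to one row per ID and species
  pivot_longer(cols = -c(ID, lion_prey, cheetah_prey),
               names_to = "species", values_to = "present") %>%
  # only predator rows receive their prey flag; all other rows get 0
  mutate(prey = case_when(species == "lion" ~ lion_prey,
                          species == "cheetah" ~ cheetah_prey,
                          TRUE ~ 0)) %>%
  select(ID, species, present, prey)

result
```

Unlike gather(), pivot_longer() works row by row, so the rows should come back grouped by ID in the original species order rather than sorted by species.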

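For the reasons you describe, this involves some messy logical expressions, but here is one way to do it. It has the advantage of being generalizable: if you want to add predators, just add them to predators and add their prey to predators_prey. predators_prey is a list so that it can hold predators with different numbers of prey (as is the case here):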

# define the predators
predators <- c("lion", "cheetah")

# create a list of their prey from which to programmatically extract
predators_prey <- list(lion = c("antelope", "zebra"), cheetah = c("antelope", "gazelles", "impala"))

# initialize the $prey column
df$prey <- 0

# use for loop because we're assigning a value in global env
for (predator in predators ){
  for (ID in unique(df$ID)){

    # is the predator here?
    predator_here = df[df$ID == ID & df$species == predator,]$present
    # is that predator's prey here?
    prey_here = any(df[df$ID == ID & df$present == 1,]$species %in% predators_prey[[predator]])

    # if both, then set $prey to 1
    if(predator_here & prey_here){
      df[df$ID == ID & df$species == predator,]$prey <- 1
    }
  }
}
# let's look at the result
df
#    ID  species present prey
# 1   1     lion       1    1
# 2   1 antelope       1    0
# 3   1    zebra       0    0
# 4   1  cheetah       1    1
# 5   1   impala       1    0
# 6   1 gazelles       0    0
# 7   2     lion       1    0
# 8   2 antelope       0    0
# 9   2    zebra       0    0
# 10  2  cheetah       1    1
# 11  2   impala       1    0
# 12  2 gazelles       1    0

Data:

df <- data.frame(ID=c(1,1,1,1,1,1, 2, 2, 2, 2, 2, 2),
                 species=c("lion", "antelope", "zebra", "cheetah", "impala", "gazelles", "lion", "antelope", "zebra", "cheetah", "impala", "gazelles"),
                 present=c(1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1),
                 stringsAsFactors=FALSE)
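
The nested loop above visits every predator and ID combination. A possible variant in base R, offered only as a sketch, loops over predators alone and handles all IDs at once with logical indexing. The function name `add_prey` is hypothetical and not part of the original answer. Like the loop, it sets prey to 1 only when the predator itself is also present.

```r
df <- data.frame(ID = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
                 species = c("lion", "antelope", "zebra", "cheetah", "impala", "gazelles",
                             "lion", "antelope", "zebra", "cheetah", "impala", "gazelles"),
                 present = c(1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1),
                 stringsAsFactors = FALSE)

predators_prey <- list(lion = c("antelope", "zebra"),
                       cheetah = c("antelope", "gazelles", "impala"))

# add_prey is a hypothetical helper name used for illustration
add_prey <- function(df, predators_prey) {
  df$prey <- 0
  for (predator in names(predators_prey)) {
    # IDs where at least one prey species of this predator is present
    ids_with_prey <- unique(df$ID[df$present == 1 &
                                    df$species %in% predators_prey[[predator]]])
    # rows of this predator that are present at one of those IDs
    hit <- df$species == predator & df$present == 1 & df$ID %in% ids_with_prey
    df$prey[hit] <- 1
  }
  df
}

result <- add_prey(df, predators_prey)
result
```

Adding a predator then only requires a new entry in predators_prey, since the predator names are taken from the names of that list.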
