dplyr: How do I calculate the fold change within groups based on values in another column?

Question

My current data roughly has the following pattern:

Tree   Fertilized   Region   Fruits
apple  lightly      sunny    100
apple  lightly      dark     50
apple  heavily      sunny    300
apple  heavily      dark     200
pear   lightly      sunny    150
pear   lightly      dark     200
pear   heavily      sunny    300
pear   heavily      dark     150

Here I want to calculate (as part of a bigger function) the fold change of placing the tree in a sunny place compared to a dark one, within each combination of fertilization amount and type of tree (e.g., a 2-fold change for lightly fertilized apple trees):

df%<>%
  group_by(Tree,Fertilized) %>% 
  summarise(!!paste0("fold_change_", quote(Fruits)) := .[Region == "sunny","Fruits"]/.[type == "dark","Fruits"])

However, I get an error saying that the "Fruits" column doesn't exist. Does anyone have a suggestion on how to get this working? I guess the solution is some minor syntax tweak, but I can't seem to find it myself or online.

The actual dataset has many more tree types and parameters like "Fruits", hence I picked the pipe structure and dynamic labelling of columns ("!!paste0()", ":="), which may or may not be relevant for solving this issue.

Thanks in advance to anyone trying to help!

Cheers, Rob

Answer 1

I would use a group-by operation:

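library(data.table)
library(dplyr)

# Write the example data to a temporary CSV file and read it back.
# Note: without strip.white = TRUE, read.csv() keeps the space after each
# comma, so character values come back as " lightly", " sunny", etc.
f <- tempfile()
writeLines("
Tree,  Fertilized,  Region,  Fruits
apple, lightly, sunny, 100
apple, lightly, dark, 50
apple, heavily, sunny, 300
apple, heavily, dark, 200
pear, lightly, sunny, 150
pear, lightly, dark, 200
pear, heavily, sunny, 300
pear, heavily, dark, 150
", f)
dat <- read.csv(f)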

data.table

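dat <- data.table(dat)

# order(Region) puts the "dark" row before the "sunny" row in each group,
# so Fruits[2] / Fruits[1] is sunny / dark.
dat[order(Region), .(fold_change = Fruits[2] / Fruits[1]), by=.(Tree, Fertilized)]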
#>     Tree Fertilized fold_change
#> 1: apple    lightly        2.00
#> 2: apple    heavily        1.50
#> 3:  pear    lightly        0.75
#> 4:  pear    heavily        2.00
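
The positional Fruits[2] / Fruits[1] above only works because "dark" sorts before "sunny". As a minimal sketch, the same ratio can be taken by matching on the Region label instead. This assumes every Tree/Fertilized group has exactly one sunny row and one dark row; the data are re-typed inline here (not read from the CSV) so the block runs on its own:

```r
library(data.table)

# Same eight rows as in the question, typed inline so the block is self-contained.
dat <- data.table(
  Tree       = rep(c("apple", "pear"), each = 4),
  Fertilized = rep(c("lightly", "lightly", "heavily", "heavily"), times = 2),
  Region     = rep(c("sunny", "dark"), times = 4),
  Fruits     = c(100, 50, 300, 200, 150, 200, 300, 150)
)

# Pick numerator and denominator by Region label rather than by row position.
# Assumption: exactly one "sunny" and one "dark" row per group.
res <- dat[, .(fold_change = Fruits[Region == "sunny"] / Fruits[Region == "dark"]),
           by = .(Tree, Fertilized)]
print(res)
```

With this form the result no longer depends on how the rows happen to be sorted.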

tidyverse

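# arrange(Region) sorts "dark" before "sunny", so within each group
# Fruits[2] / Fruits[1] is sunny / dark.
dat %>%
  arrange(Region) %>%
  group_by(Tree, Fertilized) %>%
  summarize(fold_change = Fruits[2] / Fruits[1])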
#> `summarise()` regrouping output by 'Tree' (override with `.groups` argument)
#> # A tibble: 4 x 3
#> # Groups:   Tree [2]
#>   Tree  Fertilized fold_change
#>   <chr> <chr>            <dbl>
#> 1 apple " heavily"        1.5 
#> 2 apple " lightly"        2   
#> 3 pear  " heavily"        2   
#> 4 pear  " lightly"        0.75
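
The question also asked for a dynamically named result column inside a larger function. In a magrittr pipe, `.` stands for the whole piped object rather than the current group, which is probably why the original attempt failed. A hedged sketch of one way to do it in dplyr follows: refer to the column by name through the `.data` pronoun and build the output name with `:=`. The helper name `fold_change_by_region` is hypothetical, the data are re-typed inline, and the code assumes one sunny and one dark row per group and dplyr 1.0.0 or later (for `.groups`):

```r
library(dplyr)

# Same eight rows as in the question, typed inline so the block is self-contained.
dat <- data.frame(
  Tree       = rep(c("apple", "pear"), each = 4),
  Fertilized = rep(c("lightly", "lightly", "heavily", "heavily"), times = 2),
  Region     = rep(c("sunny", "dark"), times = 4),
  Fruits     = c(100, 50, 300, 200, 150, 200, 300, 150),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sunny / dark fold change of column `col` (a string)
# within each Tree/Fertilized group. Assumes one row per Region per group.
fold_change_by_region <- function(df, col) {
  new_name <- paste0("fold_change_", col)
  df %>%
    group_by(Tree, Fertilized) %>%
    summarise(
      !!new_name := .data[[col]][Region == "sunny"] / .data[[col]][Region == "dark"],
      .groups = "drop"
    )
}

res <- fold_change_by_region(dat, "Fruits")
print(res)
```

Calling the helper once per parameter column (for example in a loop over column names) would give one fold_change_* table per parameter.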
