dplyr: How do I calculate fold-change within a group based on values in another column?
My current data roughly has the following pattern:
Tree Fertilized Region Fruits
apple lightly sunny 100
apple lightly dark 50
apple heavily sunny 300
apple heavily dark 200
pear lightly sunny 150
pear lightly dark 200
pear heavily sunny 300
pear heavily dark 150
Here I want to calculate (as part of a bigger function) the fold-change of placing the tree in a sunny place compared to a dark one, within each combination of fertilization amount and type of tree (e.g. a 2-fold change for lightly fertilized apple trees):
df %<>%
  group_by(Tree, Fertilized) %>%
  summarise(!!paste0("fold_change_", quote(Fruits)) := .[Region == "sunny","Fruits"]/.[type == "dark","Fruits"])
However, I get an error saying that the "Fruits" column doesn't exist. Does anyone have a suggestion on how to get this working? I guess the solution is some minor syntax tweak, but I can't seem to find it myself or online.
The actual dataset has many more tree types and parameters like "Fruits", which is why I chose the pipe structure and dynamic column naming ("!!paste0()", ":="); this may or may not be relevant to the issue.
Thanks in advance to anyone trying to help!
Cheers, Rob
I would use a group-by operation:
library(data.table)
library(dplyr)
f <- tempfile()
writeLines("
Tree, Fertilized, Region, Fruits,
apple, lightly, sunny, 100,
apple, lightly, dark, 50,
apple, heavily, sunny, 300,
apple, heavily, dark, 200,
pear, lightly, sunny, 150,
pear, lightly, dark, 200,
pear, heavily, sunny, 300,
pear, heavily, dark, 150
", f)
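dat <- read.csv(f)
dat <- data.table(dat)
# Sorting by Region puts the "dark" row before the "sunny" row in every group,
# so Fruits[2] / Fruits[1] is sunny / dark. This relies on each group having
# exactly one row per region.
dat[order(Region), .(fold_change = Fruits[2] / Fruits[1]), by=.(Tree, Fertilized)]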
#> Tree Fertilized fold_change
#> 1: apple lightly 2.00
#> 2: apple heavily 1.50
#> 3: pear lightly 0.75
#> 4: pear heavily 2.00
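The positional indexing above depends on the sort order of Region. A variant that selects the numerator and denominator by their Region value may be easier to read; the following is only a sketch, assuming each Tree/Fertilized group contains exactly one "sunny" and one "dark" row. The data is rebuilt inline here because read.csv() as called above keeps the leading spaces in the strings (note the " heavily" in the printed dplyr result), so a comparison such as Region == "sunny" would need strip.white = TRUE when reading the file.

```r
library(data.table)

# Same toy data as in the question, built inline so the strings carry no stray whitespace
dat <- data.table(
  Tree       = rep(c("apple", "pear"), each = 4),
  Fertilized = rep(c("lightly", "lightly", "heavily", "heavily"), times = 2),
  Region     = rep(c("sunny", "dark"), times = 4),
  Fruits     = c(100, 50, 300, 200, 150, 200, 300, 150)
)

# Select the values by Region instead of by row position within the group
res <- dat[, .(fold_change = Fruits[Region == "sunny"] / Fruits[Region == "dark"]),
           by = .(Tree, Fertilized)]
print(res)
```

Groups appear in order of first occurrence, so the rows should line up with the data.table result shown above.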
# The same idea with dplyr: sort by Region first, then divide by position within each group
dat %>%
  arrange(Region) %>%
  group_by(Tree, Fertilized) %>%
  summarize(fold_change = Fruits[2] / Fruits[1])
#> `summarise()` regrouping output by 'Tree' (override with `.groups` argument)
#> # A tibble: 4 x 3
#> # Groups: Tree [2]
#> Tree Fertilized fold_change
#> <chr> <chr> <dbl>
#> 1 apple " heavily" 1.5
#> 2 apple " lightly" 2
#> 3 pear " heavily" 2
#> 4 pear " lightly" 0.75
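Since the question asks for dynamic column names and mentions several measurement columns like "Fruits", the dplyr version could also be wrapped in a small helper. This is a rough sketch rather than a definitive solution: the function name fold_change_sunny_vs_dark and its col argument are made up for illustration, and it assumes dplyr 1.0 or later and exactly one "sunny" and one "dark" row per group.

```r
library(dplyr)

# Same toy data as in the question, without stray whitespace in the strings
df <- data.frame(
  Tree       = rep(c("apple", "pear"), each = 4),
  Fertilized = rep(c("lightly", "lightly", "heavily", "heavily"), times = 2),
  Region     = rep(c("sunny", "dark"), times = 4),
  Fruits     = c(100, 50, 300, 200, 150, 200, 300, 150)
)

# Hypothetical helper: `col` is the name of the measurement column, given as a string
fold_change_sunny_vs_dark <- function(data, col) {
  data %>%
    group_by(Tree, Fertilized) %>%
    summarise(
      # .data[[col]] is the current group's slice of the column;
      # the rows are picked by Region value, not by position
      !!paste0("fold_change_", col) :=
        .data[[col]][Region == "sunny"] / .data[[col]][Region == "dark"],
      .groups = "drop"
    )
}

res <- fold_change_sunny_vs_dark(df, "Fruits")
print(res)
```

Inside summarise(), column names refer to the rows of the current group, so there is no need to index the piped data frame through `.` as in the question. Calling the helper with another column name would produce a correspondingly named fold_change_ column.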