How to sum the values of one column based on specific conditions in another column in R?

Question

I have a dataset that looks something like this:

df <- data.frame(plot = c("A","A","A","A","A","B","B","B","B","B","C","C","C","C","C"),
                 species = c("Fagus","Fagus","Quercus","Picea", "Abies","Fagus","Fagus","Quercus","Picea", "Abies","Fagus","Fagus","Quercus","Picea", "Abies"),
                 value =  sample(100, size = 15, replace = TRUE)) 

head(df)
  plot species value
1    A   Fagus    53
2    A   Fagus    48
3    A Quercus     5
4    A   Picea    25
5    A   Abies    12
6    B   Fagus    12

Now I want to create a new data frame containing per-plot values for share.conifers and share.broadleaves, obtained by summing value under conditions on species. I thought about using case_when, but I am not sure how to write the syntax:

df1 <- df %>% share.broadleaves = case_when(plot = plot & species = "Fagus" or species = "Quercus" ~ FUN="sum")

df1 <- df %>% share.conifers = case_when(plot = plot & species = "Abies" or species = "Picea" ~ FUN="sum")

I know this is not right, but I would like something like this.

Answer 1

Using dplyr / tidyr:

First construct the group, do the calculation, and then spread the result into columns.

library(dplyr)
library(tidyr)

df |>
  mutate(type = case_when(species %in% c("Fagus", "Quercus") ~ "broadleaves",
                          species %in% c("Abies", "Picea") ~ "conifers")) |>
  group_by(plot, type) |>
  summarise(share = sum(value)) |>
  ungroup() |>
  pivot_wider(values_from = "share", names_from = "type", names_prefix = "share.")

Output:

# A tibble: 3 × 3
  plot  share.broadleaves share.conifers
  <chr>             <int>          <int>
1 A                   159             77
2 B                    53             42
3 C                   204             63

I am not sure whether you want the sum or the share, but the code can easily be adapted to either.
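If the share (proportion per plot) is what is wanted rather than the sum, the pipeline above could be adapted roughly as follows. This is only a sketch: the fixed value column is made up here so the result is reproducible (the question draws it with sample()), and the intermediate column total and the result name shares are names chosen for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical fixed values (assumption) so the result is reproducible;
# the question generates `value` with sample().
df <- data.frame(
  plot    = rep(c("A", "B", "C"), each = 5),
  species = rep(c("Fagus", "Fagus", "Quercus", "Picea", "Abies"), times = 3),
  value   = c(53, 48, 5, 25, 12, 12, 30, 8, 40, 10, 20, 20, 20, 20, 20)
)

shares <- df |>
  # Label each species as broadleaf or conifer
  mutate(type = case_when(species %in% c("Fagus", "Quercus") ~ "broadleaves",
                          species %in% c("Abies", "Picea") ~ "conifers")) |>
  group_by(plot, type) |>
  # Keep the grouping by plot so the next step works within each plot
  summarise(total = sum(value), .groups = "drop_last") |>
  # Divide each type total by the plot total
  mutate(share = total / sum(total)) |>
  ungroup() |>
  select(-total) |>
  pivot_wider(values_from = "share", names_from = "type", names_prefix = "share.")

shares
```

With this version the two share columns add up to 1 in every plot.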

Answer 2

One way could be to simply summarize by plot and species:

library(dplyr)
df |>
  group_by(plot, species) |>
  summarize(share = sum(value))

If you really want the share of specific species per plot, you could also do:

df |>
  group_by(plot) |>
  summarize(share_certain_species = sum(value[species %in% c("Fagus", "Quercus")]) / sum(value))

which gives:

# A tibble: 3 × 2
  plot  share_certain_species
  <chr>                 <dbl>
1 A                     0.546
2 B                     0.583
3 C                     0.480
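The same subsetting idea can probably produce both columns the question asks for in a single summarise() call, without reshaping afterwards. A minimal sketch, assuming made-up fixed values in place of the sample() call and using sums as an illustrative result name:

```r
library(dplyr)

# Hypothetical fixed values (assumption) so the result is reproducible;
# the question generates `value` with sample().
df <- data.frame(
  plot    = rep(c("A", "B", "C"), each = 5),
  species = rep(c("Fagus", "Fagus", "Quercus", "Picea", "Abies"), times = 3),
  value   = c(53, 48, 5, 25, 12, 12, 30, 8, 40, 10, 20, 20, 20, 20, 20)
)

sums <- df |>
  group_by(plot) |>
  summarise(
    # Sum only the rows whose species matches each condition
    share.broadleaves = sum(value[species %in% c("Fagus", "Quercus")]),
    share.conifers    = sum(value[species %in% c("Abies", "Picea")])
  )

sums
```

Dividing each expression by sum(value), as in the snippet above, would turn the sums into proportions.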

2 answers

Answer 1: 1 vote, accepted, 2022-12-05 11:58:50

Answer 2: 0 votes, 2022-12-05 10:29:22
