
How to sum values from one column based on specific conditions from another column in R?

I have a dataset that looks something like this:

df <- data.frame(plot = c("A","A","A","A","A","B","B","B","B","B","C","C","C","C","C"),
                 species = c("Fagus","Fagus","Quercus","Picea", "Abies","Fagus","Fagus","Quercus","Picea", "Abies","Fagus","Fagus","Quercus","Picea", "Abies"),
                 value =  sample(100, size = 15, replace = TRUE)) 

head(df)
  plot species value
1    A   Fagus    53
2    A   Fagus    48
3    A Quercus     5
4    A   Picea    25
5    A   Abies    12
6    B   Fagus    12

Now, I want to create a new data frame containing per-plot values for share.conifers and share.broadleaves by summing value with conditions applied to species. I thought about using case_when, but I am not sure how to write the syntax:

df1 <- df %>% share.broadleaves = case_when(plot = plot & species = "Fagus" or species = "Quercus" ~ FUN="sum")

df1 <- df %>% share.conifers = case_when(plot = plot & species = "Abies" or species = "Picea" ~ FUN="sum")

I know this is not right, but I would like something like this.

Using dplyr / tidyr:

First construct the grouping variable, do the calculation per group, and then spread the result into columns.

library(dplyr)
library(tidyr)

df |>
  mutate(type = case_when(species %in% c("Fagus", "Quercus") ~ "broadleaves",
                          species %in% c("Abies", "Picea") ~ "conifers")) |>
  group_by(plot, type) |>
  summarise(share = sum(value)) |>
  ungroup() |>
  pivot_wider(values_from = "share", names_from = "type", names_prefix = "share.")

Output:

# A tibble: 3 × 3
  plot  share.broadleaves share.conifers
  <chr>             <int>          <int>
1 A                   159             77
2 B                    53             42
3 C                   204             63
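
If proportions rather than raw sums are wanted, the pipeline above could be adapted by dividing each group total by the plot total before pivoting. This is only a sketch, assuming df is built as in the question; the set.seed(1) call and the intermediate column name total are additions for illustration and are not part of the original code.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed so the random values are reproducible
df <- data.frame(
  plot = rep(c("A", "B", "C"), each = 5),
  species = rep(c("Fagus", "Fagus", "Quercus", "Picea", "Abies"), times = 3),
  value = sample(100, size = 15, replace = TRUE)
)

shares <- df |>
  mutate(type = case_when(species %in% c("Fagus", "Quercus") ~ "broadleaves",
                          species %in% c("Abies", "Picea") ~ "conifers")) |>
  group_by(plot, type) |>
  # "drop_last" keeps the result grouped by plot for the next step
  summarise(total = sum(value), .groups = "drop_last") |>
  # fraction of the plot total contributed by each type
  mutate(share = total / sum(total)) |>
  ungroup() |>
  select(-total) |>
  pivot_wider(values_from = "share", names_from = "type", names_prefix = "share.")

shares
```

With this variant the two share columns add up to 1 within each plot.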

I am not sure whether you want the sum or the share, but the code can easily be adapted to whichever goal you have.

One way could be simply summarizing by plot and species:

library(dplyr)
df |>
  group_by(plot, species) |>
  summarize(share = sum(value))

If you really want the share of specific species per plot, you could also do:

df |>
  group_by(plot) |>
  summarize(share_certain_species = sum(value[species %in% c("Fagus", "Quercus")]) / sum(value))

which gives:

# A tibble: 3 × 2
  plot  share_certain_species
  <chr>                 <dbl>
1 A                     0.546
2 B                     0.583
3 C                     0.480
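
The same subsetting idea could also produce both columns named in the question in a single summarize() call, without pivoting. A minimal sketch, assuming df is built as in the question; the vectors broadleaves and conifers and the set.seed(1) call are names and settings introduced here for illustration only.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the random values are reproducible
df <- data.frame(
  plot = rep(c("A", "B", "C"), each = 5),
  species = rep(c("Fagus", "Fagus", "Quercus", "Picea", "Abies"), times = 3),
  value = sample(100, size = 15, replace = TRUE)
)

# hypothetical helper vectors defining the two species groups
broadleaves <- c("Fagus", "Quercus")
conifers <- c("Abies", "Picea")

result <- df |>
  group_by(plot) |>
  summarize(share.broadleaves = sum(value[species %in% broadleaves]),
            share.conifers = sum(value[species %in% conifers]))

result
```

Dividing each expression by sum(value) would give proportions instead of sums.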

