R - Combine rows to get average/mean based on conditions

Hi, I have a question about merging rows in R to get the average of one column, based on conditions on the values of other columns.

I would like to merge 2 rows in a data frame to get the average value of a column, based on conditions on other columns. For instance (see the example data set below), when depth == 20 & Species == "Diatoms" & locationID == "A", I would like to compute the average of the column quantity, store this value in one of the 2 rows and delete the other.

structure(list(depth = c(20, 20, 2, 4, 10), Species = c("Diatoms", 
"Diatoms", "Dinoflagellates", "Dinoflagellates", "Ciliates"), 
    locationID = c("A", "A", "B", "C", "A"), quantity = c(2, 
    4, 1, 2, 5)), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))

Since it looks like your data frame is set up as a tibble, I'm guessing you are using dplyr. In that case, you should be able to combine group_by() and summarize() to do what you are trying to do. Are you looking for the average quantity for every combination of depth, species and location, or for just one combination?

Example using the mtcars data set:

library(dplyr)

mtcars %>%
      group_by(gear, cyl, carb) %>%
      summarize(hp.mean = mean(hp))

# # A tibble: 12 x 4
# # Groups:   gear, cyl [8]
# gear   cyl  carb hp.mean
# <dbl> <dbl> <dbl>   <dbl>
# 1     3     4     1    97  
# 2     3     6     1   108. 
# 3     3     8     2   162. 
# 4     3     8     3   180  
# 5     3     8     4   228  
# 6     4     4     1    72.5
# 7     4     4     2    79.5
# 8     4     6     4   116. 
# (remaining 4 of the 12 rows not shown)

This approach merges all rows that share the same gear, cyl and carb values into a single row and averages (takes the mean of, in this case) hp across the matching rows.
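Applied to the data from the question, the same pattern might look like the sketch below. The name `df` is an assumption (the question only shows the `dput()` output), the data is rebuilt as a plain data frame for brevity, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Example data from the question, stored under the assumed name `df`
df <- data.frame(
  depth      = c(20, 20, 2, 4, 10),
  Species    = c("Diatoms", "Diatoms", "Dinoflagellates",
                 "Dinoflagellates", "Ciliates"),
  locationID = c("A", "A", "B", "C", "A"),
  quantity   = c(2, 4, 1, 2, 5),
  stringsAsFactors = FALSE
)

# One row per depth/Species/locationID combination;
# rows sharing all three values are collapsed and their quantity averaged
merged <- df %>%
  group_by(depth, Species, locationID) %>%
  summarise(quantity = mean(quantity), .groups = "drop")

merged
```

The two Diatoms rows at depth 20, location A (quantities 2 and 4) collapse into a single row with quantity 3. Combinations that occur only once keep their original quantity, since the mean of a single value is that value.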

If, on the other hand, you just want the mean for one specific case, you can subset or filter the data first and then take the mean.

mtcars %>%
      filter(
            gear == 3,
            cyl == 8, 
            carb == 3
      ) %>%
      pull(hp) %>%
      mean()
# [1] 180

# base approach to return single answer
mean(mtcars$hp[mtcars$gear == 3 &
               mtcars$cyl == 8 &
               mtcars$carb == 3
               ])
# [1] 180
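The base approach above returns a single number. If you would rather collapse every duplicated combination without dplyr, `aggregate()` with a formula is one option. This is a sketch on the question's data, again assuming it is stored as `df`; `merged_base` is just an illustrative name.

```r
# Example data from the question, stored under the assumed name `df`
df <- data.frame(
  depth      = c(20, 20, 2, 4, 10),
  Species    = c("Diatoms", "Diatoms", "Dinoflagellates",
                 "Dinoflagellates", "Ciliates"),
  locationID = c("A", "A", "B", "C", "A"),
  quantity   = c(2, 4, 1, 2, 5),
  stringsAsFactors = FALSE
)

# Mean of quantity for each depth/Species/locationID combination (base R only)
merged_base <- aggregate(quantity ~ depth + Species + locationID,
                         data = df, FUN = mean)

merged_base
```

The result has one row per combination, so the two matching Diatoms rows are replaced by a single row holding their mean.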
