How to sum row percentages based on the value of another column in the same row in R?

Question

My dataset is composed of whale calls. I have two variables: nclicks and percent. nclicks is the number of clicks in a call (it ranges from 3 to 30). percent is the frequency with which that type of call was made in a given year. I would like to sum the percentages for the longer calls (those with 11+ clicks) and add a new row to the data frame that has 11+ for nclicks and the summed percentage for percent. I then want to delete the rows that made up the new row.

I've tried coding nclicks as both a factor and a numeric. I've used combinations of aggregate, rowSums, rbind, etc., but with no luck. The closest I've come was getting a new row with the summed percentages, but I had to specify manually which rows to include (see the example below). This method also summed the nclicks values: in the example below I get a new row with 43 (11+12+20) in nclicks and 20 in percent, when I really want the row number to be 4, nclicks to be 11+, and percent to be 20.

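nclicks <- c(3, 4, 5, 11, 12, 20)
percent <- c(30, 30, 20, 10, 5, 5)
df <- data.frame(cbind(nclicks, percent))

# Add a row named "11+" by summing rows 4, 5 and 6 by hand (this sums nclicks too)
df["11+", ] <- df["4", ] + df["5", ] + df["6", ]

# Drop the rows that were summed
df <- df[-c(4, 5, 6), ]
df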

This is what I end up with:

    nclicks percent
1         3      30
2         4      30
3         5      20
11+      43      20

I want to sum the percentages of the rows for which nclicks is > 10, but I'm having trouble doing this. I don't want to specify individually which values of nclicks to include, because some years have many different nclicks values > 10 while others have only a few.

Answer 1

You can create a group column to help aggregate the rows where nclicks >= 11.

library("tidyverse")

nclicks <- c(3, 4, 5, 11, 12, 20)
percent <- c(30, 30, 20, 10, 5, 5)

df <- tibble(nclicks, percent)
df <- df %>%
  mutate(group = ifelse(nclicks >= 11, "11+", nclicks)) %>%
  group_by(group) %>%
  summarise_at(vars(nclicks, percent), sum)
df
#> # A tibble: 4 x 3
#>   group nclicks percent
#>   <chr>   <dbl>   <dbl>
#> 1 11+        43      20
#> 2 3           3      30
#> 3 4           4      30
#> 4 5           5      20

Created on 2019-03-31 by the reprex package (v0.2.1)
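
The output above still carries a summed nclicks of 43 and sorts the "11+" group first, whereas the question asks for a single last row with 11+ in nclicks and 20 in percent. A minimal base R sketch of that layout follows. It is an alternative to the tidyverse code, not a replacement for it. The names long, short_calls, long_calls and result are introduced here for illustration only, and nclicks is assumed to be acceptable as a character column so that it can hold "11+".

```r
nclicks <- c(3, 4, 5, 11, 12, 20)
percent <- c(30, 30, 20, 10, 5, 5)
df <- data.frame(nclicks, percent)

# Logical index of the rows to pool: every call type with more than 10 clicks
long <- df$nclicks > 10

# Short calls are kept as they are; nclicks becomes character so "11+" fits
short_calls <- data.frame(
  nclicks = as.character(df$nclicks[!long]),
  percent = df$percent[!long],
  stringsAsFactors = FALSE
)

# One pooled row holding the summed percentage of all long calls
long_calls <- data.frame(
  nclicks = "11+",
  percent = sum(df$percent[long]),
  stringsAsFactors = FALSE
)

# Short calls first, pooled row last; row names are renumbered 1..n
result <- rbind(short_calls, long_calls)
result
```

Because the rows are selected with the condition nclicks > 10 rather than by position, the same code should work for years with any number of distinct long call types. One caveat: for a year with no calls above 10 clicks, sum() of an empty vector is 0, so the sketch would still add an "11+" row with percent 0.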

Solution 1
0 2019-03-30 23:30:43