R: conditional rowSums, replacing rows with a sum based on percentage

Question

I'm looking to conditionally sum rows that each represent <1% of the data, and then replace the original rows with the summed row. Bonus if the name column could include the number of rows that were summed (e.g., "Other(n=2)"). This is a small part of a much larger function. See the example below:

Example data:

name	Year1	Year2	Year3	Total	Percent
John	1	2	1	4	0.7029877
Paul	230	100	150	480	84.358524
George	41	30	10	81	14.235501
Ringo	2	1	1	4	0.7029877

# Code for example data
library(dplyr)  # needed for select()
name <- c("John", "Paul", "George", "Ringo")
Year1 <- c(1, 230, 41, 2)
Year2 <- c(2, 100, 30, 1)
Year3 <- c(1, 150, 10, 1)
df <- data.frame(name, Year1, Year2, Year3)
# Row total across the year columns, then each row's share of the grand total (in %)
df$Total <- rowSums(select(df, Year1:Year3))
df$Percent <- df$Total / sum(df$Total) * 100

In the solution, John and Ringo would be combined into one 'Other' row, since both have Percent < 1.

# Code for example solution
name <- c("Paul", "George", "Other(n=2)")
Year1 <- c(230, 41, 3)
Year2 <- c(100, 30, 3)
Year3 <- c(150, 10, 2)
df2 <- data.frame(name, Year1, Year2, Year3)
df2$Total <- rowSums(select(df2,Year1:Year3))
df2$Percent <- df2$Total/sum(df2$Total)*100

Example solution:

name	Year1	Year2	Year3	Total	Percent
Paul	230	100	150	480	84.358524
George	41	30	10	81	14.235501
Other(n=2)	3	3	2	8	1.405975

Answer 1

library(tidyverse) # or use forcats::fct_lump(...
df %>% 
  mutate(name_lumped = fct_lump(name, w = Percent, prop = 0.01)) %>%
  group_by(name_lumped) %>%
  summarize(across(Year1:Percent, sum))

# A tibble: 3 x 6
  name_lumped Year1 Year2 Year3 Total Percent
  <fct>       <dbl> <dbl> <dbl> <dbl>   <dbl>
1 George         41    30    10    81   14.2 
2 Paul          230   100   150   480   84.4 
3 Other           3     3     2     8    1.41
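
For the bonus part of the question (putting the number of lumped rows into the name, e.g. "Other(n=2)"), here is a minimal sketch. It is not part of the answer above: it swaps fct_lump for a plain dplyr if_else so the count can be computed first, and the names threshold, n_small, other_label and result are my own, introduced only for illustration.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  name  = c("John", "Paul", "George", "Ringo"),
  Year1 = c(1, 230, 41, 2),
  Year2 = c(2, 100, 30, 1),
  Year3 = c(1, 150, 10, 1),
  stringsAsFactors = FALSE
)
df$Total   <- rowSums(df[, c("Year1", "Year2", "Year3")])
df$Percent <- df$Total / sum(df$Total) * 100

threshold   <- 1                                  # assumed cutoff, in percent
n_small     <- sum(df$Percent < threshold)        # number of rows to be lumped
other_label <- paste0("Other(n=", n_small, ")")   # e.g. "Other(n=2)"

result <- df %>%
  # Relabel every row below the cutoff with the shared "Other(n=...)" name
  mutate(name = if_else(Percent < threshold, other_label, name)) %>%
  group_by(name) %>%
  # Sum the year columns, Total and Percent within each name
  summarize(across(Year1:Percent, sum), .groups = "drop") %>%
  # Keep the lumped row last; order the remaining rows by Total
  arrange(name == other_label, desc(Total))

print(result)
```

On the example data this should give three rows (Paul, George, Other(n=2)), with the lumped row holding 3, 3, 2 for the years and a Total of 8. If no row falls below the cutoff, nothing is relabeled and the table comes back unchanged apart from the ordering.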
