
Group rows into a new row and sum in R

So I have data like this:

 Week        Total Amount        Person
   1            $5                 A
   1            $5                 B
   1            $4                 C
   1            $2                 D
   1            $1                 E
   2            $5                 A
   2            $1                 B
   2            $1                 H
   2            $3                 G
   2            $5                 C
   2            $5                 F

How can I make it so that each week shows the top three people and all remaining amounts are summed into "Others"? I want it to show:

 Week        Total Amount        Person
   1            $5                 A
   1            $5                 B
   1            $4                 C
   1            $3                 Others
   2            $5                 A
   2            $5                 C
   2            $5                 F
   2            $5                 Others

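Note that the amounts for everyone outside the top three are summed into a new total amount, and this has to work for an arbitrary number of rows per week (for example, week 1 has 5 rows, week 2 has 6, week 3 might have 8 or 10, and week 4 might have only 1). I want the same rule applied to every week.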

This is easy with the tidyverse. Say the data is in a data frame named df.

library(tidyverse)

df.new <- df %>%
  group_by(Week) %>%
  # Sort largest first so that rows 1-3 within each week are the top three
  arrange(desc(`Total Amount`), .by_group = TRUE) %>%
  mutate(Person = ifelse(row_number() > 3, "Others", Person)) %>%
  group_by(Week, Person) %>%
  summarize(`Total Amount` = sum(`Total Amount`))

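If the column contains "$" (that is, it is a character column), you first need to convert it to numeric before the summarize step will work. You can use a function such as parse_number() to do this.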
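
As a small illustration of that conversion step, here is a sketch using readr's parse_number() on a hypothetical vector of dollar strings (the values are made up for the example and are not from the question's data):

```r
library(readr)

# Hypothetical dollar strings, as they might appear in a character column
amounts <- c("$5", "$12", "$1,250")

# parse_number() drops the currency symbol and the grouping comma
amounts_num <- parse_number(amounts)
amounts_num
```

Once the column is numeric, sum() inside summarize() works as expected.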

Base R:

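# Rank amounts within each week (largest first, ties broken by row order);
# everything past rank 3 is relabelled "Others"
df$Person[ave(-df$`Total Amount`, df$Week, FUN = function(x)
    rank(x, ties.method = "first")) > 3] = "Others"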
df2 = aggregate(df["Total Amount"], df[c("Week", "Person")], sum)
df2[order(df2$Week, df2$Person),]
#  Week Person Total Amount
#1    1      A            5
#3    1      B            5
#4    1      C            4
#7    1 Others            3
#2    2      A            5
#5    2      C            5
#6    2      F            5
#8    2 Others            5

Data:

df = structure(list(Week = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
2L), `Total Amount` = c(5L, 5L, 4L, 2L, 1L, 5L, 1L, 1L, 3L, 5L, 
5L), Person = c("A", "B", "C", "D", "E", "A", "B", "H", "G", 
"C", "F")), .Names = c("Week", "Total Amount", "Person"), class = "data.frame",
row.names = c(NA, -11L))
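
Since the question asks for something that works with any number of rows per week, the base R idea can be wrapped in a function. The following is only a sketch: the helper name top_n_with_others and the toy data are hypothetical, and it uses rank() to get each row's position within its week.

```r
# Hypothetical helper: keep the top `n` rows per week, lump the rest
top_n_with_others <- function(data, n = 3) {
  # Rank within each week, largest amount first; ties broken by row order
  rnk <- ave(-data$`Total Amount`, data$Week,
             FUN = function(x) rank(x, ties.method = "first"))
  data$Person[rnk > n] <- "Others"
  out <- aggregate(data["Total Amount"], data[c("Week", "Person")], sum)
  out[order(out$Week, out$Person), ]
}

# Toy data (made up): week 1 has four rows, week 2 has only one
toy <- data.frame(Week = c(1L, 1L, 1L, 1L, 2L),
                  `Total Amount` = c(1, 5, 4, 3, 7),
                  Person = c("A", "B", "C", "D", "E"),
                  check.names = FALSE, stringsAsFactors = FALSE)
res <- top_n_with_others(toy)
res
```

In week 1 only person A (the smallest amount) falls outside the top three, and week 2 gets no "Others" row because it has fewer than four people.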

Here is one way you could do it:

library(tidyverse)

df <- df %>% 
  group_by(Week) %>% 
  arrange(desc(Total_Amount), .by_group = TRUE) %>% 
  mutate(id = row_number()) %>% 
  mutate(Person = case_when(id > 3 ~ "Others",
                            TRUE ~ as.character(Person)))

Then remove the $ sign so that Total_Amount can be summed:

df$Total_Amount <- as.numeric(gsub("\\$", "", df$Total_Amount))

Finally, sum Total_Amount by group and add the $ sign back so the output matches the original format:

df %>% 
  group_by(Week, Person) %>% 
  summarise(Total_Amount = sum(Total_Amount)) %>% 
  mutate(Total_Amount = paste0("$", Total_Amount)) %>% 
  select(Week, Total_Amount, Person)

Which returns:

# A tibble: 8 x 3
# Groups:   Week [2]
   Week Total_Amount Person
  <int>        <chr>  <chr>
1     1           $5      A
2     1           $5      B
3     1           $4      C
4     1           $3 Others
5     2           $5      A
6     2           $5      C
7     2           $5      F
8     2           $5 Others
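
One caveat with sorting the "$"-prefixed strings directly: text sorts character by character, so "$10" would be ordered below "$5". The sketch below is a variant (not the answer's exact code) that strips the "$" before ranking; the data frame df_chr and its values are hypothetical and exist only to show the two-digit case.

```r
library(dplyr)

# Hypothetical data with a two-digit amount stored as text
df_chr <- data.frame(Week = c(1L, 1L, 1L, 1L, 1L),
                     Total_Amount = c("$5", "$10", "$4", "$2", "$1"),
                     Person = c("A", "B", "C", "D", "E"),
                     stringsAsFactors = FALSE)

result <- df_chr %>%
  # Convert to numeric first so that 10 sorts above 5
  mutate(Total_Amount = as.numeric(gsub("\\$", "", Total_Amount))) %>%
  group_by(Week) %>%
  arrange(desc(Total_Amount), .by_group = TRUE) %>%
  mutate(Person = if_else(row_number() > 3, "Others", Person)) %>%
  group_by(Week, Person) %>%
  summarise(Total_Amount = sum(Total_Amount), .groups = "drop") %>%
  # Put the $ sign back for display
  mutate(Total_Amount = paste0("$", Total_Amount))
result
```

Here B ($10) stays in the top three, and D and E are combined into an "Others" row of $3.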

