
Grouping and summing similar rows in R

So I have this dataframe:

# A tibble: 268 x 7
 Age   Facebook_likes Instagram_likes Twitter_likes Tiktok_likes Reddit_likes
 <chr>          <dbl>           <dbl>         <dbl>        <dbl>        <dbl>
1 18-24              1               1             0            0            0
2 <18                0               0             0            0            0
3 18-24              1               1             1            0            0
4 18-24              0               0             0            0            0
5 18-24              0               0             0            0            0
6 25-34              0               1             0            0            0
7 18-24              1               1             0            0            0
8 18-24              0               1             0            0            0
9 25-34              0               0             0            0            1
10 18-24              1               0             0            0            0
# ... with 258 more rows, and 1 more variable:

The Age variable has only 4 distinct values (<18, 18-24, 25-34, >35). What I want to do is transform this dataframe so that I only have those 4 rows, with each variable being the sum. For example, the first cell (first column x first row) would hold the sum of Facebook likes for those who are <18:

# 
   Age   Facebook_likes                     Instagram_likes Twitter_likes Tiktok_likes Reddit_likes
   <chr>          <dbl>                              <dbl>         <dbl>        <dbl>        <dbl>
 1 <18    sum(facebook_likes for those <18)               
 2 18-24                
 3 25-34            
 4 >35           
 

We can use summarise with across from the tidyverse after grouping by 'Age':

library(dplyr)

df1 %>%
  group_by(Age) %>%
  # sum every numeric column within each Age group, ignoring NAs
  summarise(across(where(is.numeric), ~ sum(.x, na.rm = TRUE)))
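
To check the result on a small case, here is a minimal sketch on made-up data. The `toy` tibble and its values are assumptions for illustration, not the asker's data. The sum is written as a lambda because dplyr 1.1.0 and later deprecate passing extra arguments such as `na.rm` through `across()`.

```r
library(dplyr)

# Hypothetical toy data standing in for the 268-row tibble in the question
toy <- tibble(
  Age             = c("18-24", "<18", "18-24", "25-34", "25-34", ">35"),
  Facebook_likes  = c(1, 0, 1, 0, NA, 1),
  Instagram_likes = c(1, 0, 1, 1, 0, 0)
)

# One row per Age group; each numeric column becomes its group total
totals <- toy %>%
  group_by(Age) %>%
  summarise(across(where(is.numeric), ~ sum(.x, na.rm = TRUE)))

totals
```

The result has one row per distinct Age value. The `NA` in the 25-34 group is dropped by `na.rm = TRUE` rather than turning that group's total into `NA`.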

With data.table:

library(data.table)

cols_likes <- grep("_likes$", names(df), value = TRUE)

or

cols_likes <- sapply(df, is.numeric)

setDT(df)[, lapply(.SD, sum, na.rm = TRUE), by = Age, .SDcols = cols_likes]
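
The same data.table approach can be tried on a small made-up table. This is only a sketch: `toy`, `dt` and `totals_dt` are hypothetical names, and the values are invented. It uses `as.data.table()` instead of `setDT()` so the original data frame is left as it was, since `setDT()` converts its argument by reference.

```r
library(data.table)

# Hypothetical toy data; values are made up for illustration
toy <- data.frame(
  Age             = c("18-24", "<18", "18-24", "25-34", "25-34", ">35"),
  Facebook_likes  = c(1, 0, 1, 0, NA, 1),
  Instagram_likes = c(1, 0, 1, 1, 0, 0)
)

# Select the columns to sum by their "_likes" suffix
cols_likes <- grep("_likes$", names(toy), value = TRUE)

# Copy into a data.table, then sum the selected columns within each Age group
dt <- as.data.table(toy)
totals_dt <- dt[, lapply(.SD, sum, na.rm = TRUE), by = Age, .SDcols = cols_likes]

totals_dt
```

With `by = Age` the groups come out in order of first appearance; using `keyby = Age` instead would sort the result by Age.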
