Merge rows and values in R based on condition
I have the following sample data:
# data
school = c('ABC University','ABC Uni','DFG University','DFG U')
applicant = c(2000,3100,210,2000)
students = c(100,2000,300,4000)
df = data.frame(school,applicant,students)
I want to merge it into this:
| school         | applicant | students |
|----------------|-----------|----------|
| ABC University | 5100      | 2100     |
| DFG University | 2210      | 4300     |
I ran this code:
df$school[df$school == 'ABC Uni'] = 'ABC University'
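But it gave me ABC University twice instead of merging the two rows together.
This really depends on your other strings, but you could look at grep and use ^ to anchor the pattern at the start of the string.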
df[grep('^ABC U', df$school), 'school'] <- 'ABC University'
df[grep('^DFG U', df$school), 'school'] <- 'DFG University'
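To see which rows the anchored patterns select, here is a minimal sketch on the school vector from the question. The names `idx_abc` and `idx_dfg` are illustrative only and do not appear in the original answer.

```r
school <- c('ABC University', 'ABC Uni', 'DFG University', 'DFG U')

# grep() returns the positions of the elements that match the pattern;
# '^' anchors the match to the start of the string.
idx_abc <- grep('^ABC U', school)
idx_dfg <- grep('^DFG U', school)

print(idx_abc)  # positions of both ABC spellings
print(idx_dfg)  # positions of both DFG spellings
```

Because both spellings of each school share the prefix, one assignment per school is enough to normalize the names.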
Then aggregate as usual.
aggregate(cbind(applicant, students) ~ school, df, sum)
#           school applicant students
# 1 ABC University      5100     2100
# 2 DFG University      2210     4300
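The reason the attempt in the question still showed ABC University twice is that renaming only changes the label; it is the aggregation step that collapses the rows. A small self-contained sketch of that difference, using the question's data (the names `n_before`, `n_after`, and `merged` are assumptions introduced for illustration):

```r
df <- data.frame(
  school    = c('ABC University', 'ABC Uni', 'DFG University', 'DFG U'),
  applicant = c(2000, 3100, 210, 2000),
  students  = c(100, 2000, 300, 4000)
)

# Renaming alone only changes the label; the data frame still has four rows.
df$school[df$school == 'ABC Uni'] <- 'ABC University'
df$school[df$school == 'DFG U']   <- 'DFG University'
n_before <- nrow(df)

# aggregate() collapses rows that share a school into one summed row.
merged <- aggregate(cbind(applicant, students) ~ school, df, sum)
n_after <- nrow(merged)

print(c(before = n_before, after = n_after))
print(merged)
```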
Here is a dplyr and stringr solution:
library(dplyr)
library(stringr)
df %>%
  # normalize the short spellings to the full school names
  mutate(school = str_replace_all(school, c(
    "^ABC Uni$" = "ABC University",
    "^DFG U$" = "DFG University"))) %>%
  # sum both numeric columns within each school
  group_by(school) %>%
  summarise(across(c(applicant, students), sum))
Output:
  school         applicant students
  <chr>              <dbl>    <dbl>
1 ABC University      5100     2100
2 DFG University      2210     4300
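The `$` at the end of each pattern matters here, because the short form is a prefix of the full name. The sketch below shows the difference using base R `sub()` as a dependency-free stand-in for the stringr call (an assumption made for illustration; for these simple patterns the regex behaves the same way).

```r
school <- c('ABC University', 'ABC Uni')

# Without '$', the pattern also matches the prefix of the full name,
# so the already-correct value gets mangled.
unanchored <- sub('^ABC Uni', 'ABC University', school)

# With '$', only the exact short form is rewritten.
anchored <- sub('^ABC Uni$', 'ABC University', school)

print(unanchored)
print(anchored)
```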