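How to add a new column to a data frame with a COUNTIF-style count in R
Here is my code: the first column is "Case ID" and the second is "message". I want to add a new column to the data frame that gives, for each Case ID, the count of "welcome" in the "message" column.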
Case_id<-c("#1","#1","#1","#1","#1","#2","#2","#2","#2","#2")
message<-c("welcome to dell","welcome to dell","refresh your screen","connect to agent","Thanks good day","welcome to dell","select from default","refresh your screen","connect to agent","Thanks good day")
df <- data.frame(Case_id, message)
sum(df$Case_id=="#1" & grepl("welcome*",df$message))
Expected output:
Case id   message               Welcome message repeated
#1        "welcome to..."       2
#1        "welcome to..."       2
#1        "refresh your...."    2
#1        "connect to agent"    2
#1        "Thanks good day"     2
#2        "Welcome to..."       1
#2        ......                1
#2
#2
#2
For each group (i.e. each Case_id), we can count how many messages in the group contain "welcome".
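library(tidyverse)
# Count the messages matching the pattern within each Case_id
# and repeat that count on every row of the group
df %>%
  group_by(Case_id) %>%
  mutate("Welcome message repeated" = sum(str_detect(message, "welcome*")))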
Or in base R:
# +(...) converts the logical matches to 0/1; ave() then sums them within each Case_id
transform(df, "Welcome message repeated" = ave(+(grepl("welcome*", message)), Case_id, FUN = sum))
Output
   Case_id message             `Welcome message repeated`
   <chr>   <chr>                                    <int>
 1 #1      welcome to dell                              2
 2 #1      welcome to dell                              2
 3 #1      refresh your screen                          2
 4 #1      connect to agent                             2
 5 #1      Thanks good day                              2
 6 #2      welcome to dell                              1
 7 #2      select from default                          1
 8 #2      refresh your screen                          1
 9 #2      connect to agent                             1
10 #2      Thanks good day                              1
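One detail about the pattern may be worth a look: in a regular expression, the `*` in "welcome*" is a quantifier rather than a wildcard, so the pattern means "welcom" followed by zero or more "e" and would also match a word such as "welcoming". The sketch below is only an illustration under that assumption; it uses base R with a literal match (`fixed = TRUE`) and assigns the column with `[[<-` so the name with spaces is kept as written. The test string "welcoming back" is made up for the comparison.

```r
# Same sample data as in the question
Case_id <- c("#1", "#1", "#1", "#1", "#1", "#2", "#2", "#2", "#2", "#2")
message <- c("welcome to dell", "welcome to dell", "refresh your screen",
             "connect to agent", "Thanks good day", "welcome to dell",
             "select from default", "refresh your screen",
             "connect to agent", "Thanks good day")
df <- data.frame(Case_id, message)

# Regex "welcome*" = "welcom" plus zero or more "e", so it matches "welcoming"
loose  <- grepl("welcome*", "welcoming back")
# A literal (fixed) match requires the exact substring "welcome"
strict <- grepl("welcome", "welcoming back", fixed = TRUE)

# Flag each row that contains the literal "welcome",
# then sum the flags within each Case_id and repeat the sum on every row
hits <- grepl("welcome", df$message, fixed = TRUE)
df[["Welcome message repeated"]] <- ave(as.integer(hits), df$Case_id, FUN = sum)

print(df)
```

On the sample data both patterns give the same counts; the difference only shows up once messages contain words like "welcoming".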
Another option is to use summarise to get a single summary count for each group.
df %>%
  group_by(Case_id) %>%
  summarise("Welcome message repeated" = sum(str_detect(message, "welcome*")))
#   Case_id `Welcome message repeated`
#   <chr>                        <int>
#1  #1                               2
#2  #2                               1
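For readers who prefer not to load the tidyverse, roughly the same one-row-per-group summary can be sketched with base R's `tapply()`. This is a minimal sketch, not part of the original answer; the column name `welcome_count` and the object name `summary_df` are hypothetical and chosen only for the example.

```r
# Same sample data as in the question
df <- data.frame(
  Case_id = c("#1", "#1", "#1", "#1", "#1", "#2", "#2", "#2", "#2", "#2"),
  message = c("welcome to dell", "welcome to dell", "refresh your screen",
              "connect to agent", "Thanks good day", "welcome to dell",
              "select from default", "refresh your screen",
              "connect to agent", "Thanks good day")
)

# Sum the TRUE/FALSE matches separately for each Case_id
counts <- tapply(grepl("welcome", df$message, fixed = TRUE), df$Case_id, sum)

# Turn the named result into a data frame with one row per group
# (welcome_count is a hypothetical column name)
summary_df <- data.frame(Case_id = names(counts),
                         welcome_count = as.integer(counts))

print(summary_df)
```

If the count is needed on every row again, `merge(df, summary_df, by = "Case_id")` would join it back to the original data.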