Summarizing the count of values of one column in 2 different columns
I have a df named reviews_gh with the following format:
Date Market Positive.or.Negative.
01-01-2020 A Positive
01-01-2020 A Positive
01-01-2020 B Positive
01-01-2020 B Negative
....
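I am trying to group by date and market, and create new columns named positive and negative that count how many times "Positive" and "Negative" occur in that market on that day.
This is the code I have right now: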
reviews_gh_agg <- reviews_gh %>%
  group_by(Date, Market) %>%
  summarise(positive = sum(reviews_gh$Positive.or.Negative. == "Positive"),
            negative = sum(reviews_gh$Positive.or.Negative. == "Negative"))
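But the result I get is wrong: the new positive and negative columns contain the sum over all observations instead of being grouped by day and market.
The result for the small example at the top should be: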
Date Market Positive Negative
01-01-2020 A 2 0
01-01-2020 B 1 1
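Thank you for your help.
I hope this is what you are looking for. I only made a slight modification to your code: thanks to data masking, you don't need $ to refer to column names in the tidyverse.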
df %>%
  group_by(Date, Market) %>%
  summarise(positive = sum(Positive.or.Negative. == "Positive"),
            negative = sum(Positive.or.Negative. == "Negative"))
# A tibble: 2 x 4
# Groups: Date [1]
Date Market positive negative
<chr> <chr> <int> <int>
1 01-01-2020 A 2 0
2 01-01-2020 B 1 1
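To see why the `$` version ignores the grouping, here is a minimal sketch on the same toy data. The column names `masked` and `unmasked` are made up for illustration. Inside `summarise()`, a bare column name is evaluated once per group, while `df$...` always pulls the whole column from the original data frame.

```r
library(dplyr)

# Same toy data as in the question
df <- tibble::tribble(
  ~Date,        ~Market, ~Positive.or.Negative.,
  "01-01-2020", "A",     "Positive",
  "01-01-2020", "A",     "Positive",
  "01-01-2020", "B",     "Positive",
  "01-01-2020", "B",     "Negative"
)

res <- df %>%
  group_by(Date, Market) %>%
  summarise(
    # bare name: data masking restricts it to the current group
    masked = sum(Positive.or.Negative. == "Positive"),
    # df$...: the full column, repeated for every group
    unmasked = sum(df$Positive.or.Negative. == "Positive"),
    .groups = "drop"
  )

print(res)
```

Here `masked` differs between markets A and B, while `unmasked` is the same overall total in both rows, which is the symptom described in the question.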
Update: another valuable solution, from dear @akrun.
df %>%
  group_by(Date, Market) %>%
  summarise(out = list(table(Positive.or.Negative.)), .groups = "drop") %>%
  unnest_wider(c(out))
# A tibble: 2 x 4
Date Market Positive Negative
<chr> <chr> <int> <int>
1 01-01-2020 A 2 NA
2 01-01-2020 B 1 1
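Note the NA for market A: that group has no "Negative" rows, so its table has no such entry. If zeros are wanted instead, one possible follow-up is `tidyr::replace_na()`. The sketch below rebuilds the output shown above by hand (an assumption for illustration, not a rerun of the pipeline), and the names `res` and `filled` are hypothetical.

```r
library(dplyr)
library(tidyr)

# Hand-built copy of the result printed above (assumed, not recomputed)
res <- tibble(
  Date     = c("01-01-2020", "01-01-2020"),
  Market   = c("A", "B"),
  Positive = c(2L, 1L),
  Negative = c(NA, 1L)
)

# Replace missing counts with 0 in both count columns
filled <- res %>%
  mutate(across(c(Positive, Negative), ~ replace_na(.x, 0L)))

print(filled)
```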
Data
df <- tribble(
~Date, ~Market, ~Positive.or.Negative.,
"01-01-2020", "A", "Positive",
"01-01-2020", "A", "Positive",
"01-01-2020", "B", "Positive",
"01-01-2020", "B", "Negative"
)
Here is another tidyverse solution, using count and pivot_wider.
library(tidyverse)
df %>%
  # Group by Date, Market and Positive/Negative
  group_by(Date, Market, Positive.or.Negative.) %>%
  # Count the rows in each group
  count() %>%
  # Change to wide format, filling missing combinations with 0
  pivot_wider(names_from = Positive.or.Negative.,
              values_from = n,
              values_fill = 0)
You can do this with tidyr::pivot_wider:
tidyr::pivot_wider(df, names_from = Positive.or.Negative.,
                   values_from = Positive.or.Negative.,
                   values_fn = length,
                   values_fill = 0)
# Date Market Positive Negative
# <chr> <chr> <int> <int>
#1 01-01-2020 A 2 0
#2 01-01-2020 B 1 1
And using data.table:
library(data.table)
dcast(setDT(df), Date + Market ~ Positive.or.Negative.,
      value.var = 'Positive.or.Negative.', fun.aggregate = length)