
Summarizing the count of values of one column in 2 different columns

I have a df named reviews_gh in the following format:

Date         Market  Positive.or.Negative.
01-01-2020     A              Positive
01-01-2020     A              Positive
01-01-2020     B              Positive
01-01-2020     B              Negative
....

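I am trying to group by date and market, and create two new columns named positive and negative that count how many times Positive and Negative appear in that market on that day.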

This is the code I have now:

  reviews_gh_agg <- reviews_gh %>% 
  group_by(Date, Market) %>% 
  summarise(positive = sum(reviews_gh$Positive.or.Negative.=="Positive"), negative = 
  sum(reviews_gh$Positive.or.Negative.=="Negative") )

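But the result I get is wrong: the new positive and negative columns contain the totals over all observations, rather than counts grouped by day and market.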

For the small example at the top, the result should be:

    Date         Market  Positive     Negative
01-01-2020     A            2            0
01-01-2020     B            1            1         

Thanks for your help.

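I hope this is what you are looking for. I only made a slight modification to your code: thanks to data masking, you don't need $ to refer to column names inside tidyverse verbs. Writing reviews_gh$Positive.or.Negative. pulls the whole ungrouped column from the original data frame, which is why every group received the overall total.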

library(dplyr)

df %>% 
  group_by(Date, Market) %>% 
  # Bare column names are evaluated within each group
  summarise(positive = sum(Positive.or.Negative. == "Positive"),
            negative = sum(Positive.or.Negative. == "Negative"))


# A tibble: 2 x 4
# Groups:   Date [1]
  Date       Market positive negative
  <chr>      <chr>     <int>    <int>
1 01-01-2020 A             2        0
2 01-01-2020 B             1        1
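
To make the difference concrete, here is a minimal sketch that runs both versions side by side on the toy data from the question. The object names with_dollar and masked are just illustrative, and .groups = "drop" is added only to silence the grouping message.

```r
library(dplyr)

# Toy data copied from the question
reviews_gh <- tibble::tribble(
  ~Date,        ~Market, ~Positive.or.Negative.,
  "01-01-2020", "A",     "Positive",
  "01-01-2020", "A",     "Positive",
  "01-01-2020", "B",     "Positive",
  "01-01-2020", "B",     "Negative"
)

# With `$`: the full, ungrouped column is summed in every group
with_dollar <- reviews_gh %>%
  group_by(Date, Market) %>%
  summarise(positive = sum(reviews_gh$Positive.or.Negative. == "Positive"),
            negative = sum(reviews_gh$Positive.or.Negative. == "Negative"),
            .groups = "drop")

# With bare names: only the rows of the current group are summed
masked <- reviews_gh %>%
  group_by(Date, Market) %>%
  summarise(positive = sum(Positive.or.Negative. == "Positive"),
            negative = sum(Positive.or.Negative. == "Negative"),
            .groups = "drop")

print(with_dollar)
print(masked)
```

In with_dollar both rows carry the overall totals (3 positive, 1 negative), while masked has the per-group counts.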

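Updated with another valuable solution from dear @akrun. Note that a category missing from a group shows up as NA rather than 0 here.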

df %>%
  group_by(Date, Market) %>%
  summarise(out = list(table(Positive.or.Negative.)), .groups = "drop") %>%
  unnest_wider(c(out))

# A tibble: 2 x 4
  Date       Market Positive Negative
  <chr>      <chr>     <int>    <int>
1 01-01-2020 A             2       NA
2 01-01-2020 B             1        1

Data

df <- tribble(
  ~Date,         ~Market,  ~Positive.or.Negative.,
  "01-01-2020",     "A",              "Positive",
  "01-01-2020",     "A",              "Positive",
  "01-01-2020",     "B",              "Positive",
  "01-01-2020",     "B",              "Negative"
)

Here is another tidyverse solution, using count and pivot_wider:

library(tidyverse)

df %>% 
  # Group by Date, Market and Positive/Negative
  group_by(Date, Market, Positive.or.Negative.) %>%
  # Count
  count() %>%
  # Change to wide format, fill NA with 0's
  pivot_wider(names_from = Positive.or.Negative.,
              values_from = n,
              values_fill = 0)
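
As a possible shortening of the above, count() accepts the grouping columns directly, so the explicit group_by() step can be dropped and the result comes back ungrouped. This is only a sketch on the same toy data. The name res is illustrative.

```r
library(dplyr)
library(tidyr)

# Toy data copied from the question
df <- tibble::tribble(
  ~Date,        ~Market, ~Positive.or.Negative.,
  "01-01-2020", "A",     "Positive",
  "01-01-2020", "A",     "Positive",
  "01-01-2020", "B",     "Positive",
  "01-01-2020", "B",     "Negative"
)

res <- df %>%
  # count() groups by the listed columns and stores the counts in n
  count(Date, Market, Positive.or.Negative.) %>%
  # One column per category; missing combinations become 0
  pivot_wider(names_from = Positive.or.Negative.,
              values_from = n,
              values_fill = 0)

print(res)
```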

You can do this with tidyr::pivot_wider:

tidyr::pivot_wider(df, names_from = Positive.or.Negative., 
                       values_from = Positive.or.Negative., 
                       values_fn = length, 
                       values_fill = 0)

#  Date       Market Positive Negative
#  <chr>      <chr>     <int>    <int>
#1 01-01-2020 A             2        0
#2 01-01-2020 B             1        1

And using data.table:

library(data.table)

dcast(setDT(df),  Date + Market~Positive.or.Negative., 
      value.var = 'Positive.or.Negative.', fun.aggregate = length)
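
If you prefer to name the two output columns explicitly rather than reshape, a grouped aggregation in data.table might look like the sketch below. It builds a fresh data.table (the names dt and res are illustrative) instead of calling setDT(), which would convert df by reference.

```r
library(data.table)

# Toy data copied from the question
dt <- data.table(
  Date = rep("01-01-2020", 4),
  Market = c("A", "A", "B", "B"),
  Positive.or.Negative. = c("Positive", "Positive", "Positive", "Negative")
)

# Count each category within every Date/Market group
res <- dt[, .(Positive = sum(Positive.or.Negative. == "Positive"),
              Negative = sum(Positive.or.Negative. == "Negative")),
          by = .(Date, Market)]

print(res)
```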
