
dplyr summarise logical condition

I have the following data frame:

df <- data.frame(Gender = c(rep(c("M","F"),each=4)),
             DiffA=c(1,1,-1,-1,1,1,1,-1),
             DiffB=c(1,-1,1,-1,1,1,1,-1))

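I want to create 2 new variables, summarised for each Gender: i) the number of rows where both DiffA and DiffB are positive; ii) the number of rows where both DiffA and DiffB are negative, so as to obtain: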

df2 <- data.frame(Gender = c("M","F"),
             Diff_Pos=c(1,3),
             Diff_Neg=c(1,1))

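I have not managed to combine dplyr's summarise with n(), which returns the number of rows, and the logical conditions I need. Thanks in advance.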

I would consider doing:

library(dplyr)   # filter(), count() and the %>% pipe
library(tidyr)   # spread()
df %>% filter(DiffA == DiffB) %>% count(Gender, DiffA) %>% spread(DiffA, n)

#   Gender    -1     1
#   (fctr) (int) (int)
# 1      F     1     3
# 2      M     1     1

The analogous data.table code is:

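library(data.table)  # df must be a data.table for the [i, j, by] syntax
dcast(as.data.table(df)[DiffA == DiffB, .N, by=.(Gender, DiffA)], Gender ~ DiffA, value.var = "N")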

#    Gender -1 1
# 1:      F  1 3
# 2:      M  1 1

If the real data takes values other than -1 and 1, wrap the relevant columns in sign().
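As a rough sketch of that sign() suggestion applied to the dplyr/tidyr pipeline above: the data frame df_wide, its values, and the helper columns SignA and SignB are made up for illustration and are not part of the original question. Dropping rows where the sign is 0 is also an assumption about how zeros should be treated.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: same sign pattern as the question's df,
# but with magnitudes other than -1 and 1.
df_wide <- data.frame(Gender = rep(c("M", "F"), each = 4),
                      DiffA  = c(2.5, 1, -3, -0.5, 4, 1, 2, -7),
                      DiffB  = c(0.3, -1, 2, -4, 1, 6, 2, -2))

res <- df_wide %>%
  mutate(SignA = sign(DiffA), SignB = sign(DiffB)) %>%  # reduce values to -1, 0, 1
  filter(SignA == SignB, SignA != 0) %>%                # assumption: zeros count as neither
  count(Gender, SignA) %>%
  spread(SignA, n)

res
```

Newer tidyr versions mark spread() as superseded; pivot_wider(names_from = SignA, values_from = n) should be a drop-in alternative there.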

Here is a base R option:

 with(subset(df, DiffA==DiffB), table(Gender, DiffA))
 #      DiffA
 #Gender -1 1
 #     F  1 3
 #     M  1 1

This should work:

library(dplyr)  # needed for the %>% pipe even though the verbs are namespaced
df %>% 
  dplyr::mutate(
    Diff_Pos = DiffA > 0 & DiffB > 0,
    Diff_Neg = DiffA < 0 & DiffB < 0) %>% 
  dplyr::group_by(Gender) %>% 
  dplyr::summarise(
    Diff_Pos = sum(Diff_Pos),
    Diff_Neg = sum(Diff_Neg))
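For what it is worth, the intermediate mutate() step is probably not required: sum() over a logical vector counts its TRUE values, so the conditions can go straight into summarise(). A minimal sketch, reusing the question's data (the name out is just for illustration):

```r
library(dplyr)

# Data frame from the question
df <- data.frame(Gender = rep(c("M", "F"), each = 4),
                 DiffA  = c(1, 1, -1, -1, 1, 1, 1, -1),
                 DiffB  = c(1, -1, 1, -1, 1, 1, 1, -1))

out <- df %>%
  group_by(Gender) %>%
  summarise(Diff_Pos = sum(DiffA > 0 & DiffB > 0),   # rows where both are positive
            Diff_Neg = sum(DiffA < 0 & DiffB < 0))   # rows where both are negative

out
```

This plays the role that n() plus a condition was meant to play in the question. Note that the rows come back ordered by Gender (F before M), unlike the hand-written df2.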
