Filtering based on the values of multiple other columns in R

Question

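Data context: people responding to each other on an online discussion board.

Goal: filter the data based on whether people took turns within the same post and who the partner (dyad) is. Essentially, it boils down to filtering based on the values of other columns.

Specifically, I think it would start by checking whether 'turntaking' == 1, and then keep the observations that share the same 'dyad_id' within the same 'post_id'. I'm having trouble working out how to filter by multiple conditions.

Sample data: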

my_data <- structure(list(post_id = c(100, 230, 100, 100, 100, 100), dyad_id = structure(c(2L, 
2L, 2L, 1L, 1L, 1L), .Label = c("42_27", "53_27"), class = "factor"), 
    dyad_id_order = structure(c(4L, 4L, 2L, 3L, 1L, 3L), .Label = c("27_42", 
    "27_53", "42_27", "53_27"), class = "factor"), turntaking = c(0, 
    0, 1, 0, 1, 1)), class = "data.frame", row.names = c(NA, 
-6L), variable.labels = structure(character(0), .Names = character(0)), codepage = 65001L)

Visually, the sample data looks like this:

╔═════════╦═════════╦═══════════════╦════════════╦══════════════════════════════════════════════════════════╗
║ post_id ║ dyad_id ║ dyad_id_order ║ turntaking ║ (note)                                                   ║
╠═════════╬═════════╬═══════════════╬════════════╬══════════════════════════════════════════════════════════╣
║   100   ║  53_27  ║     53_27     ║      0     ║ Keep                                                     ║
╠═════════╬═════════╬═══════════════╬════════════╬══════════════════════════════════════════════════════════╣
║   230   ║  53_27  ║     53_27     ║      0     ║ Drop                                                     ║
╠═════════╬═════════╬═══════════════╬════════════╬══════════════════════════════════════════════════════════╣
║   100   ║  53_27  ║     27_53     ║      1     ║ Keep: ID27 responded to ID53's response in the first row ║
║         ║         ║               ║            ║ (They are both found under the same post_id)             ║
╠═════════╬═════════╬═══════════════╬════════════╬══════════════════════════════════════════════════════════╣
║   100   ║  42_27  ║     42_27     ║      0     ║ Keep                                                     ║
╠═════════╬═════════╬═══════════════╬════════════╬══════════════════════════════════════════════════════════╣
║   100   ║  42_27  ║     27_42     ║      1     ║ Keep                                                     ║
╠═════════╬═════════╬═══════════════╬════════════╬══════════════════════════════════════════════════════════╣
║   100   ║  42_27  ║     42_27     ║      1     ║ Keep                                                     ║
╚═════════╩═════════╩═══════════════╩════════════╩══════════════════════════════════════════════════════════╝

The final output should look like this:

╔═════════╦═════════╦═══════════════╦════════════╗
║ post_id ║ dyad_id ║ dyad_id_order ║ turntaking ║
╠═════════╬═════════╬═══════════════╬════════════╣
║   100   ║  53_27  ║     53_27     ║      0     ║
╠═════════╬═════════╬═══════════════╬════════════╣
║   100   ║  53_27  ║     27_53     ║      1     ║
╠═════════╬═════════╬═══════════════╬════════════╣
║   100   ║  42_27  ║     42_27     ║      0     ║
╠═════════╬═════════╬═══════════════╬════════════╣
║   100   ║  42_27  ║     27_42     ║      1     ║
╠═════════╬═════════╬═══════════════╬════════════╣
║   100   ║  42_27  ║     42_27     ║      1     ║
╚═════════╩═════════╩═══════════════╩════════════╝

Answer 1

This looks at each post_id / dyad_id combination and keeps only the combinations that have the turntaking flag set at some point.

  library(dplyr)

  my_data %>%
    group_by(post_id, dyad_id) %>%    # one group per post / dyad pair
    filter(sum(turntaking) > 0) %>%   # keep groups with at least one turntaking row
    ungroup()

# A tibble: 5 x 4
  post_id dyad_id dyad_id_order turntaking
    <dbl> <fct>   <fct>              <dbl>
1     100 53_27   53_27                  0
2     100 53_27   27_53                  1
3     100 42_27   42_27                  0
4     100 42_27   27_42                  1
5     100 42_27   42_27                  1
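
For anyone who prefers the two-step logic described in the question (first find the rows with turntaking == 1, then keep everything that shares their post_id and dyad_id), here is a minimal sketch of that idea using a filtering join. The object names `turn_pairs`, `kept_join` and `kept_group` are made up for illustration, and the sample data is rebuilt with plain character columns rather than the factors in the original dput output.

```r
library(dplyr)

# Sample data from the question, rebuilt with character columns (assumption:
# the factor attributes of the original dput output do not matter here)
my_data <- data.frame(
  post_id       = c(100, 230, 100, 100, 100, 100),
  dyad_id       = c("53_27", "53_27", "53_27", "42_27", "42_27", "42_27"),
  dyad_id_order = c("53_27", "53_27", "27_53", "42_27", "27_42", "42_27"),
  turntaking    = c(0, 0, 1, 0, 1, 1)
)

# Step 1: find the post_id / dyad_id pairs with at least one turntaking row
turn_pairs <- my_data %>%
  filter(turntaking == 1) %>%
  distinct(post_id, dyad_id)

# Step 2: keep every row whose pair appears in that lookup table
kept_join <- my_data %>%
  semi_join(turn_pairs, by = c("post_id", "dyad_id"))

# Same selection as a grouped filter, written with any() instead of sum()
kept_group <- my_data %>%
  group_by(post_id, dyad_id) %>%
  filter(any(turntaking == 1)) %>%
  ungroup()
```

Both `kept_join` and `kept_group` should contain the same five rows as the output above. If turntaking can contain NA, the grouped versions would likely need `na.rm = TRUE` inside `sum()` or `any()`.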

Answer 1: 1 accepted, 2021-02-16 01:11:16