dplyr (R): multiple conditions when using distinct(), or something completely different?

Question

Let me say up front that I'm not wedded to using distinct() for my problem, and I'm open to any suggestion for tackling it. Here is the puzzle:

Date <- c(1,1,2,2)
Group <- c("A","A","B","B")
Result <- c("Aa","Ab","Aa","SB")
df <- cbind(Date, Group, Result)
df
     Date Group Result
[1,] "1"  "A"   "Aa"  
[2,] "1"  "A"   "Ab"  
[3,] "2"  "B"   "Aa"  
[4,] "2"  "B"   "SB"

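The result I want has one row per Date: choose (subset) either one of the rows containing Aa or Ab, and prefer any row containing SB over Aa, Ab, Ac, and so on. I'm having a lot of trouble doing this efficiently for a large data frame, and I don't have a high-quality attempt to show.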

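In reality, Group A and Group B have many more observations over time, and there are many more groups. If a particular Group uploads data twice (or more) on the same Date, only one entry for that Date matters: the one with the more important Result.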

Update:

The expected output from the data above, after filtering etc.:

     Date Group Result
[1,] "1"  "A"   "Aa"    
[2,] "2"  "B"   "SB"

or

     Date Group Result
[1,] "1"  "A"   "Ab"    
[2,] "2"  "B"   "SB"

Answer 1

Using dplyr, but without distinct():

library(dplyr)

Date <- c(1,1,2,2)
Group <- c("A","A","B","B")
Result <- c("Aa","Ab","Aa","SB")
# Use data.frame, not cbind: cbind produces a character matrix
df <- data.frame(Date, Group, Result)

# Keep the first Result per Date and Group (for Date 2 this yields "Aa", not "SB")
summarise(group_by(df, Date, Group), 
                   Result = first(Result))

# Keep the last Result per Date and Group (for this data, matches the second expected output)
summarise(group_by(df, Date, Group), 
                   Result = last(Result))

# To combine all the options
summarise(group_by(df, Date, Group), 
                   Result = paste(Result, collapse = ", "))
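
The comment about cbind can be checked directly. This is a minimal sketch in base R; the names m and d are only illustrative and do not appear in the original posts:

```r
Date <- c(1, 1, 2, 2)
Group <- c("A", "A", "B", "B")
Result <- c("Aa", "Ab", "Aa", "SB")

# cbind on mixed vectors builds a matrix, so every value is coerced to character
m <- cbind(Date, Group, Result)

# data.frame keeps one type per column, so Date stays numeric
d <- data.frame(Date, Group, Result)

is.matrix(m)        # the cbind result is a matrix, not a data frame
is.character(m)     # its Date values are now text such as "1"
is.numeric(d$Date)  # the data frame keeps Date numeric
```

dplyr verbs such as group_by() and summarise() expect a data frame, which is why the conversion matters here.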

Answer 2

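The unique Result values need to be ranked by importance. This can be done manually or with some algorithm; both approaches are shown below. The ranking is then used to pick the highest-ranked Result for each combination of Date and Group. The code could look like this: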

  library(dplyr)
  df <- data.frame(df)
#
# manually list unique Results in order of increasing importance
#
  Result_rank <- c("Aa","Ab","SB")
#
# Or use an algorithm to rank unique Results in order of importance;
# For the example, the algorithm might be:
#
  Result_rank <- c(grep("^A",unique(df$Result), value=TRUE), 
                   grep("SB",unique(df$Result), value=TRUE))
#
# summarize by highest ranked Result for each Date and Group
#
  df_important <- df %>% group_by( Date, Group) %>%
                  summarize(Result= Result_rank[max(match(Result, Result_rank))])

This gives the result:

   Date  Group Result
  <fctr> <fctr>  <chr>
1      1      A     Ab
2      2      B     SB
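
Since the question asks about distinct(), the same idea can also be sketched with arrange() followed by distinct(.keep_all = TRUE), which keeps the first row it meets for each Date and Group. This is only a sketch under an assumed rule, namely that any Result containing "SB" outranks every other Result; the helper column priority is hypothetical and is dropped at the end:

```r
library(dplyr)

df <- data.frame(
  Date   = c(1, 1, 2, 2),
  Group  = c("A", "A", "B", "B"),
  Result = c("Aa", "Ab", "Aa", "SB"),
  stringsAsFactors = FALSE
)

df_important <- df %>%
  # Assumed rule: flag the rows that should win (Result contains "SB")
  mutate(priority = grepl("SB", Result)) %>%
  # Sort so that flagged rows come first within each Date and Group
  arrange(Date, Group, desc(priority)) %>%
  # Keep only the first row for each Date and Group, with all columns
  distinct(Date, Group, .keep_all = TRUE) %>%
  select(-priority)

df_important
```

For Date 2 this keeps the SB row. For Date 1 neither row is flagged, so whichever of Aa or Ab sorts first is kept, which the question says is acceptable. A finer ranking could replace the logical flag with match(Result, Result_rank) from the code above.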

Answer 1
0 2016-08-22 08:59:36

Answer 2
0 2016-08-22 17:38:40
