Remove specific duplicated observations within data.frames from a list of data.frames

Question

I have a list of data.frames that looks like this:

 $`42`
     Val Replicate Index
   26.92        R2    42
   26.92        R3    42
   28.68        R1    42
   28.68        R4    42

 $`43`
     Val Replicate Index
   28.92        R3    43
   29.28        R2    43
   30.11        R1    43
   30.11        R4    43

 $`44`
     Val Replicate Index
   24.67        R3    44
   24.70        R2    44
   25.70        R1    44
   25.70        R4    44

 $`45`
     Val Replicate Index
   30.57        R1    45
   30.57        R4    45
   32.39        R2    45
   32.81        R3    45

What I would like to do is the following: if a value in the column "Val" duplicates the value of the R4 row (column "Replicate"), both rows should be removed.

For example, in the data.frame named 45, since 30.57 (R1) is equal to 30.57 (R4), both rows are removed, retaining only 32.39 (R2) and 32.81 (R3). So the desired output for data.frame 45 would be:

 $`45`
    Val  Replicate Index
  32.39       R2    45
  32.81       R3    45

I tried to use:

lapply(mydf, function(x) x[!duplicated(x[c("Val")]), ])

but unfortunately it drops the later copy of every duplicated value in the column "Val", regardless of whether the duplicate involves R4 in the column "Replicate", and it still keeps the first copy.

Answer 1

If this is only related to "R4" values, then

df[!((duplicated(df$Val) | duplicated(df$Val, fromLast=TRUE)) &
   df$Val %in% df$Val[df$Replicate == "R4"]), ]

is meant to drop every row whose "Val" is duplicated and equal to the R4 value, while keeping all other rows (including duplicates that do not involve R4) for some data.frame df. Then, to put this into lapply

mynonDupeList <- lapply(mydf,
                    function(i) i[!((duplicated(i$Val) | duplicated(i$Val, fromLast=TRUE))
                                  & i$Val %in% i$Val[i$Replicate == "R4"]), ])

should do the trick.
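For anyone who wants to try this end to end, here is a minimal, self-contained sketch. It assumes the list from the question can be rebuilt as shown; the helper name drop_r4_dupes is hypothetical and not part of the original answer. It uses %in% to compare each "Val" against the R4 value(s), so a row is dropped only when its value is duplicated and matches R4.

```r
# Hypothetical reconstruction of the list shown in the question.
mydf <- list(
  `42` = data.frame(Val = c(26.92, 26.92, 28.68, 28.68),
                    Replicate = c("R2", "R3", "R1", "R4"),
                    Index = 42, stringsAsFactors = FALSE),
  `43` = data.frame(Val = c(28.92, 29.28, 30.11, 30.11),
                    Replicate = c("R3", "R2", "R1", "R4"),
                    Index = 43, stringsAsFactors = FALSE),
  `44` = data.frame(Val = c(24.67, 24.70, 25.70, 25.70),
                    Replicate = c("R3", "R2", "R1", "R4"),
                    Index = 44, stringsAsFactors = FALSE),
  `45` = data.frame(Val = c(30.57, 30.57, 32.39, 32.81),
                    Replicate = c("R1", "R4", "R2", "R3"),
                    Index = 45, stringsAsFactors = FALSE)
)

# drop_r4_dupes is a hypothetical helper name used for illustration.
drop_r4_dupes <- function(d) {
  # Value(s) carried by the R4 row(s) of this data.frame.
  r4_vals <- d$Val[d$Replicate == "R4"]
  # TRUE for every row whose Val occurs more than once (both copies flagged).
  is_dup <- duplicated(d$Val) | duplicated(d$Val, fromLast = TRUE)
  # Drop rows that are duplicated AND share their value with R4.
  d[!(is_dup & d$Val %in% r4_vals), ]
}

result <- lapply(mydf, drop_r4_dupes)
result[["45"]]
```

Note that in data.frame 42 the pair 26.92 (R2, R3) survives, because that duplicate does not involve R4. An R4 row whose value is unique is also kept.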
