R programming: How to remove duplicates in a column based on values of another column

A   B
15  O
20  O
12  C
15  C
50  C
25  O
50  O
19  O
50  M

I have data in the above format. I want to select unique rows based on the unique elements in column A. In case there are duplicates, I need to refer to column B and select the row that has code 'C'.

Expected output:

A   B
20  O
12  C
15  C
50  C
25  O
19  O

Can anyone help?

We can use data.table. Convert the 'data.frame' to a 'data.table' (setDT(df1), which converts by reference), order the rows by the logical condition B == "O" so that rows whose code is not "O" come first, then group by 'A' and take the first row of each group with head.

library(data.table)
setDT(df1)[order(B=="O"), head(.SD, 1), A]
#    A B
#1: 12 C
#2: 15 C
#3: 50 C
#4: 20 O
#5: 25 O
#6: 19 O
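
The ordering on B == "O" picks 'C' for A = 50 only because the 'C' row happens to come before the 'M' row in the data. Below is a minimal sketch of a variant that states the priority for 'C' explicitly. The construction of df1 is an assumption reconstructed from the table in the question, and the names dt and res are introduced only for illustration.

```r
library(data.table)

# Sample data reconstructed from the question (assumed column types)
df1 <- data.frame(
  A = c(15, 20, 12, 15, 50, 25, 50, 19, 50),
  B = c("O", "O", "C", "C", "C", "O", "O", "O", "M"),
  stringsAsFactors = FALSE
)

# as.data.table() makes a copy, so df1 itself stays a plain data.frame
dt <- as.data.table(df1)

# B != "C" is FALSE for 'C' rows, and FALSE sorts before TRUE,
# so within every value of A a 'C' row (if any) ends up first.
# head(.SD, 1) then keeps that first row of each group.
res <- dt[order(B != "C"), head(.SD, 1), by = A]
res
```

On the sample data this should return the same six rows as the output above. Unlike B == "O", the condition B != "C" does not depend on where codes such as 'M' sit relative to 'C' in the input.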

Or this can be done in base R by ordering the rows and then keeping only the first row for each value of 'A' with duplicated.

df2 <- df1[order(df1$A, df1$B=="O"),]
df2[!duplicated(df2$A),]
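
The base R approach returns rows sorted by 'A' rather than in their original order. Here is a hedged sketch that prioritises 'C' explicitly and then restores the input order through the row names. The construction of df1 is an assumption reconstructed from the question, and res is a name introduced only for illustration.

```r
# Sample data reconstructed from the question (assumed column types)
df1 <- data.frame(
  A = c(15, 20, 12, 15, 50, 25, 50, 19, 50),
  B = c("O", "O", "C", "C", "C", "O", "O", "O", "M"),
  stringsAsFactors = FALSE
)

# Sort by A; within each A, rows with B == "C" come first (FALSE < TRUE)
df2 <- df1[order(df1$A, df1$B != "C"), ]

# duplicated() flags every repeat of A after its first occurrence,
# so negating it keeps exactly one row per value of A
res <- df2[!duplicated(df2$A), ]

# Optional: put the kept rows back in their original order.
# This relies on df1 having the default 1..n row names.
res <- res[order(as.integer(rownames(res))), ]
res
```

With the sample data, the final step should reproduce the row order of the expected output in the question (20, 12, 15, 50, 25, 19).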

