
Identify duplicates in a df in a particular column in R

For a sample dataframe:

df <- structure(list(
  code = c("a1", "a1", "b2", "v4", "f5", "f5", "h7", "a1"),
  name = c("katie", "katie", "sally", "tom", "amy", "amy", "ash", "james"),
  number = c(3.5, 3.5, 2, 6, 4, 4, 7, 3)),
  .Names = c("code", "name", "number"),
  class = c("tbl_df", "tbl", "data.frame"),
  row.names = c(NA, -8L),
  spec = structure(list(
    cols = structure(list(
      code = structure(list(), class = c("collector_character", "collector")),
      name = structure(list(), class = c("collector_character", "collector")),
      number = structure(list(), class = c("collector_double", "collector"))),
      .Names = c("code", "name", "number")),
    default = structure(list(), class = c("collector_guess", "collector"))),
    .Names = c("cols", "default"), class = "col_spec"))

I want to produce a dataframe containing only the rows that are duplicated in one specific column.

I know I can do:

df[duplicated(df),]

But that checks whole rows. For my larger real dataframe, I want to specify a single column in which to look for duplicates.

Any ideas?

duplicated() accepts vectors, so you can pass it just the column you want to check:

df[duplicated(df$name), ]
  code  name number
2   a1 katie    3.5
6   f5   amy    4.0
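
Note that duplicated() marks only the second and later occurrences, which is why rows 1 and 5 are missing from the output above. If the goal is every row whose value in that column appears more than once, one common approach is to combine it with a second pass using fromLast = TRUE. A minimal sketch, assuming the same sample values rebuilt as a plain data.frame (the names later_copies, is_dup, all_copies and code_copies are only illustrative):

```r
# Same values as the question's sample data, rebuilt as a plain data.frame
df <- data.frame(
  code = c("a1", "a1", "b2", "v4", "f5", "f5", "h7", "a1"),
  name = c("katie", "katie", "sally", "tom", "amy", "amy", "ash", "james"),
  number = c(3.5, 3.5, 2, 6, 4, 4, 7, 3),
  stringsAsFactors = FALSE
)

# duplicated() flags only the second and later occurrences of a value
later_copies <- df[duplicated(df$name), ]

# OR-ing with a scan from the other end also flags the first occurrence
is_dup <- duplicated(df$name) | duplicated(df$name, fromLast = TRUE)
all_copies <- df[is_dup, ]

# The same idea applied to a different column
code_copies <- df[duplicated(df$code) | duplicated(df$code, fromLast = TRUE), ]
```

Here all_copies holds both katie rows and both amy rows, while code_copies also picks up the james row because it shares the code a1.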

