How to find duplicates of individual values across all rows and columns of an R data.frame

Question

I have a large dataset consisting of a header row and a series of values in each column. I want to detect the presence and number of duplicates of these values within the whole dataset.

1     2     3     4     5     6     7
734  456   346   545   874   734   455
734  783   482   545   456   948   483

So, for example, it would detect 734 three times, 456 twice, etc.

I've tried using the duplicated() function in R, but it seems to work only on rows as a whole or columns as a whole. Using

duplicated(df)

doesn't pick up any duplicates, though I know that a value is repeated within the first row.

So I'm asking how to detect duplicates both within and between columns/rows.

Cheers

Answer 1

You can use table() and data.frame() to see how often each value occurs:

data.frame(table(v))

which gives

     v Freq
1    1    1
2    2    1
3    3    1
4    4    1
5    5    1
6    6    1
7    7    1
8  346    1
9  455    1
10 456    2
11 482    1
12 483    1
13 545    2
14 734    3
15 783    1
16 874    1
17 948    1

Data

v <- c(1, 2, 3, 4, 5, 6, 7, 734, 456, 346, 545, 874, 734, 455, 734, 
783, 482, 545, 456, 948, 483)
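
The vector v above is typed in by hand and also includes the header numbers 1 to 7. As a minimal sketch, assuming the values sit in a data frame (the name df and the column names X1 to X7 are hypothetical, built here from the two rows in the question), the same counting can be done by flattening the data frame with unlist() and then keeping only the values that occur more than once:

```r
# Hypothetical data frame built from the two data rows in the question
df <- data.frame(
  X1 = c(734, 734), X2 = c(456, 783), X3 = c(346, 482), X4 = c(545, 545),
  X5 = c(874, 456), X6 = c(734, 948), X7 = c(455, 483)
)

# Flatten every column into one vector, then count each distinct value
counts <- table(unlist(df))

# Keep only the values that occur more than once anywhere in the data
dups <- counts[counts > 1]

# Show the repeated values and their counts as a data frame
as.data.frame(dups)
```

Because the header is not part of the data here, the values 1 to 7 do not appear in the counts.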

Answer 2

You can transform it to a vector and then use table() as follows:

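library(data.table)  # provides fread()
library(dplyr)       # provides the %>% pipe

# Read the two data rows from a string
df <- fread("734  456   346   545   874   734   455
734  783   482   545   456   948   483")

# Flatten all columns into one vector and count each value
df %>% unlist() %>% table()
# 346 455 456 482 483 545 734 783 874 948
#   1   1   2   1   1   2   3   1   1   1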
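
The question also asks about duplicates within and between rows and columns. duplicated(df) compares whole rows, which is why it reports nothing here. The following is a base R sketch, not part of either answer, that marks every cell whose value occurs more than once anywhere in the data; the data frame df and its column names are hypothetical, rebuilt from the question's two rows:

```r
# Hypothetical data frame built from the two data rows in the question
df <- data.frame(
  X1 = c(734, 734), X2 = c(456, 783), X3 = c(346, 482), X4 = c(545, 545),
  X5 = c(874, 456), X6 = c(734, 948), X7 = c(455, 483)
)

m <- as.matrix(df)

# Values seen at least twice; as.vector() forces element-wise comparison,
# since duplicated() on a matrix would compare whole rows
dup_vals <- unique(as.vector(m)[duplicated(as.vector(m))])

# Logical matrix with the same shape as the data: TRUE where the cell's
# value is one of the repeated values
is_dup <- matrix(m %in% dup_vals, nrow = nrow(m), dimnames = dimnames(m))

# Row and column index of every cell holding a repeated value
pos <- which(is_dup, arr.ind = TRUE)
pos
```

Each row of pos gives the row and column position of one cell that holds a repeated value, so repeats within a row, within a column, and across both are all reported together.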
