
Removing duplicated rows from a matrix in R

I have a matrix that is similar to:

 2013  1  0    
 2013  1  30
 2013  1  100
 2013  2  0
 2013  2  30
 2013  2  100
 2013  3  0
 2013  3  30
 2013  3  100
 2013  1  0
 2013  1  30
 2013  4  0

There are additional columns after the third that hold extra data. I need a way to remove the duplicate rows. In this example I would need to remove the rows that have a 1 in the second column. Is there a way to remove these rows while keeping the rest of my data?

I have tried unique() and duplicated() and could not produce what I need. If my matrix is m.dat, I tried using

m.dat <- m.dat[-duplicated(m.dat[,2:3])]

but that doesn't work. Am I using duplicated() wrong, or is there another way to do this?
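The attempt above most likely fails for two separate reasons: the unary minus is applied to a logical vector, and the index has no comma, so the matrix is treated as one long vector. The sketch below assumes the matrix holds only the three columns shown in the question; the names dup, bad and good are made up for illustration.

```r
# Example matrix rebuilt from the first three columns shown in the question
m.dat <- cbind(2013,
               c(1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 4),
               c(0, 30, 100, 0, 30, 100, 0, 30, 100, 0, 30, 0))

dup <- duplicated(m.dat[, 2:3])  # TRUE for rows that repeat an earlier row
which(dup)                       # positions of the repeated rows

# Unary minus turns the logical values into 0 and -1, and with no comma the
# matrix is indexed as one long vector, so only its first element is dropped
bad <- m.dat[-dup]

# Logical negation plus a row index keeps every first occurrence
good <- m.dat[!dup, ]
dim(good)
```

Using ! on the logical vector and keeping the comma is what makes the result a matrix with only the repeated rows removed.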

m.dat<-m.dat[m.dat[ ,2]!=1, ]

or

m.dat<-m.dat[!(m.dat[ ,2]==1 & duplicated(m.dat[,1:3])), ]

depending on what you're looking for. I am somewhat confused about whether you want to remove every record with value '1' in the second column, or only those with '1' that are also duplicate rows.
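To make the difference between the two filters above concrete, here is a small sketch run on the example data. It assumes the matrix has only the three columns shown in the question; all.ones.gone and dup.ones.gone are illustrative names.

```r
# Example matrix rebuilt from the first three columns shown in the question
m.dat <- cbind(2013,
               c(1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 4),
               c(0, 30, 100, 0, 30, 100, 0, 30, 100, 0, 30, 0))

# Option 1: drop every row whose second column is 1
all.ones.gone <- m.dat[m.dat[, 2] != 1, ]

# Option 2: drop only rows whose second column is 1 AND that repeat an
# earlier row across columns 1 to 3
dup.ones.gone <- m.dat[!(m.dat[, 2] == 1 & duplicated(m.dat[, 1:3])), ]

nrow(all.ones.gone)  # rows left by option 1
nrow(dup.ones.gone)  # rows left by option 2
```

On this data the first option discards the original rows with a 1 as well as their repeats, while the second keeps the first occurrence of each.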

If you wanted to know which values are repeated in that column, you could use something like

reps<-unique(m.dat[,2][duplicated(m.dat[,2])])

and then remove all of these with a %in% statement

something like...

m.dat<-m.dat[ ! m.dat[,2] %in% unique(m.dat[,2][duplicated(m.dat[,2])]) ,]
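One thing to watch with the %in% approach: it removes every row whose second-column value occurs more than once, first occurrences included. On the example data that leaves a single row, and R then drops the result to a plain vector unless drop = FALSE is given. A sketch, assuming only the three columns shown in the question (flat and kept are illustrative names):

```r
# Example matrix rebuilt from the first three columns shown in the question
m.dat <- cbind(2013,
               c(1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 4),
               c(0, 30, 100, 0, 30, 100, 0, 30, 100, 0, 30, 0))

# Values that occur more than once in column 2
reps <- unique(m.dat[, 2][duplicated(m.dat[, 2])])

# Only one row survives here, so the default subsetting returns a vector
flat <- m.dat[!m.dat[, 2] %in% reps, ]

# drop = FALSE keeps the matrix shape even when a single row remains
kept <- m.dat[!m.dat[, 2] %in% reps, , drop = FALSE]
dim(kept)
```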

I was able to figure it out. What I used was

 m <- duplicated(m.dat[,3:4])
 m <- as.numeric(m)
 ind = which(m ==1)
 m.dat = m.dat[-ind,]
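The four lines above can probably be collapsed into a single logical index, which also avoids a known pitfall: when which() finds no duplicates, negative indexing with the empty result selects zero rows instead of all of them. The sketch below uses a small hypothetical matrix x with no repeated rows to show the difference; lost and safe are illustrative names.

```r
# Hypothetical matrix with no duplicated rows
x <- matrix(1:6, ncol = 2)

# which() on an all-FALSE vector returns an empty index
ind <- which(duplicated(x))

# Negating an empty index still gives an empty index, so every row is lost
lost <- x[-ind, ]

# A negated logical index keeps all rows when nothing is duplicated
safe <- x[!duplicated(x), , drop = FALSE]

nrow(lost)
nrow(safe)
```

Applied to the code above, the equivalent one-liner would be m.dat <- m.dat[!duplicated(m.dat[, 3:4]), ].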


 