Subtract two data.frames in R, by characters

Question

I have a data set of 250000+ rows.

Three columns: country, test and test_result (character, character, numerical)

The next line of code reduces my data to 102388 rows.

sub.df1 <- df <- df[!duplicated(df), ]

This line of code reduces my data to 102339 rows.

sub.df2 <- unique(df[,c('country','test')])

Now I want to see these 49 rows: the rows in sub.df1 that share the same country and test but have a different test_result.

I tried to subtract sub.df2 from sub.df1[1:2] (sub.df1[1:2] - sub.df2 = sub.df3), where sub.df3 would hold the 49 combinations of country and test that appear more than once in sub.df1.

I also tried some other approaches to reach my goal, merge(), match(), table() and rle(), but none of them seems to fit my problem.

Kind regards, Brecht

Answer 1

If you just want the difference (the second and later occurrences of each country/test pair), you can use duplicated().

df[duplicated(df[, c('country', 'test')]), ]
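To make the behaviour of duplicated() concrete, here is a minimal sketch on a toy data frame. The values are made up for illustration and are not from the question. It also shows a base-R variant, using the fromLast argument, that keeps the first occurrence of each repeated pair as well.

```r
# Toy data with hypothetical values; stands in for the deduplicated df
df <- data.frame(
  country     = c("BE", "BE", "NL", "NL", "FR"),
  test        = c("a",  "a",  "b",  "c",  "a"),
  test_result = c(1,    2,    3,    4,    5),
  stringsAsFactors = FALSE
)

key <- df[, c("country", "test")]

# Second and later occurrences only (same idea as the line above)
later <- df[duplicated(key), ]

# Every row of a repeated pair, first occurrence included
all_dups <- df[duplicated(key) | duplicated(key, fromLast = TRUE), ]
```

Here later holds only the second BE/a row, while all_dups holds both BE/a rows, which is probably what you want when comparing the differing test_result values side by side.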

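If you want all rows of the duplicated pairs, including the first occurrence, you could use e.g. data.table.

library(data.table)
setDT(df)                          # convert to data.table by reference
setkeyv(df, c('country', 'test'))  # key on the two grouping columns
# Keyed join on the distinct repeated pairs; unique() avoids returning
# rows several times when a pair occurs three or more times
df[unique(df[duplicated(df[, list(country, test)]), list(country, test)])]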
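As an alternative to the keyed join, the same result can probably be reached by counting rows per group. This is a sketch under assumptions, not the answerer's method: the toy values and the names dt and res are hypothetical, and it assumes the data.table package is installed.

```r
library(data.table)

# Toy data with hypothetical values; BE/a occurs three times
dt <- data.table(
  country     = c("BE", "BE", "BE", "NL", "FR"),
  test        = c("a",  "a",  "a",  "b",  "a"),
  test_result = c(1,    2,    3,    4,    5)
)

# Keep every row of a (country, test) group that has more than one row;
# .N is the group size and .SD holds the remaining columns of the group
res <- dt[, if (.N > 1L) .SD, by = .(country, test)]
```

Each row is returned exactly once regardless of how often its pair occurs, so no extra unique() step is needed.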
