Counting frequency of value in a column in a data frame in R not giving expected results

Question

I have two data frames, one small and one large; their sizes are shown below. Basically, I expect to see each value in a column along with its frequency, and I expect a much higher number in the larger data frame than in the smaller one.

> length(smalldf$col1)
[1] 5377
> length(largedf$col1)
[1] 56016

When I count the unique values in each of these, I get the following. This is not what I expected: I'm certain that the larger data frame contains many more new (unique) values than the smaller one.

> length(unique(smalldf$col1))
[1] 4697
> length(unique(largedf$col1))
[1] 4698

If I print out the unique values, largedf has the same 4697 elements as smalldf, plus NA at the end.

So I tried printing the rows of the larger data frame that are not in the smaller data frame, but I just get all my columns with NA as their value:

> library(sqldf)
> a1NotIna2 <- sqldf('SELECT * FROM largedf EXCEPT SELECT * FROM smalldf')
> a1NotIna2

This just gives me all my columns with NA as the value.
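
As a cross-check that does not depend on SQL, the values that occur only in the larger data frame can be listed with base R's setdiff(). This is a minimal sketch on toy data; the two small frames below are hypothetical stand-ins for smalldf and largedf, not the real data.

```r
# Toy stand-ins for the real data frames (hypothetical values)
smalldf <- data.frame(col1 = c("a", "b", "b", "c"), stringsAsFactors = FALSE)
largedf <- data.frame(col1 = c("a", "b", "c", "c", NA, NA), stringsAsFactors = FALSE)

# Distinct values present in largedf$col1 but absent from smalldf$col1
only_in_large <- setdiff(largedf$col1, smalldf$col1)
print(only_in_large)

# Number of missing entries in the larger frame's column
n_missing <- sum(is.na(largedf$col1))
print(n_missing)
```

If the only value that comes back is NA, the extra "unique value" in the larger frame is just missing data rather than a genuinely new value.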

Finally, I tried to find the frequency of each value in the large data frame. I get the same result for both.
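
The code used for this step is not shown. For reference, a common base R way to count how often each value occurs is table(); the sketch below uses a hypothetical toy column, and useNA = "ifany" is included so that missing entries are counted rather than silently dropped.

```r
# Toy column with repeated values and missing entries (hypothetical data)
col1 <- c("x", "y", "x", NA, "x", NA)

# Frequency of each value; useNA = "ifany" also counts the NA entries
freq <- table(col1, useNA = "ifany")
print(freq)

# Same counts as a data frame, most frequent value first
freq_df <- as.data.frame(freq, stringsAsFactors = FALSE)
freq_df <- freq_df[order(-freq_df$Freq), ]
print(freq_df)
```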

Answer 1

You could try

# Keep the rows where any of columns 19 to 43 equals '4280'
largedf <- totaldataset[Reduce(`|`, lapply(totaldataset[19:43],
                     function(x) x=='4280')), ]
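
One possible source of the all-NA rows, assuming largedf is built with a logical subset like this one: comparing NA with a value gives NA, and indexing a data frame with an NA logical returns a row made entirely of NA. The sketch below shows this on hypothetical toy data (the column names id, code1 and code2 are made up), and wraps the condition in which() so that only the TRUE positions are kept.

```r
# Toy data (hypothetical): row 3 is missing both code columns
totaldataset <- data.frame(id = 1:4,
                           code1 = c("4280", "1111", NA, "2222"),
                           code2 = c("3333", "4280", NA, "4444"),
                           stringsAsFactors = FALSE)

# Row-wise "any code column equals 4280"; NA when nothing matches
# and at least one of the compared values is missing
hit <- Reduce(`|`, lapply(totaldataset[2:3], function(x) x == "4280"))
print(hit)

# Indexing with an NA logical adds a row consisting only of NA
with_na_row <- totaldataset[hit, ]
print(with_na_row)

# which() returns only the TRUE positions, so no NA row appears
without_na_row <- totaldataset[which(hit), ]
print(without_na_row)
```

Whether this is what happened with the real data depends on how largedf was created, which the question does not show.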

solution1
3 ACCEPTED 2015-01-18 11:29:07