Counting frequency of value in a column in a data frame in R not giving expected results

I have two data frames, one small and one large; their sizes are shown below. I expect to see each value in a column along with its frequency, and I expect much higher counts in the larger data frame than in the smaller one.

> length(smalldf$col1)
[1] 5377
> length(largedf$col1)
[1] 56016

When I count the unique values in each, I get the following. This is not what I expected: I'm certain the larger data frame contains many more new (unique) values than the smaller one.

> length(unique(smalldf$col1))
[1] 4697
> length(unique(largedf$col1))
[1] 4698

If I print the unique values, largedf has the same 4697 elements as smalldf, plus NA at the end.
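A minimal sketch of why the two counts can differ by exactly one: unique() treats NA as a value of its own. The vectors small_col and large_col below are made-up stand-ins for smalldf$col1 and largedf$col1, not the real data.

```r
# Toy vectors standing in for smalldf$col1 and largedf$col1 (hypothetical data)
small_col <- c("a", "b", "b", "c")
large_col <- c("a", "b", "b", "c", "c", "c", NA, NA)

# unique() keeps NA as a distinct value, so the larger vector reports one extra entry
n_small <- length(unique(small_col))
n_large <- length(unique(large_col))

# Count the missing entries directly
n_missing <- sum(is.na(large_col))

# Values present in the larger vector but absent from the smaller one
extra_values <- setdiff(large_col, small_col)
```

On the real data, sum(is.na(largedf$col1)) would show how many of the 56016 entries are missing rather than genuinely new values.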

So I tried printing the rows of the larger data frame that are not in the smaller one, but I just get all my columns with NA as their value.

> library('sqldf')
> a1NotIna2 <- sqldf('SELECT * FROM largedf EXCEPT SELECT * FROM smalldf')
> a1NotIna2

This just gives me all my columns with NA in them.

Finally, I tried to find the frequency of each value in the large data frame, and I get the same result for both.
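The frequency code itself is not shown in the post. As a hedged sketch, one common way to count occurrences in base R is table(); the vector large_col and the column names value and count are assumptions for illustration.

```r
# Toy vector standing in for largedf$col1 (hypothetical data)
large_col <- c("a", "b", "b", "c", "c", "c", NA, NA)

# table() drops NA by default; useNA = "ifany" adds an NA bucket when NAs exist
freq <- table(large_col, useNA = "ifany")

# Convert to a data frame with one row per distinct value
freq_df <- as.data.frame(freq, stringsAsFactors = FALSE)
names(freq_df) <- c("value", "count")
```

If the NA bucket accounts for most of the rows, the frequencies of the remaining values will look the same in both data frames.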

You could try

largedf <- totaldataset[Reduce(`|`, lapply(totaldataset[19:43], 
                     function(x) x=='4280')), ]
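One caveat with this kind of filter, sketched on made-up data: a comparison against NA yields NA, and indexing a data frame with NA in a logical vector returns rows filled with NA. That may be where the all-NA rows come from, though the post does not confirm it. The small totaldataset below, with columns code1 and code2 in place of columns 19:43, is hypothetical.

```r
# Hypothetical stand-in for totaldataset, with two code columns instead of columns 19:43
totaldataset <- data.frame(
  id    = 1:4,
  code1 = c("4280", "1234", NA, NA),
  code2 = c("9999", "4280", "5555", NA),
  stringsAsFactors = FALSE
)

# Row-wise OR of the per-column comparisons; NA | FALSE and NA | NA stay NA
hit <- Reduce(`|`, lapply(totaldataset[2:3], function(x) x == "4280"))

# Indexing with NA in a logical vector returns rows filled with NA
with_na_rows <- totaldataset[hit, ]

# which() keeps only the positions that are TRUE, dropping NA and FALSE
matched <- totaldataset[which(hit), ]
```

Wrapping the condition in which() is one way to keep only the rows that actually match; replacing x == '4280' with x %in% '4280' is another, since %in% never returns NA.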

