I have two data frames, one small and one large; their sizes are shown below. I want to see each value in a column along with its frequency, and I expect much higher numbers in the larger data frame than in the smaller one.
> length(smalldf$col1)
[1] 5377
> length(largedf$col1)
[1] 56016
When I try to find the number of unique values in each of these, I get the following. This is not what I expected: I'm certain that there are many more new (unique) values in the larger data frame than in the smaller one.
> length(unique(smalldf$col1))
[1] 4697
> length(unique(largedf$col1))
[1] 4698
If I print out the unique values, largedf has the same 4697 elements as smalldf, plus NA at the end.
So I tried printing the rows of the larger data frame that are not part of the smaller data frame, but I just get all my columns with NA as their value:
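One way to confirm this observation is to compare the two sets of unique values directly and count the missing entries. The following is only a minimal sketch on toy data; the values in `smalldf` and `largedf` here are made up for illustration and are not the real data.

```r
# Toy stand-ins for the two data frames (hypothetical values)
smalldf <- data.frame(col1 = c("a", "b", "c", "a"),
                      stringsAsFactors = FALSE)
largedf <- data.frame(col1 = c("a", "b", "c", "a", "b", NA, NA),
                      stringsAsFactors = FALSE)

# Values that occur in largedf$col1 but never in smalldf$col1
only_in_large <- setdiff(unique(largedf$col1), unique(smalldf$col1))
print(only_in_large)

# Number of missing entries in the large column
na_count <- sum(is.na(largedf$col1))
print(na_count)
```

If `only_in_large` contains nothing but NA and `na_count` is large, the extra rows in the larger data frame are mostly missing values rather than new values.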
> library('sqldf')
> a1NotIna2 <- sqldf('SELECT * FROM largedf EXCEPT SELECT * FROM smalldf')
> a1NotIna2
This just gives me all my columns with NA against them.
Finally, I try to find the frequency of each value in the large data frame. I get the same result for both data frames.
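The frequency code is not shown in the question. As a rough sketch, assuming base R's `table()` is acceptable, the counts per value (including NA) could be obtained like this; the data are again made up for illustration.

```r
# Toy stand-in for the large data frame (hypothetical values)
largedf <- data.frame(col1 = c("a", "b", "a", NA, "c", "a", NA),
                      stringsAsFactors = FALSE)

# Count how often each value occurs; useNA = "ifany" adds a bucket
# for NA, which table() would otherwise drop silently
freq <- table(largedf$col1, useNA = "ifany")

# Most frequent values first
freq_sorted <- sort(freq, decreasing = TRUE)
print(freq_sorted)
```

Without `useNA`, the NA entries are excluded from the counts, which can make two columns look identical even when one of them has many more rows.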
You could try
largedf <- totaldataset[Reduce(`|`, lapply(totaldataset[19:43],
                               function(x) x == '4280')), ]
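To illustrate how a filter of this shape behaves when the columns contain NA, here is a small sketch on a made-up `totaldataset`. The column names `c1` and `c2` are hypothetical stand-ins for columns 19:43. Whether this is the cause of the NA rows in the question is an assumption: `==` propagates NA into the row mask, and indexing with an NA mask yields all-NA rows, whereas `%in%` returns FALSE for NA.

```r
# Toy data set (hypothetical); c1 and c2 stand in for columns 19:43
totaldataset <- data.frame(
  id = 1:4,
  c1 = c("4280", "1111", NA, "2222"),
  c2 = c("3333", NA, NA, "4280"),
  stringsAsFactors = FALSE
)
cols <- c("c1", "c2")

# Row mask built with `==`: comparisons against NA give NA
mask_eq <- Reduce(`|`, lapply(totaldataset[cols], function(x) x == "4280"))

# Row mask built with %in%: NA entries simply count as FALSE
mask_in <- Reduce(`|`, lapply(totaldataset[cols], function(x) x %in% "4280"))

with_na_rows <- totaldataset[mask_eq, ]  # includes all-NA rows
clean_rows   <- totaldataset[mask_in, ]  # only rows that really match

print(with_na_rows)
print(clean_rows)
```

In this toy example the `==` version returns four rows, two of which are entirely NA, while the `%in%` version returns only the two matching rows.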