I have an ff dataframe outgoing_windows_ff:
        edge      ipaddr port protocol windowed_qd class
1 1182430570 41.2.194.42 1299        1           0   WEB
2 1182430570 41.2.194.42 1302        1           0   WEB
I want to find the mutual relations among its rows, so I decided to make a copy of that dataframe with the non-key columns renamed:
outgoing_windows_ff_1 <- ffdf(edge=outgoing_windows_ff$edge,
                              ipaddr=outgoing_windows_ff$ipaddr,
                              influencing_port=outgoing_windows_ff$port,
                              influencing_proto=outgoing_windows_ff$protocol,
                              influencing_class=outgoing_windows_ff$class)
and then merge the 2 dataframes:
merged <- merge(x=outgoing_windows_ff, y=outgoing_windows_ff_1,
                by.x=c('edge','ipaddr'), by.y=c('edge','ipaddr'))
The result is:
        edge      ipaddr port protocol windowed_qd class influencing_port
1 1182430570 41.2.194.42 1299        1           0   WEB             1299
2 1182430570 41.2.194.42 1302        1           0   WEB             1299
but it is WRONG, because I would expect 4 rows in the result: each of the 2 rows should be paired with both rows that share its key.
Doing the merge between normal dataframes:
merged <- merge(x=as.data.frame(outgoing_windows_ff),
                y=as.data.frame(outgoing_windows_ff_1),
                by.x=c('edge','ipaddr'), by.y=c('edge','ipaddr'))
I get the correct result:
        edge      ipaddr port protocol windowed_qd class influencing_port influencing_proto
1 1182430570 41.2.194.42 1299        1           0   WEB             1299                 1
2 1182430570 41.2.194.42 1299        1           0   WEB             1302                 1
3 1182430570 41.2.194.42 1302        1           0   WEB             1299                 1
4 1182430570 41.2.194.42 1302        1           0   WEB             1302                 1
I think it is really DANGEROUS that the same operation gives 2 different results depending on whether ff dataframes or "normal dataframes" are used. This can lead to poisoned results without the experimenter knowing about it. My doubt is: "maybe other results that I obtained with the ff package are poisoned and I didn't realize it".
Have you read the documentation of merge.ffdf from package ffbase, which is the function you are using?
It says:
This method is similar as merge in the base package but only allows inner and left outer joins. Mark that joining is done based on ffmatch or ffdfmatch, meaning that only the *first* element in y will be added to x and ffdfmatch works on paste-ing together a key. So this might not be suited if your key contains columns of vmode double.
Note the restriction to inner and left outer joins. What you are doing with merge.ffdf is a full outer join, which is not supported by merge.ffdf. Note the word 'first' in the documentation. Also note that it pastes together a key.
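The first-match behaviour described in the documentation can be reproduced with plain data frames. The following is only a sketch in base R: it imitates the idea (paste a key, then look it up with match()) and is not the actual ffbase implementation. The data frames x and y and the "|" separator are assumptions made up for the illustration.

```r
# Two rows sharing the same (edge, ipaddr) key, as in the question.
x <- data.frame(edge   = c(1182430570, 1182430570),
                ipaddr = c("41.2.194.42", "41.2.194.42"),
                port   = c(1299, 1302),
                stringsAsFactors = FALSE)
y <- data.frame(edge   = x$edge,
                ipaddr = x$ipaddr,
                influencing_port = x$port,
                stringsAsFactors = FALSE)

# Paste the key columns together (the "|" separator is an arbitrary choice).
key_x <- paste(x$edge, x$ipaddr, sep = "|")
key_y <- paste(y$edge, y$ipaddr, sep = "|")

# match() returns only the position of the FIRST matching key in y,
# so every row of x is paired with row 1 of y.
idx <- match(key_x, key_y)
first_match <- cbind(x, influencing_port = y$influencing_port[idx])

# base::merge pairs every row of x with every matching row of y.
all_pairs <- merge(x, y, by = c("edge", "ipaddr"))

print(first_match)  # 2 rows, as in the ffdf result
print(all_pairs)    # 4 rows, as in the data.frame result
```

With a key that is unique in y, both approaches return the same rows; they only diverge when a key occurs more than once in y, which is exactly the situation in the question.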
If you need code that performs a full outer join on ff objects, feel free to contribute it to the GitHub repository of ffbase: https://github.com/edwindj/ffbase
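In the meantime, one way to avoid being silently surprised is to check whether the join key is unique in y before merging. The sketch below uses a plain data frame and a hypothetical helper, has_duplicate_keys, which is not part of ff or ffbase. With an ffdf, the key columns would first have to be brought into RAM, for example with as.data.frame as in the question.

```r
# Hypothetical helper (not part of ff or ffbase): TRUE if any combination
# of the key columns occurs more than once in df.
has_duplicate_keys <- function(df, keys) {
  anyDuplicated(df[, keys, drop = FALSE]) > 0
}

# Toy data mirroring the question: both rows share the same (edge, ipaddr).
y <- data.frame(edge   = c(1182430570, 1182430570),
                ipaddr = c("41.2.194.42", "41.2.194.42"),
                influencing_port = c(1299, 1302),
                stringsAsFactors = FALSE)

if (has_duplicate_keys(y, c("edge", "ipaddr"))) {
  message("Key is not unique in y: a first-match merge will drop pairings.")
}
```

If the check reports duplicates, the two kinds of merge will not agree, and the join has to be done some other way (for instance on data frames, if the data fits in memory).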