There is a SFrame with columns having dict
elements.
import graphlab
import numpy as np
a = graphlab.SFrame({'col1':[{'oshan':3,'modi':4},{'ravi':1,'kishan':5}],
'col2':[{'oshan':1,'rawat':2},{'hari':3,'kishan':4}]})
I want to calculate cosine
distance between these two columns for each row of the SFrame. Below is the operation using for loop
.
dis = np.zeros(len(a),dtype = float)
for i in range(len(a)):
dis[i] = graphlab.distances.cosine(a['col1'][i],a['col2'][i])
a['distance12'] = dis
This is very inefficient and would take hours if the number of rows was large. Could someone please suggest a better approach.
You can usually avoid looping over an SFrame by using the apply
function. In your case, it would look like this:
a.apply(lambda row: graphlab.distances.cosine(row['col1'], row['col2']))
That should be significantly faster than looping in Python.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.