I have a pandas DataFrame, and I am trying to calculate the Euclidean distance from a fixed point to every row and then find the shortest distance.
My DataFrame "currency":
Stype h line ... y y2 bc
45 currency 38 13 ... 1344 1382 (1731.0, 1363.0)
46 currency 38 13 ... 1343 1381 (2015.0, 1362.0)
47 currency 39 13 ... 1342 1381 (2267.5, 1361.5)
60 currency 39 15 ... 2718 2757 (488.0, 2737.5)
61 currency 39 15 ... 2717 2756 (813.5, 2736.5)
62 currency 39 15 ... 2718 2757 (1332.5, 2737.5)
63 currency 40 15 ... 2716 2756 (1821.5, 2736.0)
64 currency 39 15 ... 2715 2754 (2286.5, 2734.5)
68 currency 39 17 ... 2874 2913 (2287.5, 2893.5)
162 currency 30 22 ... 3311 3341 (1104.5, 3326.0)
Example value of [l['bc']] from my list:
[(2126.5, 2657.0)]
My code:
for l in label_dic:
    print('bc:', [l['bc']])
    print(cdist([l['bc']], currency.bc.values, 'euclidean'))
My issue:
ValueError: XB must be a 2-dimensional array.
I have validated my function with:
print(cdist([l['bc']], [l['bc']], 'euclidean'))
Result: [[0.]]
Can you explain how to fix it?
Thanks
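currency.bc.values seems to be giving a 1-D NumPy array of tuples, but cdist needs a 2-D NumPy array (one row per point, one column per coordinate). You can convert it to a 2-D array with np.array([*currency.bc.values]), or pass a list of tuples via currency.bc.values.tolist().
See the example below: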
from scipy.spatial import distance
import pandas as pd
import numpy as np

mypoint = [(0, 0)]
df = pd.DataFrame({'coord1': [(0, 10), (10, 0)]})

# Option 1: unpack the tuples into a 2-D NumPy array
print(distance.cdist(mypoint, np.array([*df.coord1.values]), 'euclidean'))

# Option 2: pass a plain list of tuples, which cdist converts to 2-D itself
print(distance.cdist(mypoint, df.coord1.values.tolist(), 'euclidean'))
This prints:
[[10. 10.]]
[[10. 10.]]
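To get from the distance matrix to the shortest distance the question asks about, something like the following sketch may help. It rebuilds a small, hypothetical subset of the "currency" DataFrame (only the bc column, with four of the rows shown in the question) and uses the example point (2126.5, 2657.0); the names target, points, nearest_label and shortest are illustrative and not from the original code.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import cdist

# Hypothetical subset of the "currency" DataFrame from the question (bc column only).
currency = pd.DataFrame(
    {"bc": [(1731.0, 1363.0), (2015.0, 1362.0), (1821.5, 2736.0), (2286.5, 2734.5)]},
    index=[45, 46, 63, 64],
)
target = [(2126.5, 2657.0)]  # the example value of [l['bc']]

# A column of tuples is a 1-D object array, which is what cdist rejects.
print(currency["bc"].to_numpy().shape)

# Convert the tuples into a 2-D float array: one row per point, columns x and y.
points = np.array(currency["bc"].tolist())
print(points.shape)

# cdist returns a 1 x N matrix here; take its single row.
distances = cdist(target, points, "euclidean")[0]

# Position of the smallest distance, mapped back to the DataFrame index label.
nearest_pos = distances.argmin()
nearest_label = currency.index[nearest_pos]
shortest = distances[nearest_pos]
print(nearest_label, shortest)
```

Inside the original loop, the same idea would apply per l: build points once outside the loop, then call cdist([l['bc']], points) and take argmin or min of the resulting row.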