I have an ndarray created by a cKDTree, like this:
idx = array([[2941, 4837, 3593],
[ 323, 3209, 3649]])
and I'd like to use it to create a pandas DataFrame, treating those values as indices into another data frame that holds some other symbols, for example:
2941, A
4837, B
3593, C
323, D
3209, E
3649, F
Using something like gdf = pd.DataFrame(idx), I'd like to get a DataFrame like this:
idx_0 idx_1 idx_2
0 A B C
1 D E F
instead of
idx_0 idx_1 idx_2
0 2941 4837 3593
1 323 3209 3649
How do I do that with a multidimensional array? df.loc[idx] won't work.
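Use Series.map with apply to map every column of the DataFrame (here the lookup frame df is assumed to have the numeric keys in column 'a' and the symbols in column 'b'):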
s = df.set_index('a')['b']
print(s)
a
2941 A
4837 B
3593 C
323 D
3209 E
3649 F
Name: b, dtype: object
idx = np.array([[2941, 4837, 3593],
[ 323, 3209, 3649]])
gdf = pd.DataFrame(idx).apply(lambda x: x.map(s))
print(gdf)
0 1 2
0 A B C
1 D E F
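For reference, here is a self-contained sketch of the same approach. The column names 'a' and 'b' are taken from the printed Series above; the add_prefix call is an addition, not part of the original answer, included only to get the idx_0, idx_1, idx_2 headers the question asks for.

```python
import numpy as np
import pandas as pd

# Lookup frame: numeric keys in 'a', symbols in 'b' (names assumed from the output above)
df = pd.DataFrame({
    "a": [2941, 4837, 3593, 323, 3209, 3649],
    "b": ["A", "B", "C", "D", "E", "F"],
})

# Index array as returned by the cKDTree query in the question
idx = np.array([[2941, 4837, 3593],
                [323, 3209, 3649]])

# Series keyed by 'a', so Series.map can translate each key to its symbol
s = df.set_index("a")["b"]

# Map column by column, then rename the columns 0, 1, 2 to idx_0, idx_1, idx_2
gdf = pd.DataFrame(idx).apply(lambda col: col.map(s)).add_prefix("idx_")
print(gdf)
```

Keys that are missing from s would come out as NaN rather than raising an error.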
You could use applymap (deprecated since pandas 2.1 in favor of DataFrame.map):
lookup = dict(zip(df[0], df[1]))
result = pd.DataFrame(idx).applymap(lookup.get)
print(result)
Output
0 1 2
0 A B C
1 D E F
Assuming df is:
0 1
0 2941 A
1 4837 B
2 3593 C
3 323 D
4 3209 E
5 3649 F
As an alternative, given that idx is a NumPy array, you could map using numpy.vectorize, and then build the DataFrame:
result = pd.DataFrame(np.vectorize(lookup.get)(idx))
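A runnable sketch of the numpy.vectorize variant, using the df with integer column labels 0 and 1 shown above. The second half is a different technique that neither answer uses: a Series.reindex lookup on the flattened array, reshaped afterwards. It assumes the keys in column 0 are unique.

```python
import numpy as np
import pandas as pd

# Lookup frame with default integer column labels, as in this answer
df = pd.DataFrame({
    0: [2941, 4837, 3593, 323, 3209, 3649],
    1: ["A", "B", "C", "D", "E", "F"],
})
idx = np.array([[2941, 4837, 3593],
                [323, 3209, 3649]])

# Dictionary lookup applied elementwise; np.vectorize is a convenience
# wrapper around a Python-level loop, not a compiled fast path
lookup = dict(zip(df[0], df[1]))
result = pd.DataFrame(np.vectorize(lookup.get)(idx))

# Alternative (not from the answers): flatten, look up via reindex, reshape.
# Assumes the keys in column 0 are unique.
s = df.set_index(0)[1]
alt = pd.DataFrame(s.reindex(idx.ravel()).to_numpy().reshape(idx.shape))

print(result)
print(alt)
```

With reindex, a key that is absent from the lookup yields NaN. With lookup.get it yields None, which np.vectorize may not handle cleanly when inferring the output type.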