I have the following 3x3x3 (3 rows, 3 columns with 3 elements in each cell) numpy array...
[[[1, 1, 19],
[2, 2, 29],
[3, 3, 39]],
[[4, 4, 49],
[1, 1, 19],
[2, 2, 29]],
[[3, 3, 39],
[9, 9, 99],
[8, 8, 89]]]
and the following pandas dataframe...
col0 col1 col2 col3
1 1 19 10
2 2 29 20
3 3 39 30
4 4 49 40
8 8 89 80
9 9 99 90
I want to generate a new pandas data frame using values from col3, by matching each 3-element array (e.g. [1, 1, 19] or [4, 4, 49]) against col0, col1, col2.
The order of the 3-element array is important: the first element must match col0, the second col1, and so on.
The resulting data frame would look like the following...
colA colB colC
10 20 30
40 10 20
30 90 80
Call the array needles and the DataFrame haystack. First, index the haystack:
haystack.set_index(['col0', 'col1', 'col2'], inplace=True)
Now you can get the values for the first set of needles:
haystack.loc[list(map(tuple, needles[0]))]
This gives you the first row of your solution (in col3):
col3
col0 col1 col2
1 1 19 10
2 2 29 20
3 3 39 30
Finally, do that for every 3x3 array along the first axis of needles:
pd.DataFrame(haystack.loc[list(map(tuple, pin))].col3.values for pin in needles)
This gives you the result:
0 1 2
0 10 20 30
1 40 10 20
2 30 90 80
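For reference, the steps above can be put together as one self-contained sketch. The input data is copied from the question. The names indexed, col3 and result are illustrative only, and the example uses a non-inplace set_index so that haystack itself is left unchanged, which is a small departure from the snippet above.

```python
import numpy as np
import pandas as pd

# Input data copied from the question.
needles = np.array([
    [[1, 1, 19], [2, 2, 29], [3, 3, 39]],
    [[4, 4, 49], [1, 1, 19], [2, 2, 29]],
    [[3, 3, 39], [9, 9, 99], [8, 8, 89]],
])
haystack = pd.DataFrame({
    "col0": [1, 2, 3, 4, 8, 9],
    "col1": [1, 2, 3, 4, 8, 9],
    "col2": [19, 29, 39, 49, 89, 99],
    "col3": [10, 20, 30, 40, 80, 90],
})

# Index by the three key columns (non-inplace variant of the step above).
indexed = haystack.set_index(["col0", "col1", "col2"])
col3 = indexed["col3"]

# One lookup per 3x3 block; each block becomes one row of the result.
rows = [col3.loc[list(map(tuple, pin))].to_numpy() for pin in needles]

# Column names taken from the desired output in the question.
result = pd.DataFrame(rows, columns=["colA", "colB", "colC"])
print(result)
```

Note that .loc raises a KeyError if a needle has no matching row in haystack, so this sketch assumes every needle is present.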
An alternative which may or may not be faster:
pd.DataFrame(haystack.col3[pd.MultiIndex.from_arrays(pin.T)].values for pin in needles)
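A further variation, not part of the answer above, is to flatten needles into a single list of keys and do one lookup instead of one per block. The sketch below uses Series.reindex rather than .loc, and assumes the haystack keys are unique. With reindex, a needle that has no match would yield NaN instead of raising. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

needles = np.array([
    [[1, 1, 19], [2, 2, 29], [3, 3, 39]],
    [[4, 4, 49], [1, 1, 19], [2, 2, 29]],
    [[3, 3, 39], [9, 9, 99], [8, 8, 89]],
])
haystack = pd.DataFrame({
    "col0": [1, 2, 3, 4, 8, 9],
    "col1": [1, 2, 3, 4, 8, 9],
    "col2": [19, 29, 39, 49, 89, 99],
    "col3": [10, 20, 30, 40, 80, 90],
})
col3 = haystack.set_index(["col0", "col1", "col2"])["col3"]

# Flatten the 3x3x3 array to 9 keys of 3 elements each.
flat = needles.reshape(-1, needles.shape[-1])

# Build one MultiIndex holding all 9 keys (one array per index level).
keys = pd.MultiIndex.from_arrays(list(flat.T))

# Single lookup, then restore the original 3x3 layout.
looked_up = col3.reindex(keys).to_numpy()
result = pd.DataFrame(looked_up.reshape(needles.shape[:2]))
print(result)
```

Whether this is faster than the per-block versions depends on the data sizes, so it is worth timing on real input.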
The map or MultiIndex.from_arrays() is needed because, unfortunately, pandas doesn't allow MultiIndex lookups by 2D arrays, only by lists (or arrays) of tuples. For more on that, see: Pandas MultiIndex lookup with Numpy arrays
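To make the two conversions concrete, here is a small sketch of what each one produces for a single 3x3 block. The variable names pin, as_tuples and as_index are illustrative.

```python
import numpy as np
import pandas as pd

# One 3x3 block of needles: each row is one (col0, col1, col2) key.
pin = np.array([[1, 1, 19], [2, 2, 29], [3, 3, 39]])

# Option 1: a plain list of tuples, one tuple per key.
as_tuples = list(map(tuple, pin))

# Option 2: a MultiIndex. from_arrays expects one array per index level,
# hence the transpose: pin.T has one row per level.
as_index = pd.MultiIndex.from_arrays(list(pin.T))

print(as_tuples)
print(as_index.tolist())
```

Both forms describe the same three keys, so either can be used to select rows from the indexed haystack.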