
Pandas Panel as numpy multidimensional array

I'm new to pandas. I'm trying to build an example in which I have a three-dimensional grid of points with dimensions N1 x N2 x N3. Each point in the grid holds some data (a numpy.ndarray, a float, etc.). For example, at Point(0,0,0): 'date' = 18 , 'temperature array' = numpy.array([[1,2,3], [4,5,6]]) .

My aim is to create a pandas "table" (I guess a pandas.Panel ) so that table['date'][Point(0,0,0)] returns 18 and table['temperature array'][Point(0,0,0)] returns numpy.array([[1,2,3], [4,5,6]]) .

Could you post a simple example? It would help me a lot.

Thanks

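Panel4D is probably what you're looking for. It essentially creates a multilevel index from your first 3 dimensions, so a grid point can be looked up by its three coordinates. Note that later pandas releases removed Panel4D, Panel, and the .ix indexer, so the code below only runs on old pandas versions.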

import pandas as pd
import numpy as np

# 100 x 100 x 100 x 3 array
data = np.random.randn(100,100,100,3)
data[0,0,0]
# out[]:
#     array([ 0.85284721, -0.04883839, -0.09375558])

table = pd.Panel4D(data)
table.ix[0,0,0]
# out[]:
#     minor
#     0    0.852847
#     1   -0.048838
#     2   -0.093756
#     Name: 0, dtype: float64
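
Since Panel4D is not available in recent pandas releases, the same lookup can be sketched with a DataFrame whose rows are labelled by a three-level MultiIndex. This is only an illustration under assumptions: the grid size, the level names x, y, z, and the seeded random generator are made up for the example.

```python
import numpy as np
import pandas as pd

# Small illustrative grid: n1 x n2 x n3 points, n_values numbers per point
n1, n2, n3, n_values = 4, 5, 6, 3
rng = np.random.default_rng(0)
data = rng.standard_normal((n1, n2, n3, n_values))

# One row per grid point, labelled by its three coordinates.
# from_product varies the last level fastest, matching the C-order reshape.
index = pd.MultiIndex.from_product(
    [range(n1), range(n2), range(n3)], names=["x", "y", "z"]
)
table = pd.DataFrame(data.reshape(-1, n_values), index=index)

# Row for grid point (0, 0, 0); holds the same numbers as data[0, 0, 0]
point = table.loc[(0, 0, 0)]
print(point)
```

Each grid point becomes one row, so the frame has n1 * n2 * n3 rows and stays in a plain float64 dtype.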

I haven't figured out how to store a numpy array in a single cell of a Pandas NDFrame, and I doubt it would be a good idea even if I succeeded. Pandas NDFrames use NumPy arrays to store the underlying data. To store a NumPy array as a value inside a NumPy array, the outer NumPy array would need to be of object dtype, which prevents NumPy (and Pandas) from applying fast numeric routines to the array. So nesting a NumPy array inside an NDFrame (such as a Panel or DataFrame) would hurt performance.
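
The dtype point can be seen in a small sketch. The variable names are made up, and the outer container is built by hand as an object array so that NumPy does not try to stack the inner arrays into extra dimensions.

```python
import numpy as np
import pandas as pd

# A column of plain floats is backed by a float64 array
numeric = pd.Series([1.0, 2.0, 3.0])

# To hold arrays as cell values, the outer array must have object dtype
cells = np.empty(2, dtype=object)
cells[0] = np.array([[1, 2, 3], [4, 5, 6]])
cells[1] = np.array([[10, 20, 30], [40, 50, 60]])
nested = pd.Series(cells)

print(numeric.dtype)  # float64
print(nested.dtype)   # object
```

Each element of nested is a separate Python object, so pandas can no longer treat the column as one contiguous numeric block.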

You could instead use a DataFrame with a MultiIndex:

import pandas as pd

index = pd.MultiIndex.from_tuples([(0,0,0),(0,0,1)])
table = pd.DataFrame([(18,1,2,3,4,5,6),
                      (19,10,20,30,40,50,60)], index=index,
                     columns=['date','t0','t1','t2','t3','t4','t5'])
print(table)
#        date  t0  t1  t2  t3  t4  t5
# 0 0 0    18   1   2   3   4   5   6
#     1    19  10  20  30  40  50  60

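# .ix was removed in pandas 1.0; .loc does the same label-based lookup
print(table.loc[(0,0,0),'date'])
# 18

# Series.reshape no longer exists, so convert to a NumPy array first
print(table.loc[(0,0,1),'t0':'t5'].to_numpy().reshape(2,-1))
# [[10 20 30]
#  [40 50 60]]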
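
To get closer to the Point(0,0,0) interface from the question, the lookups can be wrapped in small helpers. This is a sketch only: Point, date_at and temperature_at are hypothetical names invented for this example, and the (2, 3) shape is assumed from the sample array in the question.

```python
from collections import namedtuple

import pandas as pd

# Hypothetical grid-point type; any 3-tuple of coordinates would work
Point = namedtuple("Point", ["x", "y", "z"])

index = pd.MultiIndex.from_tuples([(0, 0, 0), (0, 0, 1)])
table = pd.DataFrame(
    [(18, 1, 2, 3, 4, 5, 6), (19, 10, 20, 30, 40, 50, 60)],
    index=index,
    columns=["date", "t0", "t1", "t2", "t3", "t4", "t5"],
)


def date_at(table, point):
    """Return the 'date' value stored at a grid point."""
    return table.loc[tuple(point), "date"]


def temperature_at(table, point, shape=(2, 3)):
    """Return the flattened t0..t5 columns reshaped to the original array."""
    return table.loc[tuple(point), "t0":"t5"].to_numpy().reshape(shape)


print(date_at(table, Point(0, 0, 0)))
print(temperature_at(table, Point(0, 0, 0)))
```

The point is converted with tuple(point) so that pandas sees a plain tuple key for the MultiIndex.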

