Python Pivot Table without Pandas?

I have an array like this:

[['1.6.2', '2016-01-11', 10], ['1.6.2', '2016-01-12', 100], ['1.6.2', '2016-01-13', 200], ['1.6.3', '2016-01-11', 300], ['1.6.3', '2016-01-12', 10], ['1.6.3', '2016-01-13', 21]]

I want to end up with something that looks like:

AV          1.6.2  1.6.3
DAY
2016-01-11     10    300
2016-01-12    100     10
2016-01-13    200     21

So the output array would look like:

[['Day', '1.6.2', '1.6.3'], ['2016-01-11', 10, 300], ['2016-01-12', 100, 10], ['2016-01-13', 200, 21]]

I can do this in pandas:

df = pd.DataFrame(dd, columns=['AV', 'DAY', 'COUNT'])

pd.pivot_table(df, values='COUNT', index='DAY', columns='AV')

However, I cannot use pandas because it is not supported on App Engine.

I need help implementing this. I can use NumPy or other pure-Python libraries.

Any help would be great!

Thanks.

May we assume the value combinations are unique, i.e. that a combination such as ('1.6.2', '2016-01-11') appears at most once? If so, we can use a lookup dictionary:

In [69]:

import itertools
import numpy as np

#make dd (the list from the question) into an array; the counts become strings
a_dd  = np.array(dd)
a_dd
Out[69]:
array([['1.6.2', '2016-01-11', '10'],
       ['1.6.2', '2016-01-12', '100'],
       ['1.6.2', '2016-01-13', '200'],
       ['1.6.3', '2016-01-11', '300'],
       ['1.6.3', '2016-01-12', '10'],
       ['1.6.3', '2016-01-13', '21']], 
      dtype='|S10')
In [70]:

#get the unique AV values (column 0) and DAY values (column 1)
u_av  = np.unique(a_dd[:,0])
u_day = np.unique(a_dd[:,1])

#the combinations of unique values:
list(itertools.product(u_av, 
                       u_day))
Out[70]:
[('1.6.2', '2016-01-11'),
 ('1.6.2', '2016-01-12'),
 ('1.6.2', '2016-01-13'),
 ('1.6.3', '2016-01-11'),
 ('1.6.3', '2016-01-12'),
 ('1.6.3', '2016-01-13')]

Dictionary lookup:

In [71]:

#a lookup dictionary keyed by (AV, DAY)
D = dict(zip(map(tuple, a_dd[:,[0,1]]),
             a_dd[:,2].tolist()))
D
Out[71]:
{('1.6.2', '2016-01-11'): '10',
 ('1.6.2', '2016-01-12'): '100',
 ('1.6.2', '2016-01-13'): '200',
 ('1.6.3', '2016-01-11'): '300',
 ('1.6.3', '2016-01-12'): '10',
 ('1.6.3', '2016-01-13'): '21'}
In [72]:

#pivot table: one row per AV, one column per DAY (transpose with .T for DAY rows)
#if a certain combination is not found, it will result in a None
#list() keeps this working on Python 3, where map returns an iterator
np.array(list(map(D.get, itertools.product(u_av, 
                                           u_day)))).reshape(len(u_av), len(u_day))
Out[72]:
array([['10', '100', '200'],
       ['300', '10', '21']], 
      dtype='|S3')
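
To get from a lookup dictionary to the exact list-of-lists layout asked for in the question (a header row, then one row per day, with the counts kept as integers), something along these lines should work. This is only a minimal sketch in plain Python with no NumPy; the function name `pivot` and its `fill` parameter are made up for illustration, and it assumes each (AV, DAY) pair appears at most once.

```python
def pivot(rows, fill=None):
    """Pivot [AV, DAY, COUNT] triples into a header row plus one row per DAY.

    Hypothetical helper for illustration; assumes each (AV, DAY) pair is unique.
    """
    # lookup dictionary keyed by (AV, DAY)
    lookup = {(av, day): count for av, day, count in rows}
    # sorted unique labels (plain string sort, see the note below)
    avs = sorted({av for av, _, _ in rows})
    days = sorted({day for _, day, _ in rows})

    table = [["Day"] + avs]
    for day in days:
        # missing combinations fall back to `fill`
        table.append([day] + [lookup.get((av, day), fill) for av in avs])
    return table


dd = [['1.6.2', '2016-01-11', 10], ['1.6.2', '2016-01-12', 100],
      ['1.6.2', '2016-01-13', 200], ['1.6.3', '2016-01-11', 300],
      ['1.6.3', '2016-01-12', 10], ['1.6.3', '2016-01-13', 21]]

result = pivot(dd)
for row in result:
    print(row)
```

Note that the labels are sorted as plain strings, so a version like '1.6.10' would sort before '1.6.2'; if that matters, a custom sort key would be needed.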

We first have to generate the cartesian product of the index and column values for the resulting pivot table. (In pandas this is pandas.tools.util.cartesian_product.) Without pandas we just have to reinvent the wheel with the standard library (itertools).
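
If the uniqueness assumption does not hold, a plain dictionary silently keeps only the last value seen for each key. One possible way around that, sketched below, is to accumulate the counts first and then walk the cartesian product of the row and column labels with `itertools.product`. The function name `pivot_sum` and the choice of summing duplicates are assumptions made for this example, not something the question specifies.

```python
import itertools
from collections import defaultdict


def pivot_sum(rows):
    """Sum COUNT per (DAY, AV) cell and lay the cells out as a grid.

    Hypothetical helper; summing is an assumed aggregation for duplicates.
    """
    totals = defaultdict(int)
    for av, day, count in rows:
        totals[(day, av)] += count

    days = sorted({day for _, day, _ in rows})
    avs = sorted({av for av, _, _ in rows})

    # itertools.product yields every (day, av) cell in row-major order;
    # .get() returns None for combinations that never occurred
    cells = [totals.get(key) for key in itertools.product(days, avs)]

    # cut the flat list of cells into one row per day
    width = len(avs)
    body = [cells[i * width:(i + 1) * width] for i in range(len(days))]
    return days, avs, body


rows = [['1.6.2', '2016-01-11', 10],
        ['1.6.2', '2016-01-11', 5],   # duplicate combination, gets summed
        ['1.6.3', '2016-01-12', 7]]

days, avs, body = pivot_sum(rows)
print(days, avs, body)
```

Here `days` are the row labels and `avs` the column labels, so `body[i][j]` is the total for `days[i]` and `avs[j]`.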
