Reshaping a table with Pandas/Python

Hi, I have the table below in a Pandas DataFrame and wish to reshape it:

    q_string    q_visits    q_date
0   nucleus         1790        2012-10-02 00:00:00
1   neuron          364         2012-10-02 00:00:00
2   current         280         2012-10-02 00:00:00
3   molecular       259         2012-10-02 00:00:00
4   stem            201         2012-10-02 00:00:00

I want to use q_date as the column titles and q_string as the row labels, with q_visits in the intersecting cells.

What's the best way of doing this in Pandas/Python?

This is a typical use case for pivot_table (in current pandas the keywords are index and columns; the older rows and cols have been removed):

>>> df.pivot_table(values='q_visits', columns='q_date', index='q_string')
q_date     2012-10-02 00:00:00
q_string                      
current                    280
molecular                  259
neuron                     364
nucleus                   1790
stem                       201
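
For a version that can be pasted into a fresh session, here is a minimal sketch that rebuilds the question's table and pivots it. It assumes a recent pandas release. The `aggfunc='sum'` argument is my own assumption: it only matters when a (q_string, q_date) pair occurs more than once, in which case the default would average the values instead.

```python
import pandas as pd

# Rebuild the table from the question.
df = pd.DataFrame({
    'q_string': ['nucleus', 'neuron', 'current', 'molecular', 'stem'],
    'q_visits': [1790, 364, 280, 259, 201],
    'q_date': ['2012-10-02 00:00:00'] * 5,
})

# q_string becomes the row labels, q_date the column labels,
# and q_visits fills the cells. Duplicate pairs would be summed.
wide = df.pivot_table(values='q_visits', index='q_string',
                      columns='q_date', aggfunc='sum')
print(wide)
```

The row labels come out sorted alphabetically, which is why the order differs from the input table.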

pivot_table works, but I've used a longhand version for readability.

import pandas as pd

# Same rows as the question, except the first date is changed to
# 2012-10-01 so the result has two columns.
data = [['nucleus', 1790, '2012-10-01 00:00:00'],
        ['neuron', 364, '2012-10-02 00:00:00'],
        ['current', 280, '2012-10-02 00:00:00'],
        ['molecular', 259, '2012-10-02 00:00:00'],
        ['stem', 201, '2012-10-02 00:00:00']]
df = pd.DataFrame(data, columns=['q_string', 'q_visits', 'q_date'])

    q_string  q_visits               q_date
0    nucleus      1790  2012-10-01 00:00:00
1     neuron       364  2012-10-02 00:00:00
2    current       280  2012-10-02 00:00:00
3  molecular       259  2012-10-02 00:00:00
4       stem       201  2012-10-02 00:00:00

Assign both q_string and q_date to the index:

df.set_index(['q_string', 'q_date'], inplace=True)

The index now looks like this:

MultiIndex(levels=[['current', 'molecular', 'neuron', 'nucleus', 'stem'], 
                   ['2012-10-01 00:00:00', '2012-10-02 00:00:00']],
           labels=[[3, 2, 0, 1, 4], [0, 1, 1, 1, 1]],
           names=['q_string', 'q_date'])
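
The repr above comes from an older pandas release. Newer releases print a MultiIndex as a list of tuples and expose the integer positions as `codes` rather than `labels`. A small sketch, assuming pandas 0.24 or later, that inspects the same pieces directly:

```python
import pandas as pd

data = [['nucleus', 1790, '2012-10-01 00:00:00'],
        ['neuron', 364, '2012-10-02 00:00:00'],
        ['current', 280, '2012-10-02 00:00:00'],
        ['molecular', 259, '2012-10-02 00:00:00'],
        ['stem', 201, '2012-10-02 00:00:00']]
df = pd.DataFrame(data, columns=['q_string', 'q_visits', 'q_date'])
df = df.set_index(['q_string', 'q_date'])

idx = df.index
names = list(idx.names)                              # level names
levels = [list(level) for level in idx.levels]       # unique values per level
codes = [[int(c) for c in level_codes]               # position of each row's value
         for level_codes in idx.codes]               # within its level

print(names)
print(levels)
print(codes)
```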

Both q_string and q_date are now index levels of the data, so we just call unstack() to move q_date into the columns.

df.unstack()

                    q_visits                   
q_date    2012-10-01 00:00:00 2012-10-02 00:00:00
q_string                                         
current                   NaN               280.0
molecular                 NaN               259.0
neuron                    NaN               364.0
nucleus                1790.0                 NaN
stem                      NaN               201.0
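
The result above carries an extra q_visits level on the columns and uses NaN for missing combinations, which turns the counts into floats. If that is unwanted, one possible variation (a sketch, not part of the original answer) is to select the q_visits Series before unstacking and pass `fill_value=0`:

```python
import pandas as pd

data = [['nucleus', 1790, '2012-10-01 00:00:00'],
        ['neuron', 364, '2012-10-02 00:00:00'],
        ['current', 280, '2012-10-02 00:00:00'],
        ['molecular', 259, '2012-10-02 00:00:00'],
        ['stem', 201, '2012-10-02 00:00:00']]
df = pd.DataFrame(data, columns=['q_string', 'q_visits', 'q_date'])

# Non-inplace set_index keeps the original frame untouched.
indexed = df.set_index(['q_string', 'q_date'])

# Unstacking the Series (not the DataFrame) yields single-level columns;
# fill_value=0 replaces the missing (q_string, q_date) combinations.
wide = indexed['q_visits'].unstack('q_date', fill_value=0)
print(wide)
```

Whether 0 is the right filler depends on the data: it treats "no row for that date" as "no visits", which may or may not be what you mean.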
