Reshaping a table with Pandas/Python

Hi, I have the table below in a Pandas DataFrame and wish to reshape it:

    q_string    q_visits    q_date
0   nucleus         1790        2012-10-02 00:00:00
1   neuron          364         2012-10-02 00:00:00
2   current         280         2012-10-02 00:00:00
3   molecular       259         2012-10-02 00:00:00
4   stem            201         2012-10-02 00:00:00

I want to put q_date as column titles, q_string as row labels and have q_visits in the intersecting cells.

What's the best way of doing this in Pandas/Python?

This is a typical use case for pivot_table:

>>> df.pivot_table(values='q_visits', columns='q_date', index='q_string')
q_date     2012-10-02 00:00:00
q_string                      
current                    280
molecular                  259
neuron                     364
nucleus                   1790
stem                       201
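As a self-contained sketch of the pivot_table approach, the snippet below rebuilds the question's table and pivots it. The variable name `wide` and the choice of `aggfunc="sum"` are assumptions made for illustration; by default pivot_table averages duplicate (q_string, q_date) pairs, which may or may not be what you want for visit counts.

```python
import pandas as pd

# Rebuild the table from the question.
df = pd.DataFrame({
    "q_string": ["nucleus", "neuron", "current", "molecular", "stem"],
    "q_visits": [1790, 364, 280, 259, 201],
    "q_date": pd.to_datetime(["2012-10-02"] * 5),
})

# q_string becomes the row labels, q_date the column labels,
# and q_visits fills the cells. aggfunc="sum" (an assumption here)
# adds up visits if a (q_string, q_date) pair occurs more than once.
wide = df.pivot_table(
    values="q_visits",
    index="q_string",
    columns="q_date",
    aggfunc="sum",
)
print(wide)
```

If every (q_string, q_date) pair is known to be unique, df.pivot(index='q_string', columns='q_date', values='q_visits') is a stricter alternative: it raises an error on duplicates instead of aggregating them.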

pivot_table works, but I've used a longhand version for readability.

import pandas as pd

# Sample rows; the first row uses a different date so the result has two columns
data = [['nucleus', 1790, '2012-10-01 00:00:00'],
    ['neuron', 364, '2012-10-02 00:00:00'],
    ['current', 280, '2012-10-02 00:00:00'],
    ['molecular', 259, '2012-10-02 00:00:00'],
    ['stem', 201, '2012-10-02 00:00:00']]
df = pd.DataFrame(data, columns=['q_string', 'q_visits', 'q_date'])

    q_string  q_visits               q_date
0    nucleus      1790  2012-10-01 00:00:00
1     neuron       364  2012-10-02 00:00:00
2    current       280  2012-10-02 00:00:00
3  molecular       259  2012-10-02 00:00:00
4       stem       201  2012-10-02 00:00:00

Assign both the q_string and q_date to the index:

df.set_index(['q_string', 'q_date'], inplace=True)

The index now looks like this:

MultiIndex(levels=[['current', 'molecular', 'neuron', 'nucleus', 'stem'],
                   ['2012-10-01 00:00:00', '2012-10-02 00:00:00']],
           labels=[[3, 2, 0, 1, 4], [0, 1, 1, 1, 1]],
           names=['q_string', 'q_date'])
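To make the index structure concrete, here is a small sketch that inspects the MultiIndex piece by piece. It assumes a reasonably recent pandas, where the integer positions shown as `labels` in the repr above are exposed through the `codes` attribute; the variable name `idx` is only for illustration.

```python
import pandas as pd

data = [['nucleus', 1790, '2012-10-01 00:00:00'],
        ['neuron', 364, '2012-10-02 00:00:00'],
        ['current', 280, '2012-10-02 00:00:00'],
        ['molecular', 259, '2012-10-02 00:00:00'],
        ['stem', 201, '2012-10-02 00:00:00']]
df = pd.DataFrame(data, columns=['q_string', 'q_visits', 'q_date'])

# set_index without inplace=True returns a new frame.
indexed = df.set_index(['q_string', 'q_date'])
idx = indexed.index

# Names of the two index levels.
print(list(idx.names))
# Unique values of the first level, stored in sorted order.
print(idx.levels[0].tolist())
# For each row, the position of its q_string within that level.
print(idx.codes[0].tolist())
```

Each row's index entry is a (q_string, q_date) pair, which is what lets unstack() move one of the two levels into the columns.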

Both q_string and q_date are now index levels of the data, so we can just unstack() to move q_date into the columns.

df.unstack()

                    q_visits                   
q_date    2012-10-01 00:00:00 2012-10-02 00:00:00
q_string                                         
current                   NaN               280.0
molecular                 NaN               259.0
neuron                    NaN               364.0
nucleus                1790.0                 NaN
stem                      NaN               201.0
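The set_index/unstack steps can also be chained into one expression. The sketch below is one possible variant, not the only way to do it: selecting the q_visits column first avoids the extra "q_visits" level in the column header, and fill_value=0 (an assumption about how missing dates should be treated) replaces the NaN cells so the counts stay integers.

```python
import pandas as pd

data = [['nucleus', 1790, '2012-10-01 00:00:00'],
        ['neuron', 364, '2012-10-02 00:00:00'],
        ['current', 280, '2012-10-02 00:00:00'],
        ['molecular', 259, '2012-10-02 00:00:00'],
        ['stem', 201, '2012-10-02 00:00:00']]
df = pd.DataFrame(data, columns=['q_string', 'q_visits', 'q_date'])

# Index by (q_string, q_date), keep only the q_visits Series,
# then move the q_date level into the columns.
# fill_value=0 is an illustrative choice: it treats a missing
# (q_string, q_date) combination as zero visits.
wide = (
    df.set_index(['q_string', 'q_date'])['q_visits']
      .unstack('q_date', fill_value=0)
)
print(wide)
```

Leaving out fill_value reproduces the NaN cells shown above, which is preferable when "no data" should stay distinguishable from "zero visits".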
