
How do I reshape or pivot a DataFrame in Pandas?

I would like to reshape a DataFrame in Pandas but I'm not sure how to go about it. Here's what I'm starting with:

Phase Weight Value  CF
AA   heavy    0.28  1.0
AB   light    3.26  1.0
BX   med      0.77  1.0
XY   x light -0.01  1.0
AA   heavy    0.49  1.5
AB   light    5.10  1.5
BX   med      2.16  1.5
XY   x light  0.98  1.5
AA   heavy    2.48  2.0
AB   light   11.70  2.0
BX   med      5.81  2.0
XY   x light  3.46  2.0

I would like to reshape to this:

Phase       Weight  1.0     1.5     2.0
AA          heavy   0.28    0.49    2.48
AB          light   3.26    5.10    11.70
BX          med     0.77    2.16    5.81
XY        x light  -0.01    0.98    3.46

So the column names are now the values that were in CF, and each cell in the new table holds the matching entry from the Value column of the original table.

I know I can do it with the Phase column as an index like so:

df.pivot(index='Phase', columns='CF', values='Value')

But then I lose the Weight column. I tried this, but I get an error:

df.pivot(index='Phase', columns=['Weight','CF'], values='Value')

Is there a way to do this with a single statement? If not, what is the best way?

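You can use pd.pivot_table (or the equivalent df.pivot_table method), which accepts lists of names for its index and columns parameters. I also think you want Weight on the index (which makes it a column in the output after reset_index) rather than on columns (which would turn its distinct values into columns). Note that pivot_table aggregates duplicate index/column combinations, using the mean by default; here each combination occurs only once, so the values come through unchanged.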

In [27]: df.pivot_table(index=['Phase','Weight'], columns='CF', values='Value').reset_index()
Out[27]: 
CF Phase   Weight   1.0   1.5    2.0
0     AA    heavy  0.28  0.49   2.48
1     AB    light  3.26  5.10  11.70
2     BX      med  0.77  2.16   5.81
3     XY  x light -0.01  0.98   3.46
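For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, and the names wide and alt are just illustrative. The second statement shows a set_index/unstack variant, which is not from the answer above; it is included only as an assumption-free alternative that does no aggregation.

```python
import pandas as pd

# Rebuild the long-format table from the question by hand.
df = pd.DataFrame({
    "Phase": ["AA", "AB", "BX", "XY"] * 3,
    "Weight": ["heavy", "light", "med", "x light"] * 3,
    "Value": [0.28, 3.26, 0.77, -0.01,
              0.49, 5.10, 2.16, 0.98,
              2.48, 11.70, 5.81, 3.46],
    "CF": [1.0] * 4 + [1.5] * 4 + [2.0] * 4,
})

# Phase and Weight stay as row labels; each distinct CF becomes a column.
wide = df.pivot_table(index=["Phase", "Weight"],
                      columns="CF",
                      values="Value").reset_index()
print(wide)

# Alternative without aggregation: move all three keys into the index,
# then unstack the CF level into columns.
alt = df.set_index(["Phase", "Weight", "CF"])["Value"].unstack("CF").reset_index()
print(alt)
```

If the same Phase/Weight/CF combination could appear more than once in your data, pivot_table will silently average the duplicates, whereas the unstack variant raises an error, so pick whichever behavior you actually want.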

Edit:

On your other question, the .columns of a DataFrame is an Index (just like the row labels), and it has a .name in addition to the actual values. As far as I'm aware, it's generally only used for display purposes.

In [74]: df.columns
Out[74]: Index([u'Phase', u'Weight', 1.0, 1.5, 2.0], dtype='object')

In [75]: df.columns.name
Out[75]: 'CF'

In [76]: df.columns.values
Out[76]: array(['Phase', 'Weight', 1.0, 1.5, 2.0], dtype=object)
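If the leftover 'CF' label in the header is unwanted, one option is to set the name to None. This is only a small sketch on a cut-down version of the data; the variables name_before and name_after exist purely to show the effect.

```python
import pandas as pd

# Cut-down version of the question's data (two phases, two CF values).
df = pd.DataFrame({
    "Phase": ["AA", "AB", "AA", "AB"],
    "Weight": ["heavy", "light", "heavy", "light"],
    "Value": [0.28, 3.26, 0.49, 5.10],
    "CF": [1.0, 1.0, 1.5, 1.5],
})

wide = df.pivot_table(index=["Phase", "Weight"],
                      columns="CF",
                      values="Value").reset_index()

name_before = wide.columns.name   # the columns Index still carries the name 'CF'
wide.columns.name = None          # drop the label; the column values are untouched
name_after = wide.columns.name

print(name_before, name_after)
print(wide)
```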

