I am trying to create a view of a multi-indexed dataframe. I am wondering why the column name remains even after the column is removed.
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8],
                   'x': [2, 2, 2, 2, 12, 12, 12, 12],
                   'y': [5.91, 4.43, 5.22, 1.31, 6.32, 6.78, 4.65, 1.98],
                   'z': [18.61, 17.60, 18.27, 16.18, 16.81, 16.37, 67.07, 46.00]})
# Pivot so that 'y' and 'z' form level 0 of the columns and 'x' forms level 1
pivot_df = df.pivot_table(index=['id'], columns=['x'], values=['y', 'z'])
[output]
>>> pivot_df
       y            z
x     2     12     2      12
id
1   5.91   NaN  18.61    NaN
2   4.43   NaN  17.60    NaN
3   5.22   NaN  18.27    NaN
4   1.31   NaN  16.18    NaN
5    NaN  6.32    NaN  16.81
6    NaN  6.78    NaN  16.37
7    NaN  4.65    NaN  67.07
8    NaN  1.98    NaN  46.00
>>> pivot_df.columns
MultiIndex(levels=[['y', 'z'], [2, 12]],
           labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
           names=[None, 'x'])
In the above code, I can see ['y', 'z'] at level 0 which is expected. Now I try to get rid of columns under 'z'.
new_pivot_df = pivot_df.drop('z', axis=1, level=0)
[output]
>>> new_pivot_df
       y
x     2     12
id
1   5.91   NaN
2   4.43   NaN
3   5.22   NaN
4   1.31   NaN
5    NaN  6.32
6    NaN  6.78
7    NaN  4.65
8    NaN  1.98
>>> new_pivot_df.columns
MultiIndex(levels=[['y', 'z'], [2, 12]],
           labels=[[0, 0], [0, 1]],
           names=[None, 'x'])
In the above code, new_pivot_df shows that 'z' was dropped. However, when I check new_pivot_df.columns I still see 'z' in the column names. I would like to understand why that is the case, and I am looking for an elegant suggestion to remove a column (data AND name) from a multi-indexed dataframe.
Thank you in advance.
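Dropping columns only removes the matching entries from the MultiIndex labels; the levels themselves are left untouched, which is why 'z' still appears in new_pivot_df.columns. To prune levels that are no longer used, call remove_unused_levels(), available since pandas 0.20: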
new_pivot_df.columns = new_pivot_df.columns.remove_unused_levels()
new_pivot_df.columns
Output:
MultiIndex(levels=[['y'], [2, 12]],
           labels=[[0, 0], [0, 1]],
           names=[None, 'x'])