Pandas DataFrame MultiIndex: move one level of the MultiIndex to the other axis while keeping the other level on the original axis
I have a Pandas DataFrame with a MultiIndex on the rows, like this:
This DataFrame is the result of a groupby operation followed by slicing from a 3-level MultiIndex. I would like the 'date' level to remain in the row index, but move the 'SlabType' level into the column index, with unavailable values shown as NaN.
This is what I would like to get to:
What operations do I need to do to achieve this? Also, if the title of the question can be improved, please suggest how.
Select the column SlabLT and use unstack:
print (df['SlabLT'].unstack())
But if the MultiIndex may contain duplicates, you need to aggregate the column first, e.g. by mean:
print (df.groupby(level=[0,1])['SlabLT'].mean().unstack())
Sample:
import pandas as pd

df = pd.DataFrame({'date':['2017-10-01','2017-10-08','2017-10-08','2017-10-15', '2017-10-15'],
                   'SlabType':['UOM2','AMOUNT','UOM2','AMOUNT','AMOUNT'],
                   'SlabLT':[1,6000,1,6000,5000]}).set_index(['date','SlabType'])
print (df)
SlabLT
date SlabType
2017-10-01 UOM2 1
2017-10-08 AMOUNT 6000
UOM2 1
2017-10-15 AMOUNT 6000 <-duplicated MultiIndex '2017-10-15', 'AMOUNT'
AMOUNT 5000 <-duplicated MultiIndex '2017-10-15', 'AMOUNT'
print (df['SlabLT'].unstack())
ValueError: Index contains duplicate entries, cannot reshape
print (df.groupby(level=[0,1])['SlabLT'].mean())
date SlabType
2017-10-01 UOM2 1
2017-10-08 AMOUNT 6000
UOM2 1
2017-10-15 AMOUNT 5500
Name: SlabLT, dtype: int64
print (df.groupby(level=[0,1])['SlabLT'].mean().unstack())
SlabType AMOUNT UOM2
date
2017-10-01 NaN 1.0
2017-10-08 6000.0 1.0
2017-10-15 5500.0 NaN
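As a minimal sketch of the same idea, the duplicate check can be made explicit before reshaping, so the aggregation only runs when it is needed. The variable names has_duplicates and wide are illustrative, and mean is only one possible choice of aggregate (sum, first or max may suit other data better).

```python
import pandas as pd

# Same sample data as above; (2017-10-15, AMOUNT) appears twice.
df = pd.DataFrame({
    'date': ['2017-10-01', '2017-10-08', '2017-10-08', '2017-10-15', '2017-10-15'],
    'SlabType': ['UOM2', 'AMOUNT', 'UOM2', 'AMOUNT', 'AMOUNT'],
    'SlabLT': [1, 6000, 1, 6000, 5000],
}).set_index(['date', 'SlabType'])

# True when at least one (date, SlabType) pair occurs more than once.
has_duplicates = df.index.duplicated().any()

if has_duplicates:
    # Collapse duplicates first, then move SlabType to the columns.
    wide = (df.groupby(level=['date', 'SlabType'])['SlabLT']
              .mean()
              .unstack('SlabType'))
else:
    # Index is unique, so unstack can be applied directly.
    wide = df['SlabLT'].unstack('SlabType')

print(wide)
```

Referring to the levels by name ('date', 'SlabType') instead of by position keeps the code readable if the level order ever changes.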
Since some entries will be NaN and the index may contain duplicates, you may want to consider a pivot table: pivot_table aggregates duplicate entries (by mean, by default), which avoids the "duplicate entries" ValueError raised when unstacking one of the index levels.
Suppose you have a DataFrame df with a column 'SlabLT' and index levels date and SlabType; try:
df.reset_index().pivot_table(values='SlabLT', index='date', columns='SlabType')
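To illustrate the pivot-table approach end to end, here is a small self-contained sketch that reuses the sample data from the first answer. It assumes 'SlabType' is the level that should become the columns, and it passes aggfunc='mean' explicitly even though that is the default, to make the aggregation visible.

```python
import pandas as pd

# Sample data with a duplicated (date, SlabType) pair: (2017-10-15, AMOUNT).
df = pd.DataFrame({
    'date': ['2017-10-01', '2017-10-08', '2017-10-08', '2017-10-15', '2017-10-15'],
    'SlabType': ['UOM2', 'AMOUNT', 'UOM2', 'AMOUNT', 'AMOUNT'],
    'SlabLT': [1, 6000, 1, 6000, 5000],
}).set_index(['date', 'SlabType'])

# reset_index turns both index levels back into ordinary columns,
# which is the shape pivot_table expects.
wide = df.reset_index().pivot_table(
    values='SlabLT',
    index='date',
    columns='SlabType',
    aggfunc='mean',  # duplicates are averaged; missing pairs become NaN
)

print(wide)
```

The duplicated pair is averaged and the combinations that never occur are left as NaN, so no separate groupby step is needed.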