Python: concatenate pandas MultiIndex
I need to build a pd.DataFrame whose columns combine a plain list and a MultiIndex object, and I need to do this before filling the final DataFrame with data.
Say the plain columns are ['one', 'two'] and the MultiIndex is obtained from from_product:
import pandas as pd
col_21 = ['day', 'month']
col_22 = ['a', 'b']
mult_2 = pd.MultiIndex.from_product([ col_21, col_22 ])
I would like to get a list of columns that looks like this:
'one' | 'two' | ('day','a') | ('day','b') | ('month','a') | ('month','b')
One possible solution would be to use two separate MultiIndex objects, one of them with a dummy second level, both generated by from_product:
col_11 = ['one', 'two']
col_12 = ['']
col_21 = ['day', 'month']
col_22 = ['a', 'b']
mult_1 = pd.MultiIndex.from_product([ col_11, col_12 ])
mult_2 = pd.MultiIndex.from_product([ col_21, col_22 ])
How can I combine them to get this?
('one', '') | ('two', '') | ('day','a') | ('day','b') | ('month','a') | ('month','b')
I have tried several simple approaches, but each one raised an error or gave a wrong result:
mult_1+mult_2 #TypeError: cannot perform __add__ with this index type: MultiIndex
pd.merge #TypeError: Can only merge Series or DataFrame objects, a <class 'list'> was passed
pd.MultiIndex.from_arrays([ mult_1, mult_2 ]) #NotImplementedError: isna is not defined for MultiIndex
Thank you for your advice.
If possible, the best approach is to put one and two into a MultiIndex on the index and the pairs into a MultiIndex on the columns, so that non-MultiIndex values are not mixed with MultiIndex values:
col_21 = ['day', 'month']
col_22 = ['a', 'b']
mult_2 = pd.MultiIndex.from_product([ col_21, col_22 ])
one = range(5)
two = list('ABCDE')
mult_3 = pd.MultiIndex.from_arrays([ one, two], names=['one','two'])
df = pd.DataFrame(0, columns=mult_2, index=mult_3)
print (df)
        day    month
          a  b     a  b
one two
0   A     0  0     0  0
1   B     0  0     0  0
2   C     0  0     0  0
3   D     0  0     0  0
4   E     0  0     0  0
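If the data already sits in a flat DataFrame where one and two are ordinary columns, the same layout can probably be reached by moving them into the index first. This is only a minimal sketch: the frame `flat`, its value column names v1 to v4, and all numbers are made up for illustration.

```python
import pandas as pd

# Hypothetical flat data: 'one' and 'two' are ordinary columns,
# v1..v4 are placeholder value columns with made-up numbers.
flat = pd.DataFrame({
    'one': range(3),
    'two': list('ABC'),
    'v1': [1, 2, 3],
    'v2': [4, 5, 6],
    'v3': [7, 8, 9],
    'v4': [10, 11, 12],
})

# Move 'one' and 'two' into the row index, leaving only value columns.
df = flat.set_index(['one', 'two'])

# Replace the flat value labels with the two-level column MultiIndex.
df.columns = pd.MultiIndex.from_product([['day', 'month'], ['a', 'b']])

# Look up one cell by its (one, two) row key and its column pair.
value = df.loc[(1, 'B'), ('month', 'a')]
print(value)
```

Keeping the keys in the index means every column label is a genuine two-level pair, so no dummy level is needed.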
If you do need the mixed columns, use Index.append:
print (mult_1.append(mult_2))
MultiIndex([(  'one',  ''),
            (  'two',  ''),
            (  'day', 'a'),
            (  'day', 'b'),
            ('month', 'a'),
            ('month', 'b')],
           )
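Since the goal in the question is to have the columns ready before any data arrives, the appended index can be passed straight to the DataFrame constructor. A minimal sketch, reusing the mult_1 and mult_2 definitions from the question:

```python
import pandas as pd

# The two MultiIndex objects from the question; mult_1 uses '' as a
# dummy second level so both have the same number of levels.
mult_1 = pd.MultiIndex.from_product([['one', 'two'], ['']])
mult_2 = pd.MultiIndex.from_product([['day', 'month'], ['a', 'b']])

# Concatenate them; append keeps the entries in their original order.
columns = mult_1.append(mult_2)

# Build an empty frame that already has the final column layout.
df = pd.DataFrame(columns=columns)
print(df.columns.tolist())
```

Rows can then be added later, as long as each one supplies a value for all six column pairs.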
Or use Index.union with sort=False:
print (mult_1.union(mult_2, sort=False))
MultiIndex([(  'one',  ''),
            (  'two',  ''),
            (  'day', 'a'),
            (  'day', 'b'),
            ('month', 'a'),
            ('month', 'b')],
           )
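The two methods give the same result here only because mult_1 and mult_2 share no entries. append is a plain concatenation, while union is a set operation that drops repeated entries. A small sketch with two hypothetical overlapping indexes (`left` and `right` are illustrative names, not from the question):

```python
import pandas as pd

# Hypothetical indexes that overlap on the entry ('day', 'a').
left = pd.MultiIndex.from_product([['day'], ['a', 'b']])
right = pd.MultiIndex.from_product([['day', 'month'], ['a']])

# append concatenates, so ('day', 'a') appears twice.
appended = left.append(right)

# union treats the indexes as sets, so ('day', 'a') appears once.
united = left.union(right, sort=False)

print(len(appended), len(united))
```

For DataFrame columns, duplicated labels are usually unwanted, so union may be the safer choice when the inputs can overlap.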