How to combine single and multiindex Pandas DataFrames
I am trying to concatenate multiple Pandas DataFrames, some of which use multi-indexed columns while others use a single-level column index. As an example, consider the following single-indexed dataframe:
> import pandas as pd
> df1 = pd.DataFrame({'single': [10,11,12]})
> df1
   single
0      10
1      11
2      12
Along with a multiindex dataframe:
> level_dict = {}
> level_dict[('level 1','a','h')] = [1,2,3]
> level_dict[('level 1','b','j')] = [5,6,7]
> level_dict[('level 2','c','k')] = [10, 11, 12]
> level_dict[('level 2','d','l')] = [20, 21, 22]
> df2 = pd.DataFrame(level_dict)
> df2
  level 1    level 2
        a  b       c   d
        h  j       k   l
0       1  5      10  20
1       2  6      11  21
2       3  7      12  22
Now I wish to concatenate the two dataframes. When I try to use concat, it flattens the multiindex as follows:
> df3 = pd.concat([df2,df1], axis=1)
> df3
   (level 1, a, h)  (level 1, b, j)  (level 2, c, k)  (level 2, d, l)  single
0                1                5               10               20      10
1                2                6               11               21      11
2                3                7               12               22      12
If I instead append a single column directly to the multiindex dataframe df2, the multiindex is preserved:
> df2['single'] = [10,11,12]
> df2
  level 1    level 2     single
        a  b       c   d
        h  j       k   l
0       1  5      10  20     10
1       2  6      11  21     11
2       3  7      12  22     12
How can I generate this same dataframe from df1 and df2 with concat, merge, or join?
I don't think you can avoid converting the single index into a MultiIndex. Converting before concatenating is probably the easiest way; you could also convert after joining.
In [48]: df1.columns = pd.MultiIndex.from_tuples([(c, '', '') for c in df1])
In [49]: pd.concat([df2, df1], axis=1)
Out[49]:
  level 1    level 2     single
        a  b       c   d
        h  j       k   l
0       1  5      10  20     10
1       2  6      11  21     11
2       3  7      12  22     12
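For the "convert after joining" variant mentioned above, here is a minimal sketch. It assumes the flattened result looks like the df3 in the question, with tuples for the multiindex columns and plain labels for the rest; the names depth and padded are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'single': [10, 11, 12]})
df2 = pd.DataFrame({
    ('level 1', 'a', 'h'): [1, 2, 3],
    ('level 1', 'b', 'j'): [5, 6, 7],
    ('level 2', 'c', 'k'): [10, 11, 12],
    ('level 2', 'd', 'l'): [20, 21, 22],
})

# Concatenate first; as in the question, the columns come back flattened.
df3 = pd.concat([df2, df1], axis=1)

# Pad every plain (non-tuple) label with empty strings up to df2's depth,
# then rebuild the columns as a MultiIndex.
depth = df2.columns.nlevels
padded = [c if isinstance(c, tuple) else (c,) + ('',) * (depth - 1)
          for c in df3.columns]
df3.columns = pd.MultiIndex.from_tuples(padded)
```

Taking the depth from df2.columns.nlevels rather than hard-coding two empty strings should let the same lines work for a multiindex with a different number of levels.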
If you're just appending one column, you could access df1 essentially as a series:
df2[df1.columns[0]] = df1.iloc[:, 0]
df2
  level 1    level 2     single
        a  b       c   d
        h  j       k   l
0       1  5      10  20     10
1       2  6      11  21     11
2       3  7      12  22     12
If you could have just made a series in the first place, it would be a little easier to read. These commands would do the same thing:
ser1 = df1.iloc[:, 0] # make df1's column into a series
df2[ser1.name] = ser1
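If there are several single-level columns to append, the same assignment can be repeated in a loop. This is only a sketch: the frame extra and its second column 'other' are made up for illustration, and it assumes both frames share the same row index.

```python
import pandas as pd

df2 = pd.DataFrame({
    ('level 1', 'a', 'h'): [1, 2, 3],
    ('level 1', 'b', 'j'): [5, 6, 7],
    ('level 2', 'c', 'k'): [10, 11, 12],
    ('level 2', 'd', 'l'): [20, 21, 22],
})

# Hypothetical single-level frame with more than one column.
extra = pd.DataFrame({'single': [10, 11, 12], 'other': [7, 8, 9]})

combined = df2.copy()  # work on a copy so df2 itself is left unchanged
for name in extra.columns:
    # Each assignment adds one column; pandas fills the lower levels with ''.
    combined[name] = extra[name]
```

For a large number of columns, padding the column labels and calling concat once is likely the tidier option.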