Pandas - Concatenating two MultiIndex dataframes

Question

I have a dataframe as follows:

df.head()
                Student Name            Q1  Q2  Q3
Month   Roll No             
2016-08-01  0   Save Mithil Vinay       0.0 0.0 0.0
            1   Abraham Ancy Chandy     6.0 5.0 5.0
            2   Barabde Pranjal Sanjiv  7.0 5.0 5.0
            3   Bari Siddhesh Kishor    8.0 5.0 3.0
            4   Barretto Cleon Domnic   1.0 5.0 4.0

I wanted to make a hierarchical column index, so I did it the following way:

big_df = pd.concat([df['Student Name'], df[['Q1', 'Q2', 'Q3']]], axis=1, keys=['Name', 'IS'])

and was able to get the following:

>>> big_df
                Name                    IS
                Student Name            Q1  Q2  Q3
Month   Roll No             
2016-08-01  0   Save Mithil Vinay       0.0 0.0 0.0
            1   Abraham Ancy Chandy     6.0 5.0 5.0
            2   Barabde Pranjal Sanjiv  7.0 5.0 5.0
            3   Bari Siddhesh Kishor    8.0 5.0 3.0
            4   Barretto Cleon Domnic   1.0 5.0 4.0

Now, for the second iteration, I want to concatenate only the Q1, Q2, Q3 values from the new dataframe to big_df (the previously concatenated dataframe). The dataframe for the second iteration is as follows:

                Student Name            Q1  Q2  Q3
Month   Roll No             
2016-08-01  0   Save Mithil Vinay       0.0 0.0 0.0
            1   Abraham Ancy Chandy     8.0 5.0 5.0
            2   Barabde Pranjal Sanjiv  7.0 5.0 4.0
            3   Bari Siddhesh Kishor    8.0 4.0 3.0
            4   Barretto Cleon Domnic   2.0 3.0 4.0

I want big_df to look like the following:

                Name                    IS          CC
                Student Name            Q1  Q2  Q3  Q1  Q2  Q3
Month   Roll No                             
2016-08-01  0   Save Mithil Vinay       0.0 0.0 0.0 0.0 0.0 0.0
            1   Abraham Ancy Chandy     6.0 5.0 5.0 8.0 5.0 5.0
            2   Barabde Pranjal Sanjiv  7.0 5.0 5.0 7.0 5.0 4.0
            3   Bari Siddhesh Kishor    8.0 5.0 3.0 8.0 4.0 3.0
            4   Barretto Cleon Domnic   1.0 5.0 4.0 2.0 3.0 4.0

I tried the following code, but both attempts give an error:

big_df.concat([df[['Q1', 'Q2', 'Q3']]], axis=1, keys=['CC'])

pd.concat([big_df, df[['Q1', 'Q2', 'Q3']]], axis=1, keys=['Name', 'CC'])

Where am I going wrong? Kindly help. I am new to pandas.

Answer 1

Drop the topmost level of big_df's columns:

big_df.columns = big_df.columns.droplevel(level=0)

Then concatenate, passing three separate frames as input so that their number matches the number of keys to be used:

Q_cols = ['Q1', 'Q2', 'Q3']
key_names = ['Name', 'IS', 'CC']
pd.concat([big_df[['Student Name']], big_df[Q_cols], df[Q_cols]], axis=1, keys=key_names)
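To make the steps above easier to try, here is a minimal self-contained sketch. It rebuilds the two dataframes from the question by hand; the names `make_df`, `df_is`, `df_cc`, `flat` and `result` are made up for this illustration and are not part of the original code.

```python
import pandas as pd

# Sample data copied from the question.
names = ["Save Mithil Vinay", "Abraham Ancy Chandy", "Barabde Pranjal Sanjiv",
         "Bari Siddhesh Kishor", "Barretto Cleon Domnic"]
index = pd.MultiIndex.from_product(
    [[pd.Timestamp("2016-08-01")], range(5)], names=["Month", "Roll No"])

def make_df(q1, q2, q3):
    # Hypothetical helper: one frame per iteration, same index and names.
    return pd.DataFrame(
        {"Student Name": names, "Q1": q1, "Q2": q2, "Q3": q3}, index=index)

df_is = make_df([0.0, 6.0, 7.0, 8.0, 1.0], [0.0, 5.0, 5.0, 5.0, 5.0],
                [0.0, 5.0, 5.0, 3.0, 4.0])
df_cc = make_df([0.0, 8.0, 7.0, 8.0, 2.0], [0.0, 5.0, 5.0, 4.0, 3.0],
                [0.0, 5.0, 4.0, 3.0, 4.0])

Q_cols = ["Q1", "Q2", "Q3"]
key_names = ["Name", "IS", "CC"]

# First iteration: build the two-level columns as in the question.
big_df = pd.concat([df_is[["Student Name"]], df_is[Q_cols]],
                   axis=1, keys=["Name", "IS"])

# Drop the top level so the columns are flat again.
flat = big_df.copy()
flat.columns = flat.columns.droplevel(0)

# One input frame per key: Name, IS, CC.
result = pd.concat([flat[["Student Name"]], flat[Q_cols], df_cc[Q_cols]],
                   axis=1, keys=key_names)
print(result)
```

One thing to watch: this works because big_df holds a single Q1/Q2/Q3 block at this point. Once a second block such as 'CC' has been added, dropping the top level would leave duplicate Q1/Q2/Q3 labels, so the same trick would need adjusting for a third iteration.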

Answer 2

First, you're way better off setting your index to be ['Month', 'Roll No', 'Student Name']. That will simplify your concat syntax a lot and ensure you match on the name of the students too.

df.set_index('Student Name', append=True, inplace=True)
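As a quick check of what that call does, here is a small sketch on a two-row sample. The sample frame is rebuilt by hand from the question's data, and it uses the non-inplace form of set_index, which should be equivalent here.

```python
import pandas as pd

# Two-row sample with the same index layout as in the question.
index = pd.MultiIndex.from_product(
    [[pd.Timestamp("2016-08-01")], [0, 1]], names=["Month", "Roll No"])
df = pd.DataFrame(
    {"Student Name": ["Save Mithil Vinay", "Abraham Ancy Chandy"],
     "Q1": [0.0, 6.0], "Q2": [0.0, 5.0], "Q3": [0.0, 5.0]},
    index=index)

# append=True keeps Month and Roll No and adds Student Name as a third level.
df = df.set_index("Student Name", append=True)

print(df.index.names)    # the three index level names
print(list(df.columns))  # only the Q columns remain as data
```

With the name in the index, every remaining column is a Q value, so the whole frame can be passed to pd.concat without selecting columns first.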

Second, I suggest you do it differently: during your iteration, store your df dataframes (with the Q1/Q2/Q3 values) under the name you want for the highest column level (e.g. 'IS', 'CC'). A dict would be perfect for this, and pandas does accept a dict as an argument to pd.concat.

# Create a dictionary holding the first df from your question
df_dict = {'IS': df}

# On each later iteration, store the new df under its top-level key
df_dict['CC'] = df

Now, after looping through, here's your dict:

In [10]: df_dict

Out[10]:
{'CC':                                             Q1   Q2   Q3
 Month      Roll No Student Name                         
 2016-08-01 0       Save Mithil Vinay       0.0  0.0  0.0
            1       Abraham Ancy Chandy     6.0  5.0  5.0
            2       Barabde Pranjal Sanjiv  7.0  5.0  5.0
            3       Bari Siddhesh Kisho     8.0  5.0  3.0
            4       Barretto Cleon Domnic   1.0  5.0  4.0,
 'IS':                                             Q1   Q2   Q3
 Month      Roll No Student Name                         
 2016-08-01 0       Save Mithil Vinay       0.0  0.0  0.0
            1       Abraham Ancy Chandy     8.0  5.0  5.0
            2       Barabde Pranjal Sanjiv  7.0  5.0  4.0
            3       Bari Siddhesh Kisho     8.0  4.0  3.0
            4       Barretto Cleon Domnic   2.0  3.0  4.0}

So now if you concat, pandas does it nicely and automatically for you:

In [11]: big_df = pd.concat(df_dict, axis=1)
         big_df

Out[11]:
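The output is not reproduced here, so here is a self-contained sketch of the dict approach on a two-row sample. The frames are typed in by hand from the question's data, with 'Student Name' already moved into the index.

```python
import pandas as pd

# Index with Student Name as the third level, as suggested above.
index = pd.MultiIndex.from_tuples(
    [(pd.Timestamp("2016-08-01"), 0, "Save Mithil Vinay"),
     (pd.Timestamp("2016-08-01"), 1, "Abraham Ancy Chandy")],
    names=["Month", "Roll No", "Student Name"])
cols = ["Q1", "Q2", "Q3"]

# One entry per iteration; the dict key becomes the top column level.
df_dict = {
    "IS": pd.DataFrame([[0.0, 0.0, 0.0], [6.0, 5.0, 5.0]],
                       index=index, columns=cols),
    "CC": pd.DataFrame([[0.0, 0.0, 0.0], [8.0, 5.0, 5.0]],
                       index=index, columns=cols),
}

# Passing the dict directly: keys are taken from the dict.
big_df = pd.concat(df_dict, axis=1)
print(big_df)
```

Rows are aligned on the full index, so a student only lines up across 'IS' and 'CC' when Month, Roll No and Student Name all match.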

If you really want to do it iteratively, you should prepend the new top level ('CC') to the new df's columns before concatenating it with big_df:

# Prepend the new top level 'CC' to the columns of the new df
df.columns = pd.MultiIndex.from_tuples([('CC', x) for x in df.columns])

# Then you can concat; this gives the same result as the dict-based concat above.
pd.concat([big_df, df], axis=1)
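A runnable sketch of this iterative variant, again on a two-row sample typed in from the question. The names `df_is` and `df_cc` are invented here to keep the two iterations apart; in the original loop both would simply be `df`.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [(pd.Timestamp("2016-08-01"), 0, "Save Mithil Vinay"),
     (pd.Timestamp("2016-08-01"), 1, "Abraham Ancy Chandy")],
    names=["Month", "Roll No", "Student Name"])
cols = ["Q1", "Q2", "Q3"]

df_is = pd.DataFrame([[0.0, 0.0, 0.0], [6.0, 5.0, 5.0]],
                     index=index, columns=cols)
df_cc = pd.DataFrame([[0.0, 0.0, 0.0], [8.0, 5.0, 5.0]],
                     index=index, columns=cols)

# First iteration: start big_df with 'IS' as the top column level.
big_df = pd.concat({"IS": df_is}, axis=1)

# Second iteration: give the new frame its own top level, then append it.
df_cc.columns = pd.MultiIndex.from_tuples([("CC", c) for c in df_cc.columns])
big_df = pd.concat([big_df, df_cc], axis=1)
print(big_df)
```

Both inputs have two-level columns at the point of the final concat, so no keys argument is needed there.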


Answer 1
1 2016-11-07 12:47:24

Answer 2
1 Accepted 2016-11-07 13:36:07
