
Assigning a 2D array of values to a Pandas MultiIndex DataFrame

I have outputs from calculations that are best stored in a Pandas MultiIndex format. For concreteness, consider the form below (though the actual structure is dictated programmatically):

                        X         Y         Z       
DATE                                                      
2018-01-01 A           NaN       NaN       NaN      
           B           NaN       NaN       NaN      
           C           NaN       NaN       NaN      
2018-01-02 A           NaN       NaN       NaN       
           B           NaN       NaN       NaN      
           C           NaN       NaN       NaN       

I want to assign the numpy array outputs to a particular time slice. Say I have

output = np.array([[1,2,3],[2,2,1],[4,2,3]])

so the desired output is

                        X         Y         Z       
DATE                                                      
2018-01-01 A           NaN       NaN       NaN      
           B           NaN       NaN       NaN      
           C           NaN       NaN       NaN      
2018-01-02 A             1         2         3       
           B             2         2         1   
           C             4         2         3   

I have tried pandas.IndexSlice, where j is the j-th time slice:

df.loc[pd.IndexSlice[j,:], :] = output

but that doesn't work. I have also tried replacing loc with iloc, but to no avail. In non-MultiIndex DataFrames, I can assign a list to a particular column without having to assign each element individually. Is there a way to do the same with a matrix in a MultiIndex DataFrame?

Your code works just fine.

Demo:

In [70]: df.loc[pd.IndexSlice['2018-01-02', :], :] = output

In [71]: df
Out[71]:
                 X    Y    Z
DATE       I2
2018-01-01 A   NaN  NaN  NaN
           B   NaN  NaN  NaN
           C   NaN  NaN  NaN
2018-01-02 A   1.0  2.0  3.0
           B   2.0  2.0  1.0
           C   4.0  2.0  3.0

P.S. I tested both cases, with the DATE index level holding strings and with it holding datetime dtype. The code above works properly in both.
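For anyone who wants to reproduce this, here is a minimal self-contained sketch. The way the frame is built is an assumption (the question says the real structure is generated programmatically), and the level name I2 is simply taken from the demo output. It also assumes that "j-th time slice" means an integer position; since loc selects by label, the sketch looks up the label for that position first.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the frame from the question (not the asker's actual code).
dates = pd.to_datetime(["2018-01-01", "2018-01-02"])
index = pd.MultiIndex.from_product([dates, ["A", "B", "C"]], names=["DATE", "I2"])
df = pd.DataFrame(np.nan, index=index, columns=["X", "Y", "Z"])

output = np.array([[1, 2, 3], [2, 2, 1], [4, 2, 3]])

# Assumption: j is an integer position, so translate it to the DATE label first.
j = 1
label = df.index.get_level_values("DATE").unique()[j]

# Select every row under that date and assign the whole 3x3 block at once.
df.loc[pd.IndexSlice[label, :], :] = output

print(df)
```

If j is already a label such as a Timestamp, the lookup step is unnecessary and j can be passed to pd.IndexSlice directly, as in the demo.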
