Merge pandas dataframes on a dictionary

Question

I have two dataframes that are related via a hierarchical dictionary.

In[0]: import pandas as pd

d = {'levelA_1':['sublevel_1', 'sublevel_2'],
     'levelA_2':['sublevel_3', 'sublevel_4'],
     'levelA_3':['sublevel_5', 'sublevel_6']}

datA = pd.DataFrame({'A': {'levelA_1': 4, 'levelA_2': 2, 'levelA_3': 2},
                     'B': {'levelA_1': 1, 'levelA_2': 3, 'levelA_3': 5},
                     'C': {'levelA_1': 2, 'levelA_2': 4, 'levelA_3': 6}})

datB = pd.DataFrame({'A': {'sublevel_1': 4, 'sublevel_2': 1, 'sublevel_3': 3, 'sublevel_4': 4},
                     'B': {'sublevel_1': 1, 'sublevel_2': 3, 'sublevel_3': 4, 'sublevel_4': 8},
                     'C': {'sublevel_1': 2, 'sublevel_2': 6, 'sublevel_3': 13, 'sublevel_4': 6}})

In[1]: datA
Out[1]:     
            A   B   C
levelA_1    4   1   2
levelA_2    2   3   4
levelA_3    2   5   6

In[2]: datB
Out[2]:
            A   B   C
sublevel_1  4   1   2
sublevel_2  1   3   6
sublevel_3  3   4   13
sublevel_4  4   8   6

In[3]: x = 3

The first dataframe (datA) provides values for the keys of d, and the other (datB) provides values for the values of d.

Furthermore, I have a base value x. I want to multiply datA by x, and then multiply each element of datB by the corresponding value for its parent key (looked up via the dict).

So, for example, I want to get the following result for a single cell.

    x = 3
    3 * datB['B']['sublevel_3'] * datA['B']['levelA_2'] 

    # res = 3*4*3 = 36

Desired output for the dataframe:

            A   B   C
sublevel_1  48  3   12
sublevel_2  12  9   36
sublevel_3  18  36  156
sublevel_4  24  72  72

Is there a better way than looping through each cell?

Answer 1

IIUC (if I understand correctly):

datA['New']=datA.reset_index()['index'].map(d).values 
# map the dict to build the connection between datA and datB
New_datA=datA.set_index(list('ABC'),append=True).New.apply(pd.Series).stack().reset_index(list('ABC'))
# give datA and datB the same index so that we can do dataframe arithmetic
New_datA=New_datA.set_index(0)

datB*New_datA*3
# you can add dropna at the end to remove the NaN values
Out[95]: 
               A     B      C
sublevel_1  48.0   3.0   12.0
sublevel_2  12.0   9.0   36.0
sublevel_3  18.0  36.0  156.0
sublevel_4  24.0  72.0   72.0
sublevel_5   NaN   NaN    NaN
sublevel_6   NaN   NaN    NaN
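
A possibly simpler variant of the same idea, shown only as a sketch: invert the dictionary so each sublevel points to its parent key, pull the matching datA row for every row of datB, and multiply. The names child_to_parent, parents, factors and result are my own and not from the question, and this assumes every row of datB appears somewhere in d.

```python
import pandas as pd

d = {'levelA_1': ['sublevel_1', 'sublevel_2'],
     'levelA_2': ['sublevel_3', 'sublevel_4'],
     'levelA_3': ['sublevel_5', 'sublevel_6']}

datA = pd.DataFrame({'A': {'levelA_1': 4, 'levelA_2': 2, 'levelA_3': 2},
                     'B': {'levelA_1': 1, 'levelA_2': 3, 'levelA_3': 5},
                     'C': {'levelA_1': 2, 'levelA_2': 4, 'levelA_3': 6}})

datB = pd.DataFrame({'A': {'sublevel_1': 4, 'sublevel_2': 1, 'sublevel_3': 3, 'sublevel_4': 4},
                     'B': {'sublevel_1': 1, 'sublevel_2': 3, 'sublevel_3': 4, 'sublevel_4': 8},
                     'C': {'sublevel_1': 2, 'sublevel_2': 6, 'sublevel_3': 13, 'sublevel_4': 6}})

x = 3

# Invert the dict: each sublevel name maps to its parent key (hypothetical helper name)
child_to_parent = {child: parent for parent, children in d.items() for child in children}

# Parent label for every row of datB, in datB's row order
parents = datB.index.map(child_to_parent)

# One datA row per datB row, with columns in datB's order, relabelled with datB's index
factors = datA.loc[parents, datB.columns]
factors.index = datB.index

# Plain element-wise arithmetic, no explicit loop over cells
result = x * datB * factors
print(result)
```

Because factors is built from datB's own index, the result only has the rows that exist in datB, so there are no NaN rows for sublevel_5 and sublevel_6 to drop afterwards.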
