Importing cross-tab data from Excel into a Pandas DataFrame

Question

I have data in Excel, extracted from an IBM Cube in the form of a cross tab.

Crosstab example:
Account  Entity  Functions  JAN      FEB      MAR      JAN       Feb       Mar
                            Actuals  Actuals  Actuals  Forecast  Forecast  Forecast
A2100    10021   ABS        $200     $300     $270     $230      $270      $250
A2200    20023   GBS        $320     $285     $360     $350      $300      $400

How can I read the cross tab into a pandas DataFrame and convert it to a columnar format? Eventually, I want to create functions that show differences, such as actuals minus forecast, for a selected Month and Functions value.

Disclaimer: I am new to Python, so any directions will be helpful. I am trying to understand whether there is any way to achieve this. I only know the simple Excel and CSV reads, which require the data to be in columnar form.

df = pd.read_excel("<path to your file>.xlsx")

The final output should look like the one suggested by Stef; in addition, there should be a column showing the variance (Forecast - Actual).

Answer 1

Assuming that you have already read your data from Excel into a dataframe df, you can use melt and merge to unpivot your data like this:

import pandas as pd

data = {'Account': [None, 'A2100', 'A2200'],
 'Entity': [None, '10021', '20023'],
 'Functions': [None, 'ABS', 'GBS'],
 'JAN': ['Actuals', '$200', '$320'],
 'FEB': ['Actuals', '$300', '$285'],
 'MAR': ['Actuals', '$270', '$360'],
 'JAN.1': ['Forecast', '$230', '$350'],
 'Feb': ['Forecast', '$270', '$300'],
 'Mar': ['Forecast', '$250', '$400']}
df = pd.DataFrame(data)

ma = df.iloc[0].ne('Forecast') # mask for actuals
dfa = df.loc[1:,ma.index[ma]]

mf = df.iloc[0].ne('Actuals') # mask for forecasts
dff = df.loc[1:,mf.index[mf]]
dff.columns = dfa.columns

res = pd.melt(dfa, ['Account','Entity','Functions'], 
              var_name='Month', 
              value_name='Actuals').merge(
                  pd.melt(dff, ['Account','Entity','Functions'], 
                          var_name='Month', 
                          value_name='Forecast'))

Result:

  Account Entity Functions Month Actuals Forecast
0   A2100  10021       ABS   JAN    $200     $230
1   A2200  20023       GBS   JAN    $320     $350
2   A2100  10021       ABS   FEB    $300     $270
3   A2200  20023       GBS   FEB    $285     $300
4   A2100  10021       ABS   MAR    $270     $250
5   A2200  20023       GBS   MAR    $360     $400
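
The question also asks for a variance column (Forecast - Actual) and a way to look at the difference for a chosen month and function. The following is a minimal sketch of one way to do that, not part of the original answer. It assumes the amounts are strings such as '$200' (optionally with thousands separators), rebuilds the unpivoted frame from the result shown above so that it runs on its own, and uses a hypothetical helper named variance_for.

```python
import pandas as pd

# Unpivoted frame with the same values as the result above.
res = pd.DataFrame({
    'Account':   ['A2100', 'A2200', 'A2100', 'A2200', 'A2100', 'A2200'],
    'Entity':    ['10021', '20023', '10021', '20023', '10021', '20023'],
    'Functions': ['ABS', 'GBS', 'ABS', 'GBS', 'ABS', 'GBS'],
    'Month':     ['JAN', 'JAN', 'FEB', 'FEB', 'MAR', 'MAR'],
    'Actuals':   ['$200', '$320', '$300', '$285', '$270', '$360'],
    'Forecast':  ['$230', '$350', '$270', '$300', '$250', '$400'],
})

# Assumption: amounts are text like '$200' or '$1,200'.
# Strip the currency symbol and separators, then convert to numbers.
for col in ['Actuals', 'Forecast']:
    res[col] = pd.to_numeric(res[col].str.replace(r'[$,]', '', regex=True))

# Variance as defined in the question: Forecast minus Actual.
res['Variance'] = res['Forecast'] - res['Actuals']


def variance_for(frame, month, function):
    """Hypothetical helper: rows for one month and one Functions value."""
    mask = (frame['Month'].str.upper() == month.upper()) & (frame['Functions'] == function)
    return frame.loc[mask]


print(res)
print(variance_for(res, 'FEB', 'ABS'))
```

If the sign should be actuals minus forecast instead, as mentioned earlier in the question, swap the two operands in the Variance line.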
