Importing crosstab data from Excel into a pandas DataFrame
I have data in Excel, extracted from an IBM Cube, in the form of a crosstab.
Crosstab example:
Account  Entity  Functions  JAN      FEB      MAR      JAN       Feb       Mar
                            Actuals  Actuals  Actuals  Forecast  Forecast  Forecast
A2100    10021   ABS        $200     $300     $270     $230      $270      $250
A2200    20023   GBS        $320     $285     $360     $350      $300      $400
How can I read a crosstab into a pandas DataFrame and convert it into columnar format? Eventually, I want to create functions that show the differences (for example, Actuals minus Forecast) for a selected Month and Functions value.
Disclaimer: I am new to Python, so any directions will be helpful. I am trying to understand whether there is any way to achieve this. I only know the simple Excel and CSV reads, which require the data to be in columnar form.
df = pd.read_excel("<path to your file>.xlsx")
The final output should look like the one suggested by Stef; in addition, there should be a column showing the variance (Forecast - Actual).
Assuming that you have already read your data from Excel into a DataFrame df, you can use melt and merge to unpivot your data like this:
import pandas as pd
data = {'Account': [None, 'A2100', 'A2200'],
'Entity': [None, '10021', '20023'],
'Functions': [None, 'ABS', 'GBS'],
'JAN': ['Actuals', '$200', '$320'],
'FEB': ['Actuals', '$300', '$285'],
'MAR': ['Actuals', '$270', '$360'],
'JAN.1': ['Forecast', '$230', '$350'],
'Feb': ['Forecast', '$270', '$300'],
'Mar': ['Forecast', '$250', '$400']}
df = pd.DataFrame(data)
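# Row 0 holds the Actuals/Forecast labels; the id columns hold None there,
# so they pass both masks.
ma = df.iloc[0].ne('Forecast')  # mask for actuals (plus id columns)
dfa = df.loc[1:, ma.index[ma]]
mf = df.iloc[0].ne('Actuals')  # mask for forecasts (plus id columns)
dff = df.loc[1:, mf.index[mf]]
# Give the forecast frame the same month names so the two melts merge cleanly
dff.columns = dfa.columns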
res = pd.melt(dfa, ['Account','Entity','Functions'],
var_name='Month',
value_name='Actuals').merge(
pd.melt(dff, ['Account','Entity','Functions'],
var_name='Month',
value_name='Forecast'))
Result:
Account Entity Functions Month Actuals Forecast
0 A2100 10021 ABS JAN $200 $230
1 A2200 20023 GBS JAN $320 $350
2 A2100 10021 ABS FEB $300 $270
3 A2200 20023 GBS FEB $285 $300
4 A2100 10021 ABS MAR $270 $250
5 A2200 20023 GBS MAR $360 $400
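The question also asks for a variance column and a way to select by Month and Functions, which the result above does not yet include. A minimal sketch of one way to do that follows. It assumes the values are strings such as '$200' (as in the sample data) and rebuilds res by hand so the snippet runs on its own. The helper name variance_for and the column name Variance are made up for illustration; they do not come from the original post.

```python
import pandas as pd

# Hand-built copy of the unpivoted result shown above, so this runs standalone.
res = pd.DataFrame({
    'Account':   ['A2100', 'A2200', 'A2100', 'A2200', 'A2100', 'A2200'],
    'Entity':    ['10021', '20023', '10021', '20023', '10021', '20023'],
    'Functions': ['ABS', 'GBS', 'ABS', 'GBS', 'ABS', 'GBS'],
    'Month':     ['JAN', 'JAN', 'FEB', 'FEB', 'MAR', 'MAR'],
    'Actuals':   ['$200', '$320', '$300', '$285', '$270', '$360'],
    'Forecast':  ['$230', '$350', '$270', '$300', '$250', '$400'],
})

# Assumption: amounts are strings with a leading '$' and maybe thousands commas.
# Strip those characters and convert to numbers before doing arithmetic.
for col in ['Actuals', 'Forecast']:
    res[col] = pd.to_numeric(res[col].str.replace(r'[$,]', '', regex=True))

# Variance as defined in the question: Forecast minus Actual.
res['Variance'] = res['Forecast'] - res['Actuals']


def variance_for(frame, month, function):
    """Hypothetical helper: rows for one month and one Functions value.

    The month comparison ignores case, since the sheet mixes 'FEB' and 'Feb'.
    """
    mask = (frame['Month'].str.upper() == month.upper()) & (frame['Functions'] == function)
    return frame.loc[mask]


selected = variance_for(res, 'Feb', 'ABS')
print(selected)
```

If you would rather have Actuals minus Forecast, as mentioned earlier in the question, swap the two operands in the Variance line.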