I have 3 tables, one on each Excel sheet: sheet1 - Gross, sheet2 - Margin, sheet3 - Revenue.
So I was able to iterate through each sheet and unpivot it.
But how can I join them together?
sheet_names = ['Gross','Margin','Revenue']
full_table = pd.DataFrame()
for sheet in sheet_names:
    df = pd.read_excel('BudgetData.xlsx', sheet_name = sheet, index=False)
    unpvt = pd.melt(df,id_vars=['Company'], var_name ='Month', value_name = sheet)
    # how can I join unpivoted dataframes here?
    print(unpvt)
Desired result:
UPDATE:
Thanks @Celius Stingher. I think this is what I need. It just gives me weird sorting:
and gives me this warning:
Sorting because non-concatenation axis is not aligned. A future version
of pandas will change to not sort by default.
To accept the future behavior, pass 'sort=False'.
To retain the current behavior and silence the warning, pass 'sort=True'.
from ipykernel import kernelapp as app
It seems you are unpivoting each sheet but not saving the unpivoted dataframe anywhere. Let's create a list that stores each unpivoted dataframe. After the loop, we pass that list to the pd.concat function to perform the concatenation.
import pandas as pd

sheet_names = ['Gross','Margin','Revenue']
list_of_df = []
for sheet in sheet_names:
    # read one sheet and unpivot its month columns into rows
    df = pd.read_excel('BudgetData.xlsx', sheet_name=sheet)
    df = pd.melt(df, id_vars=['Company'], var_name='Month', value_name=sheet)
    list_of_df.append(df)
# after the loop: stack the unpivoted dataframes on top of each other
full_df = pd.concat(list_of_df, ignore_index=True)
full_df = full_df.sort_values(['Company','Month'])
print(full_df)
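If the "weird sorting" is the months coming out in alphabetical order, one possible fix is to turn Month into an ordered categorical before sorting. This is only a sketch: the month labels below ('Jan', 'Feb', 'Mar') and the sample values are assumptions, since the actual column headers of the sheets are not shown in the post.

```python
import pandas as pd

# Hypothetical unpivoted data; in the real case this would be full_df.
df = pd.DataFrame({
    "Company": ["A", "A", "A"],
    "Month": ["Feb", "Jan", "Mar"],
    "Gross": [11, 10, 12],
})

# Assumed month labels, listed in calendar order.
month_order = ["Jan", "Feb", "Mar"]

# An ordered categorical sorts by the given category order, not alphabetically.
df["Month"] = pd.Categorical(df["Month"], categories=month_order, ordered=True)
df = df.sort_values(["Company", "Month"]).reset_index(drop=True)
print(df)
```

Any label that is missing from month_order becomes NaN, so the list has to match the sheet headers exactly.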
Now that I understand what you need, let's try a different approach. After the loop, try the following code instead of the pd.concat:
full_df = list_of_df[0].merge(list_of_df[1],on=['Company','Month']).merge(list_of_df[2],on=['Company','Month'])
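The chained merge can also be written so that it does not depend on there being exactly three sheets. The sketch below uses functools.reduce for that; the small in-memory frames are made-up stand-ins for the Excel sheets (the real ones would come from pd.read_excel), so the company names and numbers are assumptions.

```python
from functools import reduce

import pandas as pd

# Hypothetical stand-ins for the three sheets of BudgetData.xlsx.
sheets = {
    "Gross": pd.DataFrame({"Company": ["A", "B"], "Jan": [10, 20], "Feb": [11, 21]}),
    "Margin": pd.DataFrame({"Company": ["A", "B"], "Jan": [1, 2], "Feb": [3, 4]}),
    "Revenue": pd.DataFrame({"Company": ["A", "B"], "Jan": [100, 200], "Feb": [110, 210]}),
}

# Unpivot every sheet; the value column is named after the sheet.
list_of_df = [
    pd.melt(df, id_vars=["Company"], var_name="Month", value_name=name)
    for name, df in sheets.items()
]

# Merge the frames pairwise on the shared key columns.
full_df = reduce(
    lambda left, right: left.merge(right, on=["Company", "Month"]),
    list_of_df,
)
print(full_df)
```

The default merge is an inner join, so a Company/Month pair missing from any sheet is dropped; passing how='outer' to merge would keep it with NaN in the missing columns.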
pd.concat will just pile everything together; you want to actually merge the DataFrames using pd.merge, which works similarly to a SQL JOIN statement (based on the 'desired' image in your post).
https://pandas.pydata.org/pandas-docs/version/0.19.1/generated/pandas.DataFrame.merge.html
You just want to use a list of columns to merge on. If you get them all into tidy data frames with the same names as your sheets, you would want to do something like:
gross.merge(margin, on=['Company', 'Month']).merge(revenue, on=['Company', 'Month'])
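To make the difference concrete, here is a small sketch that runs both operations on two made-up tidy frames (the values are assumptions, not the data from the post). Passing sort=False to pd.concat is also what the warning quoted in the question asks for.

```python
import pandas as pd

# Hypothetical tidy frames, one value column each.
gross = pd.DataFrame({"Company": ["A", "B"], "Month": ["Jan", "Jan"], "Gross": [10, 20]})
margin = pd.DataFrame({"Company": ["A", "B"], "Month": ["Jan", "Jan"], "Margin": [1, 2]})

# concat stacks the rows; columns missing from a frame are filled with NaN.
stacked = pd.concat([gross, margin], ignore_index=True, sort=False)

# merge lines the rows up on the key columns, like a SQL join.
joined = gross.merge(margin, on=["Company", "Month"])

print(stacked)
print(joined)
```

Here stacked has four rows with NaN gaps, while joined has one row per Company/Month pair with both value columns filled.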