Pandas converting repeated columns as rows
I have a dataframe like this with repeating column names (ID is loaded as the index):
    JANUARY         FEBRUARY        MARCH
ID  Sales  Revenue  Sales  Revenue  Sales  Revenue
03  10.00     5.00   0.00     0.00  10.00    19.00
05  20.00    20.00  20.00    20.00  20.00    20.00
06  30.00    30.00  30.00    30.00  30.00    30.00
07  30.00    30.00  30.00    30.00  30.00    30.00
I want to show it as below:
ID Sales Revenue
03 10.00 5.00
05 20.00 20.00
06 30.00 30.00
07 30.00 30.00
03 0.00 0.00
05 20.00 20.00
06 30.00 30.00
07 30.00 30.00
03 10.00 19.00
05 20.00 20.00
06 30.00 30.00
07 30.00 30.00
I have tried melt, but that only handles one value column. Currently I'm using the loop below, but I'm hoping for a better way:
cols = df.columns.to_list()
for i in range(1, len(cols), 2):  # load each month's data into the data frame
    sub_cols = cols[i:i + 2]
    sub_cols.insert(0, cols[0])
    sub_df = df.filter(sub_cols, axis=1)
    sub_df.columns = ['ID', 'Sales', 'Revenue']  # same order as the tables above
    if i == 1:
        final_df = sub_df
    else:
        # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
        final_df = pd.concat([final_df, sub_df])
Here's another way to stack the columns. I'm not sure it's more efficient, but it takes less code.
# JANUARY FEBRUARY MARCH
# ID Sales Revenue Sales Revenue Sales Revenue
# 03 10.00 5.00 0.00 0.00 10.00 19.00
# 05 20.00 20.00 20.00 20.00 20.00 20.00
# 06 30.00 30.00 30.00 30.00 30.00 30.00
# 07 30.00 30.00 30.00 30.00 30.00 30.00
import pandas as pd
dd = {
    'ID': ['03', '05', '06', '07'],
    'Sales1': [10, 20, 30, 30],
    'Rev1': [5, 20, 30, 30],
    'Sales2': [0, 20, 30, 30],
    'Rev2': [0, 20, 30, 30],
    'Sales3': [10, 20, 30, 30],
    'Rev3': [19, 20, 30, 30]
}
df = pd.DataFrame(dd)
print(df.to_string(index=False), '\n')  # source dataframe
####################
frames = []  # one dataframe per month
for c in range(1, len(df.columns), 2):
    dftmp = df[['ID', df.columns[c], df.columns[c+1]]]  # create df for each month
    dftmp.columns = ['ID', 'Sales', 'Revenue']  # must rename columns so the frames line up
    frames.append(dftmp)
# DataFrame.append was removed in pandas 2.0, so stack the pieces with pd.concat
dfnew = pd.concat(frames)  # new dataframe with all data
print(dfnew.to_string(index=False))
Output
ID Sales1 Rev1 Sales2 Rev2 Sales3 Rev3
03 10 5 0 0 10 19
05 20 20 20 20 20 20
06 30 30 30 30 30 30
07 30 30 30 30 30 30
ID Sales Revenue
03 10 5
05 20 20
06 30 30
07 30 30
03 0 0
05 20 20
06 30 30
07 30 30
03 10 19
05 20 20
06 30 30
07 30 30
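For column names that follow a stub-plus-number pattern like the Sales1/Rev1 frame above, pandas.wide_to_long can do the stacking without an explicit loop. The following is only a sketch under that naming assumption; the name long_df and the helper column month are chosen here for illustration and are not part of the original answer.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({
    'ID': ['03', '05', '06', '07'],
    'Sales1': [10, 20, 30, 30], 'Rev1': [5, 20, 30, 30],
    'Sales2': [0, 20, 30, 30],  'Rev2': [0, 20, 30, 30],
    'Sales3': [10, 20, 30, 30], 'Rev3': [19, 20, 30, 30],
})

# 'Sales' and 'Rev' are the stubs; the numeric suffix ends up in the
# (hypothetical) 'month' column. ID must uniquely identify each row.
long_df = (
    pd.wide_to_long(df, stubnames=['Sales', 'Rev'], i='ID', j='month')
      .reset_index()
      .sort_values(['month', 'ID'])        # month by month, as in the desired output
      .rename(columns={'Rev': 'Revenue'})
      .reset_index(drop=True)
)

print(long_df[['ID', 'Sales', 'Revenue']].to_string(index=False))
```

Keeping the month column around makes it possible to tell the stacked rows apart later; drop it if only the three columns from the question are needed.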
Pandas lreshape did the trick for me.
# The patterns match Sales, Sales.1, Sales.2, ... and Revenue, Revenue.1, Revenue.2, ...
df = pd.lreshape(file_df,
                 {'Sales': file_df.columns[file_df.columns.str.match(r'^Sales\.?\d?')],
                  'Revenue': file_df.columns[file_df.columns.str.match(r'^Revenue\.?\d?')]})
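To try the lreshape approach without the original file, here is a small self-contained sketch. It assumes the repeated headers were renamed on load to Sales, Sales.1, Sales.2 and so on, which is what the regular expressions above suggest; the sample frame, the column lists and the name stacked are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the loaded file, with ID as a regular column
# and the duplicate headers already made unique with numeric suffixes.
file_df = pd.DataFrame({
    'ID': ['03', '05', '06', '07'],
    'Sales': [10, 20, 30, 30],   'Revenue': [5, 20, 30, 30],
    'Sales.1': [0, 20, 30, 30],  'Revenue.1': [0, 20, 30, 30],
    'Sales.2': [10, 20, 30, 30], 'Revenue.2': [19, 20, 30, 30],
})

# Group the wide columns by the name they should end up under.
sales_cols = [c for c in file_df.columns if c.startswith('Sales')]
revenue_cols = [c for c in file_df.columns if c.startswith('Revenue')]

# Each group is stacked into one column; the remaining column (ID) is repeated.
stacked = pd.lreshape(file_df, {'Sales': sales_cols, 'Revenue': revenue_cols})

print(stacked.to_string(index=False))
```

Two things to keep in mind: lreshape drops rows containing NaN by default (dropna=True), and it builds a fresh index, so an ID stored in the index would need reset_index() first.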