Python Pandas: Convert columns to rows
I have the following DataFrame:
ORD_DATE  Weekday  ALL Sales  ALL Pred     Other Sales  Other Pred
1/1/2020  2        48386.27   20815269.83  15386.27     5643158.509
I will have approximately 1000 different Sales/Pred combinations. I want the result to look like this:
ORD_DATE  Weekday  Sales     Pred         Filter
1/1/2020  2        48386.27  20815269.83  All
1/1/2020  2        15386.27  5643158.509  Other
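I tried transposing, but did not get the result I wanted.
To do this, first move all the columns that stay unchanged into the index, then split the remaining column names and build a MultiIndex from their components, and finally stack one of the column levels.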
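import pandas as pd

# Note: the column names here use "_" as the separator,
# whereas the question's data separates the parts with a space.
df = pd.DataFrame.from_records(
    [
        {
            "ORD_DATE": "1/1/2020",
            "Weekday": 2,
            "ALL_Sales": 48386.27,
            "ALL_Pred": 20815269.83,
            "Other_Sales": 15386.27,
            "Other_Pred": 5643158.509,
        }
    ]
)
# Move the identifier columns into the index so they are not reshaped
df = df.set_index(["ORD_DATE", "Weekday"])
# "ALL_Sales" -> ["ALL", "Sales"]
split_cols = [c.split("_") for c in df.columns]
df.columns = pd.MultiIndex.from_tuples(split_cols, names=["Filter", None])
# Move the "Filter" column level into the rows
df = df.stack("Filter").reset_index()
print(df)
   ORD_DATE  Weekday Filter          Pred     Sales
0  1/1/2020        2    ALL  2.081527e+07  48386.27
1  1/1/2020        2  Other  5.643159e+06  15386.27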
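The same index, MultiIndex and stack steps can be sketched against the question's original column names, which are separated by a space rather than an underscore. This is only an illustration: the names `wide` and `long_df` are made up here, and the final column selection is an assumption about the layout the question asks for (Sales, Pred, then Filter).

```python
import pandas as pd

df = pd.DataFrame({
    "ORD_DATE": ["1/1/2020"],
    "Weekday": [2],
    "ALL Sales": [48386.27],
    "ALL Pred": [20815269.83],
    "Other Sales": [15386.27],
    "Other Pred": [5643158.509],
})

# Move the identifier columns into the index so they are not reshaped
wide = df.set_index(["ORD_DATE", "Weekday"])

# Split each name once on the first space: "ALL Sales" -> ("ALL", "Sales")
wide.columns = pd.MultiIndex.from_tuples(
    [tuple(c.split(" ", 1)) for c in wide.columns],
    names=["Filter", None],
)

# Stack the "Filter" level into rows, then pick the columns explicitly
# so the result does not depend on the order stack() returns them in
long_df = wide.stack("Filter").reset_index()
long_df = long_df[["ORD_DATE", "Weekday", "Sales", "Pred", "Filter"]]
print(long_df)
```

Selecting the columns by name at the end keeps the layout stable even if the column order after `stack` differs between pandas versions.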
I would use pandas.concat and column slicing to do this:
import pandas as pd

df = pd.DataFrame({
    'ORD_DATE': ['1/1/2020'],
    'Weekday': [2],
    'ALL Sales': [48386.27],
    'ALL Pred': [20815269.83],
    'Other Sales': [15386.27],
    'Other Pred': [5643158.509],
})
df2 = pd.concat([
    df[['ORD_DATE', 'Weekday', 'ALL Sales', 'ALL Pred']].rename(columns={
        'ALL Sales': 'Sales',
        'ALL Pred': 'Pred',
    }),
    df[['ORD_DATE', 'Weekday', 'Other Sales', 'Other Pred']].rename(columns={
        'Other Sales': 'Sales',
        'Other Pred': 'Pred',
    }),
])
print(df2)
Output:
   ORD_DATE  Weekday     Sales          Pred
0  1/1/2020        2  48386.27  2.081527e+07
0  1/1/2020        2  15386.27  5.643159e+06
Edit: added the Filter column:
import pandas as pd

df = pd.DataFrame({
    'ORD_DATE': ['1/1/2020'],
    'Weekday': [2],
    'ALL Sales': [48386.27],
    'ALL Pred': [20815269.83],
    'Other Sales': [15386.27],
    'Other Pred': [5643158.509],
})
df_all = df[['ORD_DATE', 'Weekday', 'ALL Sales', 'ALL Pred']].rename(columns={
    'ALL Sales': 'Sales',
    'ALL Pred': 'Pred',
})
df_all['Filter'] = 'All'
df_other = df[['ORD_DATE', 'Weekday', 'Other Sales', 'Other Pred']].rename(columns={
    'Other Sales': 'Sales',
    'Other Pred': 'Pred',
})
df_other['Filter'] = 'Other'
df2 = pd.concat([df_all, df_other])
print(df2)
Output:
   ORD_DATE  Weekday     Sales          Pred Filter
0  1/1/2020        2  48386.27  2.081527e+07    All
0  1/1/2020        2  15386.27  5.643159e+06  Other
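Since the question mentions roughly 1000 Sales/Pred pairs, writing one slice per pair by hand would not scale. A minimal sketch of the same concat idea driven by a loop is shown below. It assumes every non-identifier column is named "<prefix> Sales" or "<prefix> Pred"; the names `id_cols`, `prefixes` and `parts` are introduced here for illustration only, and the Filter value is taken verbatim from the prefix (so "ALL" rather than "All").

```python
import pandas as pd

df = pd.DataFrame({
    "ORD_DATE": ["1/1/2020"],
    "Weekday": [2],
    "ALL Sales": [48386.27],
    "ALL Pred": [20815269.83],
    "Other Sales": [15386.27],
    "Other Pred": [5643158.509],
})

id_cols = ["ORD_DATE", "Weekday"]

# Collect the distinct prefixes in order of appearance:
# "ALL Sales" -> "ALL", "Other Pred" -> "Other"
prefixes = []
for col in df.columns:
    if col not in id_cols:
        prefix = col.rsplit(" ", 1)[0]
        if prefix not in prefixes:
            prefixes.append(prefix)

# Build one slice per prefix, renamed to the common Sales/Pred layout
parts = []
for prefix in prefixes:
    part = (
        df[id_cols + [f"{prefix} Sales", f"{prefix} Pred"]]
        .rename(columns={f"{prefix} Sales": "Sales", f"{prefix} Pred": "Pred"})
        .assign(Filter=prefix)
    )
    parts.append(part)

# ignore_index=True gives a fresh 0..n-1 index instead of repeated zeros
df2 = pd.concat(parts, ignore_index=True)
print(df2)
```

Passing `ignore_index=True` also avoids the duplicated index label 0 that appears in the output above.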
Swap the position of ALL and Other within the column names, then use pandas' wide_to_long function to reshape the DataFrame.
# Reverse the word order in the column names ("ALL Sales" -> "Sales ALL")
# so that the names start with the stubs pd.wide_to_long expects
df.columns = [' '.join(reversed(ent.split()))
              if ('ALL' in ent) or ('Other' in ent) else ent
              for ent in df.columns]
df
   ORD_DATE  Weekday  Sales ALL     Pred ALL  Sales Other   Pred Other
0  1/1/2020        2   48386.27  20815269.83     15386.27  5643158.509
# Reshape the data using pd.wide_to_long
res = (pd.wide_to_long(df,
                       stubnames=['Sales', 'Pred'],
                       i=['ORD_DATE', 'Weekday'],
                       j='Filter',
                       sep=' ',
                       suffix='[a-zA-Z]+'
                       )
       .reset_index()
       )
res
   ORD_DATE  Weekday Filter     Sales          Pred
0  1/1/2020        2    ALL  48386.27  2.081527e+07
1  1/1/2020        2  Other  15386.27  5.643159e+06
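The renaming step above checks for 'ALL' and 'Other' explicitly, which would need extending for every additional pair. A hedged sketch of a more general variant follows: it reverses every column that is not an identifier, and uses the broader suffix pattern `.+` instead of `[a-zA-Z]+`. Both choices are assumptions made for this illustration (as is the name `id_cols`), and they rely on each column name ending in the measure name, Sales or Pred.

```python
import pandas as pd

df = pd.DataFrame({
    "ORD_DATE": ["1/1/2020"],
    "Weekday": [2],
    "ALL Sales": [48386.27],
    "ALL Pred": [20815269.83],
    "Other Sales": [15386.27],
    "Other Pred": [5643158.509],
})

id_cols = ["ORD_DATE", "Weekday"]

# Move the last word (the measure) to the front: "ALL Sales" -> "Sales ALL"
df.columns = [
    c if c in id_cols else " ".join(reversed(c.rsplit(" ", 1)))
    for c in df.columns
]

# Reshape: everything after "Sales " / "Pred " becomes the Filter value
res = pd.wide_to_long(
    df,
    stubnames=["Sales", "Pred"],
    i=id_cols,
    j="Filter",
    sep=" ",
    suffix=r".+",
).reset_index()
print(res)
```

Note that `pd.wide_to_long` requires the `i` columns to identify each row uniquely, so this only applies when no (ORD_DATE, Weekday) pair is repeated in the wide data.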
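Another route, inspired by @BallPointPen's solution (this starts from the original column names such as 'ALL Sales', not the reversed ones):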
# Set the first two columns as the index
df = df.set_index(['ORD_DATE', 'Weekday'])
# Split the column names on whitespace:
# str.partition on an Index returns a three-level MultiIndex.
# Drop level 1, since it only holds the whitespace separator,
# and assign the result back to the columns
df.columns = df.columns.str.partition(' ').droplevel(1)
# Stack the first column level into the rows; it has no name,
# so reset_index labels it 'level_2'
res = df.stack(0).reset_index().rename(columns={'level_2': 'Filter'})
res
   ORD_DATE  Weekday Filter          Pred     Sales
0  1/1/2020        2    ALL  2.081527e+07  48386.27
1  1/1/2020        2  Other  5.643159e+06  15386.27