Transpose multiple rows to columns with pandas
I have this Excel table, read into a Jupyter notebook with pandas. I want to melt the upper rows of the table into columns. The table looks as follows:
ori code                          cgk      cgk   clg      clg
ori city                          jakarta  NaN   cilegon  NaN
ori prop                          jakarta  NaN   banten   NaN
ori area                          jawa     NaN   jawa     NaN
code  city      district  island  type a   days  type b   days
001   jakarta   jakarta   jawa    12000    2     13000    3
002   surabaya  surabaya  jawa    13000    3     14000    4
I realized that df.melt should work for transposing the upper rows, but the type & days columns, together with the 4 upper rows and the NaN values in them, leave me confused about how to do that correctly.
The clean dataframe I need is as follows:
code  city      district  island  type    price_type  days  ori_code  ori_city  ori_prop  ori_area
001   jakarta   jakarta   jawa    type a  12000       2     cgk       jakarta   jakarta   jawa
001   jakarta   jakarta   jawa    type b  13000       3     clg       cilegon   banten    jawa
002   surabaya  surabaya  jawa    type a  13000       3     cgk       jakarta   jakarta   jawa
002   surabaya  surabaya  jawa    type b  14000       4     clg       cilegon   banten    jawa
The ori_code, ori_city, ori_prop and ori_area labels would become column names.
So far, what I have done is set a fixed index on code, city, district and island:
df = df.set_index(['code','city','district','island'])
Can anyone help me solve this problem? Any help would be much appreciated. Thank you in advance.
For this you can use the pandas melt function like this:
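import pandas as pd
# Assumption: df was read with header=None (e.g. pd.read_excel(path, header=None)),
# so rows 0-3 are the 'ori ...' rows, row 4 holds the column names and the data
# starts at row 5. Do not call set_index first: melt needs id_vars as regular columns.
id_cols = ['code', 'city', 'district', 'island']
body = df.iloc[5:].reset_index(drop=True)
# The two 'days' columns get distinct names so they can be told apart
body.columns = id_cols + ['type a', 'days a', 'type b', 'days b']
# Use pd.melt to reshape the prices and the days separately; both results list
# every 'a' row first and then every 'b' row, so they line up by position
out = pd.melt(body, id_vars=id_cols, value_vars=['type a', 'type b'], var_name='type', value_name='price_type')
out['days'] = pd.melt(body, id_vars=id_cols, value_vars=['days a', 'days b'])['value']
# Turn the 'ori' values above each price column (positions 4 and 6) into one row per type
ori = df.iloc[:4, [4, 6]].T
ori.columns = ['ori_code', 'ori_city', 'ori_prop', 'ori_area']
ori['type'] = ['type a', 'type b']
# Attach the 'ori' columns and sort the rows by code and type
out = out.merge(ori, on='type').sort_values(['code', 'type'], ignore_index=True)
# Reorder the columns
out = out[['code', 'city', 'district', 'island', 'type', 'price_type', 'days', 'ori_code', 'ori_city', 'ori_prop', 'ori_area']]
# Display the resulting DataFrame
print(out)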
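To check the reshaping against the sample data from the question without the Excel file, here is a minimal self-contained sketch. It is not the only way to do this and rests on a few assumptions: the sheet was read with header=None, the 'ori ...' labels sit in the first column, and the value columns come in (price, days) pairs after the four id columns. The names raw, N_TOP and N_ID are hypothetical helpers introduced for illustration. Instead of calling melt, it builds one block per column pair and concatenates the blocks.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Hypothetical reconstruction of the sheet as read with header=None.
# Assumption: the 'ori ...' labels are stored in the first column.
raw = pd.DataFrame([
    ["ori code", nan, nan, nan, "cgk", "cgk", "clg", "clg"],
    ["ori city", nan, nan, nan, "jakarta", nan, "cilegon", nan],
    ["ori prop", nan, nan, nan, "jakarta", nan, "banten", nan],
    ["ori area", nan, nan, nan, "jawa", nan, "jawa", nan],
    ["code", "city", "district", "island", "type a", "days", "type b", "days"],
    ["001", "jakarta", "jakarta", "jawa", 12000, 2, 13000, 3],
    ["002", "surabaya", "surabaya", "jawa", 13000, 3, 14000, 4],
])

N_TOP, N_ID = 4, 4  # number of 'ori' rows and number of id columns

id_cols = raw.iloc[N_TOP, :N_ID].tolist()
# 'ori code' -> 'ori_code', and so on
ori_names = raw.iloc[:N_TOP, 0].str.replace(" ", "_").tolist()
body = raw.iloc[N_TOP + 1:]

parts = []
# Value columns come in (price, days) pairs at positions 4-5, 6-7, ...
for price_col in range(N_ID, raw.shape[1], 2):
    part = body.iloc[:, :N_ID].copy()
    part.columns = id_cols
    part["type"] = raw.iat[N_TOP, price_col]
    part["price_type"] = body.iloc[:, price_col].to_numpy()
    part["days"] = body.iloc[:, price_col + 1].to_numpy()
    # The 'ori' values stored above the price column apply to the whole pair
    for name, value in zip(ori_names, raw.iloc[:N_TOP, price_col]):
        part[name] = value
    parts.append(part)

result = pd.concat(parts).sort_values(["code", "type"], ignore_index=True)
print(result)
```

Because the 'ori' values are taken from the price column of each pair, the NaN cells above the days columns never need to be filled. If the real sheet has more (price, days) pairs, the loop should pick them up as long as they follow the same layout.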