How to efficiently melt multiple columns using the melt function in Pandas?
The objective is to unpivot the following table:
   Activity  General  m1   t1  m2   t2  m3   t3
0        P1       AA  A1  TA1  A2  TA2  A3  TA3
1        P2       BB  B1  TB1  B2  TB2  B3  TB3
into the following format:
   Activity  General   M  Task
0        P1       AA  A1   TA1
1        P1       AA  A2   TA2
2        P1       AA  A3   TA3
3        P2       BB  B1   TB1
4        P2       BB  B2   TB2
5        P2       BB  B3   TB3
Based on some reading, the melt function can be used to achieve the desired objective.
import pandas as pd

list_me = [['P1', 'AA', 'A1', 'TA1', 'A2', 'TA2', 'A3', 'TA3'],
           ['P2', 'BB', 'B1', 'TB1', 'B2', 'TB2', 'B3', 'TB3']]
df = pd.DataFrame(list_me,
                  columns=['Activity', 'General', 'm1', 't1', 'm2', 't2', 'm3', 't3'])

# Attempt: this stacks all six m/t columns into a single value column
# ('new_col'), so the (m_i, t_i) pairs are not kept side by side.
melted_form = pd.melt(df, id_vars=['Activity', 'General'],
                      var_name='m1', value_name='new_col')
However, most of the examples found online only handle a single column. I am thinking of using a for loop over m1, m2, and m3 and merging the results as I go. This is because, in reality, there are hundreds of m_i and t_i pairs (where i is the index).
But I wonder whether there is a more efficient approach than looping.
P.S. I tried the suggestion as in the OP, but it does not give the intended output.
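For reference, the loop-based idea described above could be sketched roughly as follows. This is only a minimal illustration, assuming the pairs are named m1/t1 through m3/t3 with no gaps; the names id_cols, n_pairs, pieces, looped, and the temporary pair column are introduced here for illustration and are not part of the original attempt.

```python
import pandas as pd

df = pd.DataFrame(
    [["P1", "AA", "A1", "TA1", "A2", "TA2", "A3", "TA3"],
     ["P2", "BB", "B1", "TB1", "B2", "TB2", "B3", "TB3"]],
    columns=["Activity", "General", "m1", "t1", "m2", "t2", "m3", "t3"],
)

id_cols = ["Activity", "General"]
n_pairs = 3  # assumption: pairs are numbered 1..n_pairs with no gaps

# One small frame per (m_i, t_i) pair, renamed to common column names.
# The temporary 'pair' column remembers which pair each row came from.
pieces = [
    df[id_cols + [f"m{i}", f"t{i}"]]
    .rename(columns={f"m{i}": "M", f"t{i}": "Task"})
    .assign(pair=i)
    for i in range(1, n_pairs + 1)
]

# Stack the pieces once, then order rows by id columns and pair number.
looped = (
    pd.concat(pieces, ignore_index=True)
    .sort_values(id_cols + ["pair"])
    .drop(columns="pair")
    .reset_index(drop=True)
)
print(looped)
```

Collecting the pieces in a list and calling pd.concat once avoids merging inside the loop, but it still builds one intermediate frame per pair, which is why a non-looping approach is of interest.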
If I understand your question, you could use pd.wide_to_long:
(pd.wide_to_long(df,
                 i=["Activity", "General"],
                 stubnames=["t", "m"], j="number")
   .set_axis(["Task", "M"], axis="columns")
   .droplevel(-1).reset_index()
)
   Activity  General  Task   M
0        P1       AA   TA1  A1
1        P1       AA   TA2  A2
2        P1       AA   TA3  A3
3        P2       BB   TB1  B1
4        P2       BB   TB2  B2
5        P2       BB   TB3  B3
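If the columns should come out in the order shown in the question (M before Task), a self-contained variant might look like the sketch below. It renames the stub columns by name instead of by position and sorts the index explicitly, so it does not depend on the order in which pd.wide_to_long returns rows or columns. The name long_df is introduced here for illustration, and the sketch assumes that Activity and General together identify each row uniquely, which pd.wide_to_long requires of its id columns.

```python
import pandas as pd

df = pd.DataFrame(
    [["P1", "AA", "A1", "TA1", "A2", "TA2", "A3", "TA3"],
     ["P2", "BB", "B1", "TB1", "B2", "TB2", "B3", "TB3"]],
    columns=["Activity", "General", "m1", "t1", "m2", "t2", "m3", "t3"],
)

long_df = (
    pd.wide_to_long(df, stubnames=["m", "t"],
                    i=["Activity", "General"], j="number")
    .rename(columns={"m": "M", "t": "Task"})  # rename by name, not position
    .sort_index()               # order by Activity, General, pair number
    .reset_index()
    .drop(columns="number")     # the pair number is not needed in the result
    .loc[:, ["Activity", "General", "M", "Task"]]
)
print(long_df)
```

Because the stub names and the numeric suffix are matched by pattern, the same call should carry over to hundreds of m_i/t_i pairs without listing each column.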