Python Pandas Groupby/Append columns
This is my example dataframe:
Index Param1 Param2
A 1 2
A 3 4
B 1 3
B 4 NaN
C 2 4
What I would like to get is:
Index Param1 Param2 Param3 Param4
A 1 2 3 4
B 1 3 4
C 2 4
What would be the best way to achieve this using pandas? Thanks in advance for your help.
You can use groupby with unstack:
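def f(x):
    # Flatten the group's values into one sorted column (np.sort places NaN last)
    return pd.DataFrame(np.sort(x.values.ravel()))
# Select the value columns with a list; tuple-style ['Param1','Param2'] fails in recent pandas
df = df.groupby('Index')[['Param1', 'Param2']].apply(f).unstack()
df.columns = df.columns.droplevel(0)
print(df)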
0 1 2 3
Index
A 1 2 3 4
B 1 3 4 Nan
C 2 4 None None
f returns a DataFrame because returning a Series raised:
TypeError: Series.name must be a hashable type
Another solution with cumcount:
df = df.set_index('Index').stack().reset_index(name='vals')
df['g'] = 'Param' + df.groupby('Index').cumcount().add(1).astype(str)
df = df.pivot(index='Index', columns='g', values='vals')
print (df)
g Param1 Param2 Param3 Param4
Index
A 1.0 2.0 3.0 4.0
B 1.0 3.0 4.0 NaN
C 2.0 4.0 NaN NaN
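The stack/cumcount/pivot steps above can be run end to end as the following minimal sketch. It assumes Index is an ordinary column, as in the question, and it adds an explicit dropna() (not in the original answer) so the result does not depend on whether stack drops missing values by default in the installed pandas version. The names long and wide are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Index': ['A', 'A', 'B', 'B', 'C'],
    'Param1': [1, 3, 1, 4, 2],
    'Param2': [2, 4, 3, np.nan, 4],
})

# Long form: one row per (Index, original column) pair, read row by row.
# dropna() is an added assumption so missing values never get a slot.
long = df.set_index('Index').stack().dropna().reset_index(name='vals')

# Number the values within each Index group: Param1, Param2, ...
long['g'] = 'Param' + long.groupby('Index').cumcount().add(1).astype(str)

# Back to wide form: one row per Index, one column per position.
wide = long.pivot(index='Index', columns='g', values='vals')
print(wide)
# A holds 1, 2, 3, 4; B holds 1, 3, 4 and one NaN; C holds 2, 4 and two NaN
```

One caveat: pivot sorts the new column labels as strings, so with ten or more values per group 'Param10' would come before 'Param2' and the columns would need reordering.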
import numpy as np
import pandas as pd
df = pd.DataFrame({'Index': ['A', 'A', 'B', 'B', 'C'], 'Param1': [1, 3, 1, 4, 2],
'Param2': [2, 4, 3, np.nan, 4]}).set_index('Index')
print(df)
# Param1 Param2
# Index
# A 1 2.0
# A 3 4.0
# B 1 3.0
# B 4 NaN
# C 2 4.0
def fn(g):
    # Flatten the group's values row by row into a single Series
    return pd.Series(g.values.ravel())
res = df.groupby(df.index).apply(fn).unstack()
res.columns = ['Param1', 'Param2', 'Param3', 'Param4']
print(res)
# Param1 Param2 Param3 Param4
# Index
# A 1.0 2.0 3.0 4.0
# B 1.0 3.0 4.0 NaN
# C 2.0 4.0 NaN NaN
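The snippet above hard-codes four column names, and the unstack step relies on the groups having different sizes; if every group had the same number of rows, apply may already return a wide frame. A possible variation that sidesteps both points is sketched below. The helper name widen and its prefix parameter are hypothetical, it assumes Index is an ordinary column with numeric value columns, and it drops missing values rather than keeping them in place.

```python
import numpy as np
import pandas as pd

def widen(df, key, prefix='Param'):
    """Hypothetical helper: flatten each group's values, row by row, into one wide row."""
    value_cols = [c for c in df.columns if c != key]
    flat = {}
    for name, group in df.groupby(key):
        vals = group[value_cols].to_numpy(dtype=float).ravel()
        # Drop missing values but keep the remaining order
        flat[name] = pd.Series(vals[~np.isnan(vals)])
    # Series of unequal length are aligned on position and padded with NaN
    wide = pd.DataFrame(flat).T
    # Derive the column names from the actual width instead of hard-coding them
    wide.columns = [f'{prefix}{i + 1}' for i in range(wide.shape[1])]
    wide.index.name = key
    return wide

df = pd.DataFrame({
    'Index': ['A', 'A', 'B', 'B', 'C'],
    'Param1': [1, 3, 1, 4, 2],
    'Param2': [2, 4, 3, np.nan, 4],
})
wide = widen(df, 'Index')
print(wide)
```

Building the frame from a dict of Series leaves the padding to pandas, so the same code works whether or not the groups have equal sizes.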