Pandas pivot

Question

My CSV looks like:

"a","b","c","d"
1, "x", 1, 1
1, "y", 2, 2

and I want to convert it, based on column "b", to

"a", "x_c", "y_c", "x_d", "y_d"
1, 1, 2, 1, 2

I've tried it with pivot and unstack. Is there a shortcut for this in pandas?

EDIT: I have multiple value columns, so I need to append a suffix/prefix to the new column names.

Answer 1

Use pivot_table:

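# aggfunc='mean' behaves like np.mean here and needs no numpy import
df = df.pivot_table(index='a', columns='b', values=['c', 'd'], aggfunc='mean')
# Flatten the MultiIndex columns: ('c', 'x') becomes 'x_c'
df.columns = df.columns.map(lambda x: '{}_{}'.format(x[1], x[0]))
df = df.reset_index()
print(df)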
   a  x_c  y_c  x_d  y_d
0  1    1    2    1    2
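
For reference, here is a self-contained sketch of the same steps that starts from the CSV text in the question. The names csv_text and wide are illustrative only, and skipinitialspace=True is an assumption about how the file is read (the sample rows have a blank after each comma).

```python
import io
import pandas as pd

# Sample data copied from the question.
csv_text = '''"a","b","c","d"
1, "x", 1, 1
1, "y", 2, 2
'''

# skipinitialspace drops the blank after each comma so "x" parses as x.
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

wide = df.pivot_table(index='a', columns='b', values=['c', 'd'], aggfunc='mean')
# Columns are now (value, b) pairs such as ('c', 'x'); join them as 'x_c'.
wide.columns = [f'{b}_{value}' for value, b in wide.columns]
wide = wide.reset_index()
print(wide)
```

Swapping the two names in the f-string gives a prefix instead of a suffix (c_x rather than x_c).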

If some (a, b) pairs are duplicated, aggfunc is applied to them:

print(df)
   a  b  c  d
0  1  x  1  1 <- duplicate (a=1, b=x)
1  1  y  2  2
2  1  x  4  2 <- duplicate (a=1, b=x)
3  2  y  2  3


# aggfunc='mean' behaves like np.mean here and needs no numpy import
df = df.pivot_table(index='a', columns='b', values=['c', 'd'], aggfunc='mean')
df.columns = df.columns.map(lambda x: '{}_{}'.format(x[1], x[0]))
df = df.reset_index()
print(df)
   a  x_c  y_c  x_d  y_d
0  1  2.5  2.0  1.5  2.0 <- x_c, x_d are means of the duplicated rows
1  2  NaN  2.0  NaN  3.0
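
Since the question mentions unstack: as a hedged sketch, when every (a, b) pair is unique, set_index plus unstack gives the same layout without any aggregation, and it raises an error on duplicates instead of averaging them silently. The frames df and dup below are made-up sample data, not taken from the question.

```python
import pandas as pd

# Made-up sample data in which each (a, b) pair occurs once.
df = pd.DataFrame({'a': [1, 1, 2], 'b': ['x', 'y', 'y'],
                   'c': [1, 2, 2], 'd': [1, 2, 3]})

# No aggregation: move b from the row index into the columns.
wide = df.set_index(['a', 'b']).unstack()
wide.columns = [f'{b}_{value}' for value, b in wide.columns]
wide = wide.reset_index()
print(wide)

# With a repeated (a, b) pair, unstack refuses to reshape.
dup = pd.DataFrame({'a': [1, 1], 'b': ['x', 'x'], 'c': [1, 4], 'd': [1, 2]})
try:
    dup.set_index(['a', 'b']).unstack()
    raised = False
except ValueError:
    raised = True
print(raised)
```

If duplicates are expected, pivot_table with an explicit aggfunc, as in the answer above, is the safer choice.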

Answer 1 (5, accepted, 2017-03-07 13:59:13)
