Convert dataframe column values to new columns
I have a dataframe containing some data that I want to transform so that the values of one column define the new columns.
>>> import pandas as pd
>>> df = pd.DataFrame([['a','a','b','b'],[6,7,8,9]], index=['A','B']).T
>>> df
   A  B
0  a  6
1  a  7
2  b  8
3  b  9
The values of column A should become the column names of the new dataframe. The result of the transformation should look like this:
   a  b
0  6  8
1  7  9
What I came up with so far didn't work completely:
>>> pd.DataFrame({ k : df.loc[df['A'] == k, 'B'] for k in df['A'].unique() })
     a    b
0    6  NaN
1    7  NaN
2  NaN    8
3  NaN    9
Besides this being incorrect, I suspect there is a more efficient way anyway. I'm just having a hard time understanding how to handle things with pandas.
You were almost there, but you need .values so that each dictionary entry is a plain array rather than a Series that still carries its original row index, and then provide the column names. Note that this only works when every group has the same number of rows.
pd.DataFrame({ k : df.loc[df['A'] == k, 'B'].values for k in df['A'].unique() }, columns=df['A'].unique())
Output:
   a  b
0  6  8
1  7  9
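To see why the original attempt produced NaN values, it may help to look at the row labels each selection keeps. The following is a minimal sketch, assuming the sample frame from the question; the variable names `parts`, `misaligned`, and `aligned` are only illustrative. It also shows `reset_index(drop=True)` as one alternative to `.values`.

```python
import pandas as pd

# Assumed sample data matching the frame shown in the question.
df = pd.DataFrame({"A": ["a", "a", "b", "b"], "B": [6, 7, 8, 9]})

# Each selection keeps its original row labels: 'a' -> [0, 1], 'b' -> [2, 3].
parts = {k: df.loc[df["A"] == k, "B"] for k in df["A"].unique()}

# Building a frame from Series aligns on the union of those labels,
# which is what produces the four-row result padded with NaN.
misaligned = pd.DataFrame(parts)
print(misaligned)

# Discarding the labels lets the values line up by position instead.
aligned = pd.DataFrame({k: s.reset_index(drop=True) for k, s in parts.items()})
print(aligned)
```

Either `.values` or `reset_index(drop=True)` removes the old labels; the difference is that the latter keeps each piece as a Series with a fresh 0..n-1 index.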
Use set_index, groupby, cumcount, and unstack:
(df.set_index(['A', df.groupby('A').cumcount()])['B']
.unstack(0)
.rename_axis([None], axis=1))
Output:
   a  b
0  6  8
1  7  9
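The key step in this answer is cumcount, which numbers the rows inside each group so they can serve as the new row labels. Below is a rough sketch of the same idea written with `pivot` instead of `set_index`/`unstack`; this is a swapped-in variant, not the answer's own code, and the helper column name `row` is an assumption made for the example.

```python
import pandas as pd

# Assumed sample data matching the frame shown in the question.
df = pd.DataFrame({"A": ["a", "a", "b", "b"], "B": [6, 7, 8, 9]})

# cumcount numbers the rows within each group of A.
pos = df.groupby("A").cumcount()
print(pos.tolist())

# Use the per-group position as the row label and spread A into columns.
# 'row' is a hypothetical helper column name used only in this sketch.
wide = df.assign(row=pos).pivot(index="row", columns="A", values="B")

# Drop the axis names so the result prints like the desired output.
wide.index.name = None
wide.columns.name = None
print(wide)
```

Unlike the `.values` approach, reshaping this way should also tolerate groups of different sizes, since missing positions are filled with NaN.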
Using a dictionary comprehension with groupby:
res = pd.DataFrame({col: vals['B'].values for col, vals in df.groupby('A')})
print(res)
   a  b
0  6  8
1  7  9
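One caveat with the dictionary approach is that plain arrays must all have the same length. The sketch below uses hypothetical data (the frame `uneven` is not from the question) to show the failure and one possible workaround: wrapping each array in a `pd.Series` so shorter groups are padded with NaN.

```python
import pandas as pd

# Hypothetical data where group 'a' has three rows and group 'b' has two.
uneven = pd.DataFrame({"A": ["a", "a", "a", "b", "b"], "B": [6, 7, 5, 8, 9]})

# Plain arrays of different lengths cannot be combined into one frame.
try:
    pd.DataFrame({k: g["B"].values for k, g in uneven.groupby("A")})
    raised = False
except ValueError:
    raised = True
print(raised)

# Wrapping each array in a Series gives it a fresh 0..n-1 index,
# so the shorter group is padded with NaN instead of raising.
padded = pd.DataFrame(
    {k: pd.Series(g["B"].values) for k, g in uneven.groupby("A")}
)
print(padded)
```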