Splice and combine two columns to form a new data frame (Pandas)
I need to convert my pandas dataframe into a slightly odd list. I have the following example pandas dataframe:
Input Dataframe:
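import pandas as pd

# Wide frame: one column per label, one value per row
mydf = pd.DataFrame.from_dict({'ARS':['xx2','xx3','xx1'], 'xyz':['yy1','xx2','xx3'], 'ppp':['xx3','yy2','xx2']}, orient='columns')
# stack() moves the column labels into the index; reset_index() turns them back into columns
mydf = mydf.stack().reset_index()
mydf.columns = ['list1','list2','list3']
# Keep only the label column and the value column
newdf = mydf[['list2','list3']]
newdf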
list2 list3
0 ARS xx2
1 ppp xx3
2 xyz yy1
3 ARS xx3
4 ppp yy2
5 xyz xx2
6 ARS xx1
7 ppp xx2
8 xyz xx3
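For reference, the same label/value pairs can also be produced with `DataFrame.melt` instead of `stack().reset_index()`. This is only a sketch of an alternative, not what the question uses; the names `wide` and `long_df` are made up for illustration. Note that `melt` lists all rows of one column before moving to the next, so the row order differs from the table above, although the order of values within each label is the same.

```python
import pandas as pd

# Same example data as in the question, in wide form
wide = pd.DataFrame({
    'ARS': ['xx2', 'xx3', 'xx1'],
    'xyz': ['yy1', 'xx2', 'xx3'],
    'ppp': ['xx3', 'yy2', 'xx2'],
})

# melt() turns column labels into a 'list2' column and cell values into 'list3'
long_df = wide.melt(var_name='list2', value_name='list3')
print(long_df)
```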
Desired Dataframe:
>ARS
xx2
xx3
xx1
>ppp
xx3
yy2
xx2
>xyz
yy1
xx2
xx3
Does anyone have a simple pandas way to convert this?
Here is my attempt (it assumes numpy is imported as np):
In [173]: v = np.concatenate(
...: pd.DataFrame(
...: newdf.groupby('list2')['list3'].apply(lambda x: [x.name] + x.values.tolist()))
...: .values
...: .reshape(-1,)
...: )
In [174]: pd.DataFrame({'col':v})
Out[174]:
col
0 ARS
1 xx2
2 xx3
3 xx1
4 ppp
5 xx3
6 yy2
7 xx2
8 xyz
9 yy1
10 xx2
11 xx3
PS: I'm sure there must be a much more elegant solution.
Here's a Pandas way using groupby, pd.concat, and indexing:
(newdf.groupby('list2',as_index=False)
.apply(lambda x: pd.concat([pd.Series(x.iloc[0]['list2']),
pd.Series(x.loc[:,'list3'])]))
.reset_index(drop=True))
Output:
0 ARS
1 xx2
2 xx3
3 xx1
4 ppp
5 xx3
6 yy2
7 xx2
8 xyz
9 yy1
10 xx2
11 xx3
dtype: object
If you really want that '>' sign, use the following:
(newdf.groupby('list2',as_index=False)
.apply(lambda x: pd.concat([pd.Series('>'+x.iloc[0]['list2']),
pd.Series(x.loc[:,'list3'])]))
.reset_index(drop=True))
Output:
0 >ARS
1 xx2
2 xx3
3 xx1
4 >ppp
5 xx3
6 yy2
7 xx2
8 >xyz
9 yy1
10 xx2
11 xx3
dtype: object
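If the nested `apply`/`pd.concat` is hard to read, the same result can be built with a plain loop over the groups. This is a minimal sketch rather than part of the answer above: the helper name `flatten_groups` and its `prefix` parameter are hypothetical, and `newdf` is rebuilt by hand here so the snippet runs on its own.

```python
import pandas as pd

def flatten_groups(df, key_col, value_col, prefix=""):
    """Return a Series with each group's key (plus optional prefix) followed by its values."""
    rows = []
    # groupby sorts the group keys by default, so groups come out as ARS, ppp, xyz
    for key, group in df.groupby(key_col):
        rows.append(prefix + str(key))           # header line for the group
        rows.extend(group[value_col].tolist())   # values in their original order
    return pd.Series(rows)

# Hand-built copy of newdf from the question
newdf = pd.DataFrame({
    'list2': ['ARS', 'ppp', 'xyz', 'ARS', 'ppp', 'xyz', 'ARS', 'ppp', 'xyz'],
    'list3': ['xx2', 'xx3', 'yy1', 'xx3', 'yy2', 'xx2', 'xx1', 'xx2', 'xx3'],
})

result = flatten_groups(newdf, 'list2', 'list3', prefix='>')
print(result)
```

Calling it with `prefix=''` gives the first output shown earlier (plain group names), and `prefix='>'` gives the second.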