
pandas get index for a row value

I am working with the following DataFrame:

 like            max_interest    min_interest
 basketball       4               2
 football         2               0
 soccer           4               2
 softball         4               2
 volleyball       4               2
 swimming         2               0
 cheerleading     4               2
 baseball         4               2

I want to group it by max_interest / min_interest:

  group         max_interest                                                  min_interest
      4         basketball,soccer,softball,volleyball,cheerleading,baseball   N/A   
      2         football,swimming                                             basketball,soccer,softball,volleyball,cheerleading,baseball
      0         N/A                                                           football,swimming

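I tried to get this working with groupby(max_interest), but could not figure out how to combine the like column.

Essentially, this joins the row values of like into a single string under the max_interest header, and likewise for min_interest.

This could be done with hand-coded logic that iterates over the rows and keeps appending the likes, but I would like to know whether it can be written with the pandas / np libraries.

Any help is appreciated.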

Here is one option:

In [39]: def groupby(key):
   ....:         result = data.groupby(key).agg({'like': lambda v: ','.join(v)})
   ....:         result.index.name = 'group'
   ....:         result.columns = [key]
   ....:         return result
   ....:

In [40]: pd.concat((groupby(key) for key in ['max_interest', 'min_interest']), axis=1)
Out[40]:
                                            max_interest                                       min_interest
group
0                                                    NaN                                  football,swimming
2                                      football,swimming  basketball,soccer,softball,volleyball,cheerlea...
4      basketball,soccer,softball,volleyball,cheerlea...                                                NaN
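For readers who want to run the option above outside an IPython session, here is a minimal self-contained sketch. It assumes the question's table is loaded into a DataFrame named `data`; the helper name `join_likes` and the trailing `sort_index()` call are illustrative additions, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question (assumed to be held in `data`).
data = pd.DataFrame({
    "like": ["basketball", "football", "soccer", "softball",
             "volleyball", "swimming", "cheerleading", "baseball"],
    "max_interest": [4, 2, 4, 4, 4, 2, 4, 4],
    "min_interest": [2, 0, 2, 2, 2, 0, 2, 2],
})

def join_likes(key):
    """Join the `like` values of each group of `key` into one string."""
    # to_frame(key) names the resulting column after the grouping key.
    result = data.groupby(key)["like"].agg(",".join).to_frame(key)
    result.index.name = "group"
    return result

# Align both results side by side on the interest level.
# sort_index() is added so the groups come out ordered regardless of
# how the installed pandas version orders the combined index.
combined = pd.concat(
    [join_likes(key) for key in ["max_interest", "min_interest"]],
    axis=1,
).sort_index()

print(combined)
```

Interest levels that appear under only one of the two keys (0 and 4 here) end up as NaN in the other column, which corresponds to the N/A cells in the desired table.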

First split the DataFrame by interest level and join the matching like values:

u = ({k: ','.join(n['like'])} for k, n in df.groupby('max_interest'))              
v = ({k: ','.join(n['like'])} for k, n in df.groupby('min_interest'))

Then create a new DataFrame:

df1 = pd.DataFrame(list(u)+list(v), index=['max_interest', 'max_interest', 'min_interest', 'min_interest'])

Put the frame into the desired form using groupby().last():

# sort_index() orders the groups 0, 2, 4
adjustframe = df1.groupby(level=0).last().transpose().sort_index()

Output:

                            max_interest                          min_interest                                                                      
0                                   NaN                             foot,swim                                                                      
2                             foot,swim  basket,soccer,soft,volley,cheer,base                                                                      
4  basket,soccer,soft,volley,cheer,base                                   NaN                                                                      

Set the index name:

adjustframe.index.name = 'group'                                    
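The steps of this second approach can be put together as one runnable sketch. It assumes the question's table is held in `df`. As a small variation on the answer, the row labels are collected while looping instead of being hardcoded, so the code does not rely on each key having exactly two groups; the names `records` and `labels` are illustrative.

```python
import pandas as pd

# Sample data copied from the question (assumed to be held in `df`).
df = pd.DataFrame({
    "like": ["basketball", "football", "soccer", "softball",
             "volleyball", "swimming", "cheerleading", "baseball"],
    "max_interest": [4, 2, 4, 4, 4, 2, 4, 4],
    "min_interest": [2, 0, 2, 2, 2, 0, 2, 2],
})

records = []  # one {interest_level: joined_likes} dict per group
labels = []   # which key ("max_interest" / "min_interest") each dict came from
for key in ["max_interest", "min_interest"]:
    for level, group in df.groupby(key):
        records.append({level: ",".join(group["like"])})
        labels.append(key)

# Rows are labelled by key, columns are the interest levels.
df1 = pd.DataFrame(records, index=labels)

# last() keeps the last non-null value per column within each key,
# collapsing the rows to one per key; transpose() then puts the
# interest levels on the index, and sort_index() orders them.
adjustframe = df1.groupby(level=0).last().transpose().sort_index()
adjustframe.index.name = "group"

print(adjustframe)
```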

