Assign a class to an ID based on the highest number of subclasses - pandas

Question

I have a dataframe of videoIds. Under each videoId there is a chain of posts, each with a class, and I want to assign the most frequently occurring class to the videoId. My dataframe looks like this:

videoId   postId   class

12234     788         1
12234     789         1
12234     790         3
12234     791         4
12234     792         1
12234     793         4

I want a dataframe like this for every such videoId:

videoId   class
  12234      1

since the most frequently occurring class under that videoId is 1 (counting the classes of its subposts).

Now suppose there is a tie between the classes, like this:

videoId   postId   class

1620      34          1
1620      35          1
1620      36          2
1620      37          2

I want the result to be like this:

 videoId  class
 1620      1
 1620      2

So when there is a tie between the subclasses, I want all of them to appear for that videoId. I have tried several ways, using value_counts(), max(), etc., but was not able to reach a solution.

Answer 1

You can simply apply mode over the groupby and reset the index, i.e.:

df.groupby('videoId')['class'].apply(pd.Series.mode).reset_index(level=0)

  videoId  class
0     1620      1
1     1620      2
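
As a minimal sketch of how this behaves when both example videoIds sit in one frame, the snippet below rebuilds the sample data from the question and applies the same mode-based approach. The frame construction and the final `reset_index(drop=True)` (used only to get a clean 0..n index) are additions for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data assembled from the two tables in the question.
df = pd.DataFrame({
    "videoId": [12234] * 6 + [1620] * 4,
    "postId": [788, 789, 790, 791, 792, 793, 34, 35, 36, 37],
    "class": [1, 1, 3, 4, 1, 4, 1, 1, 2, 2],
})

# Series.mode returns every value tied for most frequent,
# so tied classes each get their own row per videoId.
result = (
    df.groupby("videoId")["class"]
      .apply(pd.Series.mode)
      .reset_index(level=0)     # move videoId from the index into a column
      .reset_index(drop=True)   # discard the leftover per-group position index
)
print(result)
```

With this data, videoId 1620 keeps both tied classes (1 and 2), while videoId 12234 keeps only class 1.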

Answer 2

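One way to do this is to use dense ranking of the counts within each videoId:

# Count each class per videoId, then rank the counts inside each videoId
# (ranking the whole Series at once would compare counts across videoIds).
df.groupby('videoId')['class'].value_counts()\
  .groupby(level='videoId')\
  .rank(method='dense', ascending=False)\
  .rename('ranking')\
  .reset_index()\
  .query('ranking == 1')

Output: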

   videoId  class  ranking
0     1620      1      1.0
1     1620      2      1.0
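
The same "keep every class tied for first place" idea can also be sketched without ranking, by comparing each class count to the largest count in its own videoId. This is an alternative illustration rather than the answerer's method; the column name `n` and the sample frame are assumptions made for the example.

```python
import pandas as pd

# Sample data assembled from the two tables in the question.
df = pd.DataFrame({
    "videoId": [12234] * 6 + [1620] * 4,
    "postId": [788, 789, 790, 791, 792, 793, 34, 35, 36, 37],
    "class": [1, 1, 3, 4, 1, 4, 1, 1, 2, 2],
})

# Number of posts per (videoId, class) pair; "n" is a hypothetical column name.
counts = df.groupby(["videoId", "class"]).size().rename("n").reset_index()

# Largest count within each videoId, aligned row by row with `counts`.
max_per_video = counts.groupby("videoId")["n"].transform("max")

# Keep every class whose count equals its videoId's maximum (ties included).
result = counts.loc[counts["n"] == max_per_video, ["videoId", "class"]]
result = result.reset_index(drop=True)
print(result)
```

Because the comparison is made against a per-videoId maximum, a videoId with fewer posts overall is not dropped just because another videoId has a larger count.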
