Counting occurrences in a pandas DataFrame with respect to a list

Question

I am trying to create a bar chart of element frequencies using matplotlib. To do this, I need to count how often each value occurs in a pandas DataFrame column with respect to a list of flags, including flags that never occur. Below is a rough sketch of the code and data I have in my notebook:

   # list of filtered values 
   filtered = [200, 201, 201, 201, 201, 201, 
   211, 211, 211, 211, 211, 211, 211, 211, 211, 211, 211, 211, 211, 211, 
   237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 
   237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 
   237, 237, 237, 237, 237, 237, 237, 237, 250, 250, 250, 250, 250, 250, 250,
   250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 
   250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250,
   250, 250, 250, 250, 254]

   # list of flags to use for filtering 
   flags = [200, 201, 211, 237, 239, 250, 254, 255]
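   # zero-initialized counts, used only for testing
   flags_dict = {200: 0, 201: 0, 211: 0, 237: 0, 239: 0, 250: 0, 254: 0, 255: 0}

   # value_counts() is a pandas method, so the plain list above is wrapped
   # in a Series here (requires `import pandas as pd`)
   freq = pd.Series(filtered).value_counts()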


   """
   Expected flags_dict:
   200: 1
   201: 5
   211: 14
   237: 38
   239: 0
   250: 41
   254: 1
   255: 0
   """

   """
   These are the values from the real dataframe but they do not take into 
   account the other flags in the flags list
   freq: 
   250.0    7682
   211.0    3734
   200.0    1483
   239.0     180
   201.0      34       
   """

Answer 1

This can be done fairly straightforwardly with isin.

Assuming filtered is a Series:

In [1]: filtered[filtered.isin(flags)].value_counts().reindex(flags, fill_value=0)
Out[1]: 200     1
        201     5
        211    14
        237    38
        239     0
        250    41
        254     1
        255     0
        dtype: int64

To get a dictionary, just append to_dict():

In [2]: filtered[filtered.isin(flags)].value_counts().reindex(flags, fill_value=0).to_dict()

Out[2]: {200: 1, 201: 5, 211: 14, 237: 38, 239: 0, 250: 41, 254: 1, 255: 0}
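
For reference, here is a self-contained sketch of the same approach, assuming filtered starts out as a plain Python list as in the question. The list is built compactly with repetition but holds the same values; the counts name is only illustrative.

```python
import pandas as pd

# Same values as the question's list, built compactly with list repetition
filtered = pd.Series([200] + [201] * 5 + [211] * 14 + [237] * 38 + [250] * 41 + [254])
flags = [200, 201, 211, 237, 239, 250, 254, 255]

# Keep only flagged values, count them, then reindex so that every flag
# appears in the result (flags that never occur get a count of 0)
counts = (
    filtered[filtered.isin(flags)]
    .value_counts()
    .reindex(flags, fill_value=0)
)

flags_dict = counts.to_dict()
print(flags_dict)
```

Because reindex keeps only the labels it is given, the isin filter is probably not strictly required here, but it makes the intent explicit.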

Answer 2

I came up with this just now, but there has to be a better/faster way to accomplish this:

      # column_data is a list created from a pandas DataFrame column
      column_data = list(filtered['C5 Terra'])
      flags_dict[200] = column_data.count(200)
      flags_dict[201] = column_data.count(201)
      flags_dict[211] = column_data.count(211)
      flags_dict[237] = column_data.count(237)
      flags_dict[239] = column_data.count(239)
      flags_dict[250] = column_data.count(250)
      flags_dict[254] = column_data.count(254)
      flags_dict[255] = column_data.count(255)
      flags_dict
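
One possible way to shorten the repeated count calls is collections.Counter from the standard library, which tallies the list in a single pass and returns 0 for keys it has not seen. This is a sketch that assumes column_data is a plain list; the sample values below are made up for illustration.

```python
from collections import Counter

# Hypothetical stand-in for list(filtered['C5 Terra'])
column_data = [200, 201, 201, 211, 250, 250, 250, 999]
flags = [200, 201, 211, 237, 239, 250, 254, 255]

# Counter tallies every value in one pass over the list
tally = Counter(column_data)

# Indexing a Counter with a missing key returns 0 instead of raising,
# and values outside `flags` (such as 999) are simply ignored
flags_dict = {flag: tally[flag] for flag in flags}
print(flags_dict)
```

Each list.count call scans the whole list again, so a single Counter pass should scale better as the number of flags grows.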

Answer 3

If I understood correctly, this is what you need:

import pandas as pd

filtered = [200, 201, 201, 201, 201, 201, 211, 211, 211, 211, 211, 211, 211, 211, 211,
            211, 211, 211, 211, 211, 
            237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 
            237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 237, 
            237, 237, 237, 237, 237, 237, 237, 237, 250, 250, 250, 250, 250, 250, 250,
            250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 
            250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250, 250,
            250, 250, 250, 250, 254]


filtered = pd.Series(filtered)

freq = filtered.value_counts(sort=False)
flags = [200, 201, 211, 237, 239, 250, 254, 255]
flags_dict = {}
for flag in flags:
    try:
        # label lookup; raises KeyError when the flag never occurs
        flags_dict[flag] = freq[flag]
    except KeyError:
        flags_dict[flag] = 0
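
Since the stated goal is a bar chart, a dictionary like this could be plotted roughly as follows. This is only a sketch: the counts are copied from the output shown in Answer 1, and the axis labels are placeholders rather than anything from the original post.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

# Counts copied from the output shown in Answer 1
flags_dict = {200: 1, 201: 5, 211: 14, 237: 38, 239: 0, 250: 41, 254: 1, 255: 0}

# Convert the flags to strings so they are treated as categories
# rather than positions on a numeric axis
labels = [str(flag) for flag in flags_dict]
heights = list(flags_dict.values())

fig, ax = plt.subplots()
bars = ax.bar(labels, heights)
ax.set_xlabel("Flag")         # placeholder label
ax.set_ylabel("Occurrences")  # placeholder label
```

In a notebook, plt.show() (or simply ending the cell with the figure) would display the chart; flags with a count of 0 still get a slot on the x-axis.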


Answer 1
1 2016-12-06 00:06:55

Answer 2
0 2016-12-05 22:18:35

Answer 3
0 2016-12-05 22:19:21
