Create a new data frame of counts from a list corresponding to column index values from a different data frame

I have two lists of integers (values may repeat), for example:

a = [12, 12, 12, 3, 4, 5]
b = [1, 2, 4, 5, 6, 12, 4, 7, 9, 2, 3, 5, 6]

df.columns
Index(['lep', 'eta', 'phi', 'missing energy magn', 'missing energy phi', 'jet', ...])

(The actual column list is longer than what I have shown here.)

The values in lists a and b are positional indexes into df.columns.

I want to create a new dataframe with the following columns:

Index Value, Column Name, Count of a, Count of b

So if 12 appears three times in list a and twice in list b, and index 12 corresponds to column 'foo' in df, the corresponding row in the new dataframe would be: 12, foo, 3, 2

I know how to get counts of list values with for loops, but I'm not sure how to tie each count to the column at that index.

The desired output would be something like this (based on the data above):

new_df.head()

index, name, count_a, count_b

  • 0, lep, 0, 0
  • 1, eta, 0, 1
  • 2, phi, 0, 2
  • 3, missing energy magn, 1, 1
  • 4, missing energy phi, 1, 2

You can use list.count() to count the number of occurrences of an element in a list, and enumerate() to iterate over a list while keeping track of each item's position.
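As a quick illustration of those two built-ins on their own, here is a minimal sketch. The names `values` and `labels` are made up for this example and are not part of the question's data:

```python
values = [12, 12, 12, 3, 4, 5]
labels = ['x', 'y', 'z']

# list.count(x) returns how many times x occurs in the list.
twelves = values.count(12)
# A value that never occurs simply counts as zero (no error is raised).
sevens = values.count(7)

# enumerate() yields (position, item) pairs, starting at position 0.
pairs = list(enumerate(labels))

print(twelves, sevens, pairs)
```

Because a missing value counts as zero, every column gets a row even when its index never appears in a or b.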

So your code becomes:

import pandas as pd

a = [12, 12, 12, 3, 4, 5]
b = [1, 2, 4, 5, 6, 12, 4, 7, 9, 2, 3, 5, 6]


elements = ['lep', 'eta', 'phi', 'missing energy magn', 'missing energy phi', 'jet']  # it is not necessary to declare it as an index, it is a simple list

df_data = []
for i, el in enumerate(elements):

    tmp_dict = {
        'name': el,
        'count_a': a.count(i),
        'count_b': b.count(i)
    }

    df_data.append(tmp_dict)

df = pd.DataFrame(df_data)
print(df)

The output will be:

                  name  count_a  count_b
0                  lep        0        0
1                  eta        0        1
2                  phi        0        2
3  missing energy magn        1        1
4   missing energy phi        1        2
5                  jet        1        2

This approach works regardless of how many column names are in the list. Any value in a or b that is not a valid position in that list is simply ignored.
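If the lists are long, calling list.count() once per column rescans a and b every time. A possible variation, sketched here under the assumption that `elements` stands in for `list(df.columns)`, tallies each list once with `collections.Counter` and also adds the explicit index column the question asks for. The column name `'index'` and the variable `new_df` are choices made for this sketch, not something the answer above prescribes:

```python
from collections import Counter

import pandas as pd

a = [12, 12, 12, 3, 4, 5]
b = [1, 2, 4, 5, 6, 12, 4, 7, 9, 2, 3, 5, 6]

# Assumption: stand-in for list(df.columns); the real frame has more columns.
elements = ['lep', 'eta', 'phi', 'missing energy magn', 'missing energy phi', 'jet']

# Tally each list once; a Counter returns 0 for keys it has never seen.
counts_a = Counter(a)
counts_b = Counter(b)

positions = list(range(len(elements)))
new_df = pd.DataFrame({
    'index': positions,
    'name': elements,
    'count_a': [counts_a[i] for i in positions],
    'count_b': [counts_b[i] for i in positions],
})
print(new_df)
```

With only the six names shown here, index 12 has no matching column and therefore gets no row. With the full column list from df it would appear like any other index.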
