Python: Counting unique values in a list
I'm trying to count how many times each value occurs in a list. The list looks like this:
list = [["APP", "X", "v3", "CN_L", "2"],
["APP2", "X", "v3", "CN_M", "2"],
["APP3", "Z", "v3", "CN_L", "2"],
["APP2", "Z", "v3", "CN_M", "2"]]
etc.
I am mainly concerned with the number of times each value of the 4th field occurs.
I am not very experienced in Python. I had already found something about Counter and tried the following with it.
from collections import Counter
list = [["APP", "X", "v3", "CN_L", "2"],
["APP2", "X", "v3", "CN_M", "2"],
["APP3", "Z", "v3", "CN_L", "2"],
["APP2", "Z", "v3", "CN_M", "2"]]
distinct_list=(Counter(list).keys())
Without a for loop I get nothing useful from this code, only an "unhashable type" error. Who can point me in the right direction?
Use [l[3] for l in my_list] to get the elements at index 3 (the 4th element of each row); then simply calling Counter on that list will give you the unique elements and their counts.
from collections import Counter
my_list = [["APP", "X", "v3", "CN_L", "2"],
["APP2", "X", "v3", "CN_M", "2"],
["APP3", "Z", "v3", "CN_L", "2"],
["APP2", "Z", "v3", "CN_M", "2"]]
fourth_elts = [l[3] for l in my_list]
print(Counter(fourth_elts))
# Counter({'CN_L': 2, 'CN_M': 2})
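For context on the error in the question: Counter stores the items it counts as dictionary keys, so each item has to be hashable, and the inner lists are not. The sketch below illustrates this; `rows` is just a stand-in name for the question's data, and the tuple conversion is one possible workaround if you ever want to count whole rows rather than a single field.

```python
from collections import Counter

# `rows` is a stand-in name for the data from the question.
rows = [["APP", "X", "v3", "CN_L", "2"],
        ["APP2", "X", "v3", "CN_M", "2"],
        ["APP3", "Z", "v3", "CN_L", "2"],
        ["APP2", "Z", "v3", "CN_M", "2"]]

# Counter uses its items as dict keys, so every item must be hashable.
# Lists are mutable and therefore unhashable, which triggers the error.
try:
    Counter(rows)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Counting a single field works because strings are hashable.
field_counts = Counter(row[3] for row in rows)

# Counting whole rows works once each row is converted to a tuple.
row_counts = Counter(tuple(row) for row in rows)

print(error_message)
print(field_counts["CN_L"], field_counts["CN_M"])
print(len(row_counts))
```

A generator expression is passed to Counter here instead of a list comprehension; both give the same counts.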
Also, please avoid naming your variables after built-in names such as "str" or "list"; doing so shadows the built-ins.
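To illustrate why that naming advice matters, here is a small sketch (the function name `shadowing_demo` is made up for this example). Once `list` is rebound to a list object, calling `list(...)` in that scope no longer reaches the built-in type.

```python
def shadowing_demo():
    # Rebinding `list` hides the built-in type inside this function.
    list = [1, 2, 3]
    try:
        # This now tries to call a list object, not the built-in type.
        list("abc")
    except TypeError as exc:
        return str(exc)
    return None

message = shadowing_demo()
print(message)

# Outside the function the built-in is untouched.
print(list("ab"))
```

The shadowing is kept inside a function here so the rest of the script is unaffected; at module level the built-in would stay hidden until the name is deleted or the interpreter restarts.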
from collections import Counter
new_list = [["APP", "X", "v3", "CN_L", "2"],
["APP2", "X", "v3", "CN_M", "2"],
["APP3", "Z", "v3", "CN_L", "2"],
["APP2", "Z", "v3", "CN_M", "2"]]
# import the numpy library
import numpy as np
# convert the list into a numpy array
arr = np.array(new_list)
# take the 4th column (index 3) and then apply the Counter
result = Counter(arr[:, 3])
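If the data is already a numpy array, a possible alternative to Counter is `np.unique` with `return_counts=True`, which returns the sorted unique values together with their counts. A minimal sketch, assuming the 4th field is at column index 3; `rows` is a stand-in name for the data:

```python
import numpy as np

# `rows` is a stand-in name for the data from the question.
rows = [["APP", "X", "v3", "CN_L", "2"],
        ["APP2", "X", "v3", "CN_M", "2"],
        ["APP3", "Z", "v3", "CN_L", "2"],
        ["APP2", "Z", "v3", "CN_M", "2"]]

arr = np.array(rows)

# Column index 3 holds the 4th field of every row.
values, counts = np.unique(arr[:, 3], return_counts=True)

# Convert to plain Python types for a regular dict of counts.
result = dict(zip(values.tolist(), counts.tolist()))
print(result)
```

For a list this small, the plain Counter approach is likely simpler; `np.unique` mainly pays off when the data lives in numpy anyway.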
I would put the data into a pandas DataFrame like this:
import pandas as pd
df = pd.DataFrame(
[["APP", "X", "v3", "CN_L", "2"],
["APP2", "X", "v3", "CN_M", "2"],
["APP3", "Z", "v3", "CN_L", "2"],
["APP2", "Z", "v3", "CN_M", "2"]]
)
df[3].value_counts()
->
CN_L    2
CN_M    2
Name: 3, dtype: int64
This returns a pandas Series, which works much like a dict, so you can do:
x = df[3].value_counts()
x["CN_L"] --> 2
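As a variation on the pandas approach, giving the columns names can make the selection easier to read than a positional label. The column names below are invented for illustration (the question does not name the fields), and `rows` is a stand-in name for the data. `Series.to_dict()` converts the counts into a plain dict if that is more convenient.

```python
import pandas as pd

# `rows` is a stand-in name for the data from the question.
rows = [["APP", "X", "v3", "CN_L", "2"],
        ["APP2", "X", "v3", "CN_M", "2"],
        ["APP3", "Z", "v3", "CN_L", "2"],
        ["APP2", "Z", "v3", "CN_M", "2"]]

# Hypothetical column names; "cn" is the 4th field of each row.
df = pd.DataFrame(rows, columns=["app", "group", "version", "cn", "flag"])

# value_counts() returns a Series indexed by the unique values.
counts = df["cn"].value_counts()

# Convert to a regular dict if a Series is not needed.
counts_dict = counts.to_dict()
print(counts_dict)
```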