
Find the average co-ordinates for each classifier in a list

I want to write a function that takes two inputs:

points is a list of co-ordinate points, and

classification is a list of 1s and 0s with shape n-by-m, where n is the number of values in points and m is the number of classifications.

The function would return the average of the co-ordinates assigned to each classification. In the example there are 2 classifications, and each co-ordinate in points can only be assigned to one classification (labelled with a 1, all others labelled with a 0).

Example below:

points = np.array([[1,1], [2,4], [4,6], [5,6], [6,6]])
classification = np.array([[1, 0],[1, 0],[0, 1],[0, 1],[0, 1]])
my_func(points, classification) #--> np.array([[1.5 , 2.5],
                                #              [5. , 6. ]])

So the first point, (1,1), has been assigned to the first classifier (1,0), and the third point, (4,6), has been assigned to the second classifier (0,1).

What is the best way to approach this? Thanks
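Worked by hand, the expected output in the example comes from averaging each co-ordinate over the points in a class (points 1 and 2 in the first class, points 3 to 5 in the second):

```latex
\begin{aligned}
\text{class 1: } & \left(\tfrac{1+2}{2},\ \tfrac{1+4}{2}\right) = (1.5,\ 2.5) \\
\text{class 2: } & \left(\tfrac{4+5+6}{3},\ \tfrac{6+6+6}{3}\right) = (5,\ 6)
\end{aligned}
```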

  1. create two arrays, result and count, both with the number of classifications as their size; initialize each value to [0, 0] for result and 0 for count
  2. take the next point and its classification until there are none left
  3. use classification.index(1) to find the index into the result and count arrays (this works on a plain list; a NumPy row has no index method, so convert it with list(...) first)
  4. add the values of the point to the corresponding result entry and increment the corresponding count
  5. repeat from step 2
  6. divide each value in result by its corresponding count value
  7. return result

I'll leave it up to you to write the code for it.
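For readers who want to check their own attempt, the steps above could be sketched roughly as follows. The function name average_by_class is made up for illustration, and the sketch assumes every row of classification contains exactly one 1 and that every class has at least one point (otherwise the final division would hit a zero count).

```python
import numpy as np

def average_by_class(points, classification):
    """Average the points assigned to each class (one-hot rows assumed)."""
    points = np.asarray(points, dtype=float)
    classification = np.asarray(classification)
    n_classes = classification.shape[1]
    n_dims = points.shape[1]

    # Step 1: one running sum and one counter per class.
    result = np.zeros((n_classes, n_dims))
    count = np.zeros(n_classes, dtype=int)

    # Steps 2-5: walk through the points and their one-hot rows together.
    for point, row in zip(points, classification):
        k = list(row).index(1)  # step 3: position of the 1 in this row
        result[k] += point      # step 4: accumulate the co-ordinates
        count[k] += 1

    # Step 6: divide each sum by its count (assumes no class is empty).
    return result / count[:, None]
```

Calling it on the points and classification arrays from the question should reproduce the two averages shown there.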

Dictionaries are a convenient way to work with data that involves a mapping, so I've used one to solve your question.

import numpy as np
points = np.array([[1,1], [2,4], [4,6], [5,6], [6,6]])
classification = np.array([[1, 0],[1, 0],[0, 1],[0, 1],[0, 1]])

In the step below I convert each row of classification to a tuple, because lists are mutable and therefore cannot be used as dictionary keys.

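# Use each one-hot row, converted to a tuple, as a dictionary key
classification = [tuple(i) for i in classification]
dic = {}
for label, point in zip(classification, points):
    # Group the points that share the same label
    dic.setdefault(label, []).append(list(point))
# Average each co-ordinate column within every group (groups appear in first-seen order)
result = [[sum(elem) / len(elem) for elem in zip(*group)] for group in dic.values()]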
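As an aside, and not part of the dictionary approach above: because each row of classification is one-hot, the same averages can also be obtained with a matrix product in NumPy. This is only a sketch; the name class_means is hypothetical, and it assumes every class has at least one point.

```python
import numpy as np

def class_means(points, classification):
    """Per-class mean of points, given an (n, m) one-hot classification."""
    points = np.asarray(points, dtype=float)
    classification = np.asarray(classification, dtype=float)
    # (m, n) @ (n, d) -> (m, d): sums the co-ordinates of the points in each class.
    sums = classification.T @ points
    # Number of points in each class, shape (m,).
    counts = classification.sum(axis=0)
    # Broadcast the counts over the co-ordinate axis to get the means.
    return sums / counts[:, None]
```

Unlike the dictionary version, the rows of the result follow the column order of classification rather than the order in which labels are first seen.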

Hope that helps.
