Grouping and labeling values in a dataset in Python

Question

I am trying to group my dataset so that each group gets a unique label. Assume I have the data below. Each row holds a point followed by its neighbor points (columns A, B, C, D).

Dataset

Array:

[[1 2]
 [2 1 4 5 7]
 [3 2]
 [4 2 10]
 [5 2 8]
 [6]
 [7 2 13]
 [8 5]
 [9]
 [10 4 1]
 [11 12]
 [12 11]
 [13 7]]

I am trying to summarize the data, and the desired result is as follows:

Label 1 = 1 2 4 5 7 3 10 8 13
Label 2 = 6
Label 3 = 9
Label 4 = 11 12

The idea is that when a value already appears in a list that has a label, the value gets that existing label; when it does not appear in any labeled list, it gets a new label. I am not sure what this kind of problem is called, so I have not found an existing question that matches mine. I would be very thankful if somebody could give the Python code or pseudocode. Thank you.

Answer 1

Here is working code. The result is stored in a dictionary whose keys are your data values and whose values are the labels, like this: {key: value, data: label}

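# Assumes `array` holds the rows shown in the question (a list of lists).
# Note: a row that links two already-labeled groups is not merged here.
label = 0
listOfLabels = dict()
for row in array:
    # Labels already assigned to values in this row, if any.
    seen = [listOfLabels[x] for x in row if x in listOfLabels]
    if seen:
        rowLabel = seen[0]   # reuse the existing label, not the latest counter
    else:
        label += 1           # no value seen before: start a new label
        rowLabel = label
    for i in row:
        if i not in listOfLabels:
            listOfLabels[i] = rowLabel
print(listOfLabels)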
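
This kind of task is usually described as finding the connected components of a graph: each value is a node, and each row connects its first value to its neighbors. A row that links two groups which were already labeled separately would need those groups merged, and a union-find (disjoint-set) structure handles that case. Below is a minimal sketch under that reading of the question; `label_components` and `rows` are names chosen here for illustration, and the data is retyped from the question as a list of lists.

```python
def label_components(rows):
    """Return {value: label}; labels start at 1, numbered by first appearance."""
    parent = {}

    def find(x):
        # Register unseen values as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Connect every value in a row to the first value of that row.
    for row in rows:
        for value in row:
            union(row[0], value)

    labels = {}
    root_label = {}
    for value in list(parent):  # dicts keep insertion order
        root = find(value)
        if root not in root_label:
            root_label[root] = len(root_label) + 1
        labels[value] = root_label[root]
    return labels


rows = [[1, 2], [2, 1, 4, 5, 7], [3, 2], [4, 2, 10], [5, 2, 8], [6],
        [7, 2, 13], [8, 5], [9], [10, 4, 1], [11, 12], [12, 11], [13, 7]]
print(label_components(rows))
```

Because groups are merged rather than labeled row by row, the order in which the rows arrive does not change which values end up together.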

Please let me know if it needs any clarification.
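
The question asks for output of the form "Label 1 = ...", while the dictionary maps each value to its label. One way to get from one to the other is to invert the mapping, as sketched here. `group_by_label` and the `sample` dictionary are hypothetical names and data used only for illustration; in practice the dictionary would be the `{value: label}` result.

```python
from collections import defaultdict


def group_by_label(value_to_label):
    """Invert a {value: label} dict into {label: [values, ...]}."""
    groups = defaultdict(list)
    for value, label in value_to_label.items():
        groups[label].append(value)
    return dict(groups)


# Hypothetical sample in the same shape as the {value: label} dictionary.
sample = {1: 1, 2: 1, 4: 1, 6: 2, 9: 3, 11: 4, 12: 4}
for label, values in group_by_label(sample).items():
    print(f"Label {label} =", *values)
```

Values within each label are listed in the order they were added to the dictionary.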

Solution 1
0 Accepted 2019-11-15 21:37:47
