Grouping and labelling values in a dataset in Python
I am trying to group my dataset so that each group gets a unique label. Assume I have the data below. Each row holds a point and its neighbor points, in columns A, B, C, D.
Array:
[[1 2]
[2 1 4 5 7]
[3 2]
[4 2 10]
[5 2 8]
[6]
[7 2 13]
[8 5]
[9]
[10 4 1]
[11 12]
[12 11]
[13 7]]
I am trying to summarize the data, and the desired result is as follows:
Label 1 = 1 2 4 5 7 3 10 8 13
Label 2 = 6
Label 3 = 9
Label 4 = 11 12
The idea is that when a value already appears in a list that has a label, it gets that existing label. When the value does not appear in any labelled list, it gets a new label. I am not sure what this problem is called, so I have not found an existing question that matches mine. I would be very thankful if somebody could give Python code or pseudocode. Thank you.
Here is working code. The result is stored in a dictionary whose keys are your data values and whose values are the labels, like this:
{key: value, data: label}
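# array is the list of rows shown in the question
label = 0
listOfLabels = dict()
for row in array:
    # Labels already assigned to values that appear in this row
    existing = [listOfLabels[x] for x in row if x in listOfLabels]
    if existing:
        current = existing[0]  # reuse the label that is already there
    else:
        label += 1  # nothing in this row was seen before: new label
        current = label
    for i in row:
        if i not in listOfLabels:
            listOfLabels[i] = current
print(listOfLabels)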
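If you want the "Label N = ..." layout from the question rather than a value-to-label dictionary, the mapping can be inverted. This is only a small sketch: `group_by_label` is a helper name I made up, and the `labels` dictionary below is made-up sample input, not the output for your array.

```python
from collections import defaultdict

def group_by_label(value_to_label):
    """Invert {value: label} into {label: [values, ...]}, keeping insertion order."""
    groups = defaultdict(list)
    for value, label in value_to_label.items():
        groups[label].append(value)
    return dict(groups)

# Made-up sample mapping, for illustration only
labels = {1: 1, 2: 1, 6: 2, 11: 3, 12: 3}
groups = group_by_label(labels)
for label, values in groups.items():
    print(f"Label {label} = {' '.join(map(str, values))}")
```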
Please let me know if it needs any clarification.
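One caveat: a single pass like the loop above cannot merge two labels once they have been handed out, so if a later row connects values from two groups that were labelled separately, they stay separate. That situation is usually described as finding connected components, and a union-find (disjoint-set) structure is one common way to handle it. Below is a minimal sketch of that idea, not a tuned implementation. `label_groups` and `find` are names I made up for illustration, and I am assuming each row is a plain list of hashable values.

```python
def label_groups(rows):
    """Assign one label per connected group of values (union-find sketch)."""
    parent = {}

    def find(x):
        # Walk up to the representative of x's group, shortening the path on the way
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Join every value in a row with the first value of that row
    for row in rows:
        if not row:
            continue
        root = find(row[0])
        for value in row[1:]:
            parent[find(value)] = root

    # Number the groups in order of first appearance
    labels = {}
    root_to_label = {}
    for value in list(parent):
        root = find(value)
        if root not in root_to_label:
            root_to_label[root] = len(root_to_label) + 1
        labels[value] = root_to_label[root]
    return labels

# The rows from the question
rows = [[1, 2], [2, 1, 4, 5, 7], [3, 2], [4, 2, 10], [5, 2, 8], [6],
        [7, 2, 13], [8, 5], [9], [10, 4, 1], [11, 12], [12, 11], [13, 7]]
labels = label_groups(rows)
print(labels)
```

With the rows from the question this should give the four groups you listed: 6 and 9 each get their own label, 11 and 12 share one, and everything else falls under label 1. Unlike a single pass, it also copes with an input such as [[1, 2], [3, 4], [2, 3]], where the last row joins two groups that were started separately.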