Mapping list changes to their new index in Python
I am working on a piece of software that clusters images for the user to label. In each iteration the user can merge clusters or rename a cluster's label, and I am looking for an algorithm that maps each previous cluster index to its new index, based on the previous cluster list and the input cluster list. I keep the previous clusters' labels in a previous_classes list. If the user marks a cluster 'Ignore', its index should map to -1 and the cluster should be removed. Below is the workflow, with the 4 edge cases I want to account for:
Iteration 1:
Merging ClassC into ClassE
Input:
previous_clusters = ["ClassA", "ClassB", "ClassC", "ClassD", "ClassE"]
clusters = ["ClassA", "ClassB", "ClassE", "ClassD", "ClassE"]
Desired output:
{0:0, 1:1, 2:2, 3:3, 4:2}
Iteration 2:
Merging ClassA into ClassE
Input:
previous_clusters = ["ClassA", "ClassB", "ClassE", "ClassD"]
clusters = ["ClassE", "ClassB", "ClassE", "ClassD"]
Desired output:
{0:0, 1:1, 2:0, 3:2}
Iteration 3:
Renaming ClassB to ClassF
Input:
previous_clusters = ["ClassE", "ClassB", "ClassD"]
clusters = ["ClassE", "ClassF", "ClassD"]
Desired output:
{0:0, 1:1, 2:2}
Iteration 4:
Ignoring ClassE
Input:
previous_clusters = ["ClassE", "ClassF", "ClassD"]
clusters = ["Ignore", "ClassF", "ClassD"]
Desired output:
{0:-1, 1:0, 2:1}
previous_clusters = ["ClassF", "ClassD"]
Note that you don't need previous_clusters (although it helped me understand the context). The only information you need is something like "for index 0, the user selected 'ClassA'". You can collect all indices that map to 'ClassA', and then invert the map (while giving unique indices to the new classes, and dealing with -1).
from collections import defaultdict

def recluster(new):
    indices_mapped_to = defaultdict(list)
    indices_ignored = []  # list of indices to be ignored
    for i, new_class in enumerate(new):
        if new_class == 'Ignore':
            indices_ignored.append(i)
        else:
            indices_mapped_to[new_class].append(i)
    # "invert" the dict
    output = {j: i for i, v in enumerate(indices_mapped_to.values()) for j in v}
    output.update({j: -1 for j in indices_ignored})  # add the ignored cases
    return output

print(recluster(["ClassA", "ClassB", "ClassE", "ClassD", "ClassE"]))
# {0: 0, 1: 1, 2: 2, 4: 2, 3: 3}
print(recluster(["ClassE", "ClassB", "ClassE", "ClassD"]))
# {0: 0, 2: 0, 1: 1, 3: 2}
print(recluster(["ClassE", "ClassF", "ClassD"]))
# {0: 0, 1: 1, 2: 2}
print(recluster(["Ignore", "ClassF", "ClassD"]))
# {1: 0, 2: 1, 0: -1}
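
The question also shows the cluster list shrinking between iterations (for example, ["ClassF", "ClassD"] after ClassE is ignored). A minimal sketch of how that list could be derived from the same input, assuming the new indices follow first-appearance order as they do in recluster above. The helper name updated_cluster_names and its ignore_label parameter are hypothetical and not part of the original answer.

```python
def updated_cluster_names(new, ignore_label="Ignore"):
    """Return the surviving cluster names in first-appearance order.

    Hypothetical helper: position k in the result is intended to be the
    name of the cluster that received new index k.
    """
    # dict.fromkeys drops duplicates while keeping insertion order (Python 3.7+)
    return [name for name in dict.fromkeys(new) if name != ignore_label]


print(updated_cluster_names(["ClassA", "ClassB", "ClassE", "ClassD", "ClassE"]))
# ['ClassA', 'ClassB', 'ClassE', 'ClassD']
print(updated_cluster_names(["Ignore", "ClassF", "ClassD"]))
# ['ClassF', 'ClassD']
```

The result can then serve as previous_clusters for the next iteration.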