
Generate all permutations including abbreviations with weightages

My string:

name_target = "ARUN GULABRAO INDULKAR"

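I want to generate all permutations using the original names and their abbreviations (initials), and assign a weight to each permutation: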

[ARUNGULABRAOINDULKAR, 1]
[ARUNGINDULKAR, 0.9]
[ARUNGULABRAOI, 0.9]
[AGULABRAOINDULKAR, 0.9]
[ARUNGI, 0.8]
[AGINDULKAR, 0.8]
[AGULABRAOI, 0.8]
[ARUNINDULKARGULABRAO, 1]
[ARUNIGULABRAO, 0.9]
[ARUNINDULKARG, 0.9]
[AINDULKARGULABRAO, 0.9]
[ARUNIG, 0.8]
[AIGULABRAO, 0.8]
[AINDULKARG, 0.8]
[GULABRAOARUNINDULKAR, 1]
[GULABRAOAINDULKAR, 0.9]
[GULABRAOARUNI, 0.9]
[GARUNINDULKAR, 0.9]
[GULABRAOAI, 0.8]
[GAINDULKAR, 0.8]
[GARUNI, 0.8]
[GULABRAOINDULKARARUN, 1]
[GULABRAOIARUN, 0.9]
[GULABRAOINDULKARA, 0.9]
[GINDULKARARUN, 0.9]
[GULABRAOIA, 0.8]
[GIARUN, 0.8]
[GINDULKARA, 0.8]
[INDULKARARUNGULABRAO, 1]
[INDULKARAGULABRAO, 0.9]
[INDULKARARUNG, 0.9]
[IARUNGULABRAO, 0.9]
[INDULKARAG, 0.8]
[IAGULABRAO, 0.8]
[IARUNG, 0.8]
[INDULKARGULABRAOARUN, 1]
[INDULKARGARUN, 0.9]
[INDULKARGULABRAOA, 0.9]
[IGULABRAOARUN, 0.9]
[INDULKARGA, 0.8]
[IGARUN, 0.8]
[IGULABRAOA, 0.8]

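I don't care about the output data structure; it can be anything. If no abbreviations are used and every name is spelled out in full, the weight is 1.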

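Each abbreviation reduces the weight by 10%, that is, by 0.1. For example, ARUNGINDULKAR on line 2 gets 0.9 because the middle name is abbreviated. ARUNGI gets 0.8 because both the middle name and the last name are abbreviated.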
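The weighting rule can be written as a small helper. This is only a sketch of the rule as described: the function name `weight` and the `penalty` parameter are hypothetical, and the rounding is an added assumption to keep float noise out of the result.

```python
def weight(flags, penalty=0.1):
    """flags: one boolean per name part; True = spelled out, False = initial."""
    abbreviated = sum(1 for is_full in flags if not is_full)
    # Round so repeated 0.1 steps do not leave float noise in the result.
    return round(1 - penalty * abbreviated, 10)

print(weight([True, True, True]))    # ARUNGULABRAOINDULKAR
print(weight([True, False, True]))   # ARUNGINDULKAR
print(weight([True, False, False]))  # ARUNGI
```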

I have successfully used itertools.permutations(name_target) to generate the first set of permutations (the full-name orderings).
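For reference, itertools.permutations iterates over whatever sequence it receives, so the string has to be split into name parts first; passing the raw string would permute its individual characters. A minimal sketch of the full-name orderings, where the variable names `parts` and `orderings` are illustrative:

```python
from itertools import permutations

name_target = "ARUN GULABRAO INDULKAR"
parts = name_target.split()  # ['ARUN', 'GULABRAO', 'INDULKAR']

# Every ordering of the full name parts, joined without separators.
orderings = ["".join(p) for p in permutations(parts)]
print(orderings)
```

With three parts this gives 3! = 6 orderings, each of which corresponds to a weight-1 entry in the expected output.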

I can't figure out how to combine the abbreviations. name_target can contain a variable number of parts when split on ' '.

Please ignore any duplicates in the expected output.

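You can use recursion with a generator to build the abbreviation combinations. itertools.permutations creates all possible orderings of the original input names, and each of these full-name orderings is passed to get_combos, which generates the abbreviation combinations. A boolean flag (True for a full name, False for an abbreviation) is associated with each name component produced in get_combos, which allows the weight to be calculated: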

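from itertools import permutations as prmt

def get_combos(d, l, c=[]):
    # d: name parts still to place; l: total number of parts
    # c: (text, is_full) pairs chosen so far (never mutated, so the list default is safe)
    if d:
        # Branch 1: keep the next part spelled out.
        yield from get_combos(d[1:], l, c + [(d[0], True)])
        # Branch 2: use its initial, unless that would abbreviate every part.
        if sum(not b for _, b in c) + 1 < l:
            yield from get_combos(d[1:], l, c + [(d[0][0], False)])
    else:
        # All parts placed: join them and subtract 0.1 per abbreviated part.
        yield [''.join(a for a, _ in c), 1 - sum(0.1 for _, b in c if not b)]

name_target = "ARUN GULABRAO INDULKAR"
n = name_target.split()
l = len(n)
result = [i for b in prmt(n, l) for i in get_combos(b, l)]
print(result)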

Output:

[['ARUNGULABRAOINDULKAR', 1], ['ARUNGULABRAOI', 0.9], ['ARUNGINDULKAR', 0.9], ['ARUNGI', 0.8], ['AGULABRAOINDULKAR', 0.9], ['AGULABRAOI', 0.8], ['AGINDULKAR', 0.8], ['ARUNINDULKARGULABRAO', 1], ['ARUNINDULKARG', 0.9], ['ARUNIGULABRAO', 0.9], ['ARUNIG', 0.8], ['AINDULKARGULABRAO', 0.9], ['AINDULKARG', 0.8], ['AIGULABRAO', 0.8], ['GULABRAOARUNINDULKAR', 1], ['GULABRAOARUNI', 0.9], ['GULABRAOAINDULKAR', 0.9], ['GULABRAOAI', 0.8], ['GARUNINDULKAR', 0.9], ['GARUNI', 0.8], ['GAINDULKAR', 0.8], ['GULABRAOINDULKARARUN', 1], ['GULABRAOINDULKARA', 0.9], ['GULABRAOIARUN', 0.9], ['GULABRAOIA', 0.8], ['GINDULKARARUN', 0.9], ['GINDULKARA', 0.8], ['GIARUN', 0.8], ['INDULKARARUNGULABRAO', 1], ['INDULKARARUNG', 0.9], ['INDULKARAGULABRAO', 0.9], ['INDULKARAG', 0.8], ['IARUNGULABRAO', 0.9], ['IARUNG', 0.8], ['IAGULABRAO', 0.8], ['INDULKARGULABRAOARUN', 1], ['INDULKARGULABRAOA', 0.9], ['INDULKARGARUN', 0.9], ['INDULKARGA', 0.8], ['IGULABRAOARUN', 0.9], ['IGULABRAOA', 0.8], ['IGARUN', 0.8]]
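The same result can also be reached without recursion by letting itertools.product choose "full" or "initial" for each part. This is an alternative sketch, not the approach used above; `name_variants` and `penalty` are hypothetical names, and the rounding is an added assumption so that longer names do not pick up float noise.

```python
from itertools import permutations, product

def name_variants(name, penalty=0.1):
    parts = name.split()
    for order in permutations(parts):
        # One boolean per part: True = spell out, False = use the initial.
        for flags in product([True, False], repeat=len(order)):
            if not any(flags):
                continue  # skip the all-initials form, as the recursive version does
            text = "".join(p if full else p[0] for p, full in zip(order, flags))
            yield text, round(1 - penalty * flags.count(False), 10)

variants = list(name_variants("ARUN GULABRAO INDULKAR"))
print(len(variants))
print(variants[:3])
```

For a name with n parts, each of the n! orderings yields 2^n - 1 variants, so three parts give 6 * 7 = 42 entries, matching the length of the output above.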

