Efficient way of re-numbering elements in an array

I am reasonably new to Python and am trying to implement a genetic algorithm, but need some assistance with the code for one of the operations.

I have formulated the problem this way:

  • each individual I is represented by a string of M integers
  • each element e in I takes a value from 0 to N
  • every number from 0 to N must appear in I at least once
  • the value of e is not important, so long as all elements that share a value keep sharing one value after re-numbering (think of the values as class labels)
  • e is less than or equal to N
  • N can be different for each I
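
The constraints above can be checked mechanically. The following is a minimal sketch, assuming an individual is a plain Python list of integers; the helper name `is_valid` is hypothetical and does not come from the original post.

```python
def is_valid(individual):
    """Return True if the labels used are exactly 0..N with no gaps."""
    if not individual:
        return False  # assumption: an empty individual counts as invalid
    labels = set(individual)
    n = max(labels)
    # Every value from 0 to N must appear at least once, and nothing else.
    return labels == set(range(n + 1))
```

A check like this could be run on each child after crossover to decide whether re-numbering is needed at all.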

After applying the crossover operation I can potentially generate children which violate one or more of these constraints, so I need to find a way to re-number the elements so that they retain their properties, but fit the constraints.

For example:

parent_1 (N=5): [1 3 5 4 2 1|0 0 5 2]
parent_2 (N=3): [2 0 1 3 0 1|0 2 1 3]

*** crossover applied at "|" ***

child_1: [1 3 5 4 2 1 0 2 1 3]
child_2: [2 0 1 3 0 1 0 0 5 2]

child_1 obviously still satisfies all of the constraints, as N = 5 and all values 0-5 appear at least once in the array.
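
For reference, the single-point crossover in the example can be reproduced as follows. This is only a sketch: the function name `crossover` and the fixed cut point are illustrative assumptions, and a real genetic algorithm would normally choose the cut point at random.

```python
def crossover(parent_a, parent_b, point):
    """Single-point crossover: swap the tails of two equal-length parents."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


parent_1 = [1, 3, 5, 4, 2, 1, 0, 0, 5, 2]
parent_2 = [2, 0, 1, 3, 0, 1, 0, 2, 1, 3]
child_1, child_2 = crossover(parent_1, parent_2, 6)
print(child_1)  # [1, 3, 5, 4, 2, 1, 0, 2, 1, 3]
print(child_2)  # [2, 0, 1, 3, 0, 1, 0, 0, 5, 2]
```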

The problem lies with child_2: if we use max(child_2) to calculate N we get a value of 5, but the array contains only five distinct values, so N should be 4. What I am asking (in a very long-winded way, granted) is what is a good, Pythonic way of doing this:

child_2: [2 0 1 3 0 1 0 0 5 2]
*** some python magic ***
child_2':  [2 0 1 3 0 1 0 0 4 2]
*or*
child_2'': [0 1 2 3 1 2 1 1 4 0]

child_2'' is there to illustrate that the values themselves don't matter: so long as all elements sharing a value are mapped to the same new value, the constraints are satisfied.
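
To make "the values themselves don't matter" concrete, here is a hedged sketch of a check that two arrays describe the same grouping up to a renaming of labels. The name `same_partition` is invented for illustration and is not part of the question.

```python
def same_partition(a, b):
    """True if a and b group positions identically, up to renaming labels."""
    if len(a) != len(b):
        return False
    pairs = set(zip(a, b))
    # A one-to-one renaming exists exactly when each old label pairs with
    # a single new label and each new label pairs with a single old label.
    return len(pairs) == len(set(a)) == len(set(b))


child_2 = [2, 0, 1, 3, 0, 1, 0, 0, 5, 2]
print(same_partition(child_2, [2, 0, 1, 3, 0, 1, 0, 0, 4, 2]))  # True
print(same_partition(child_2, [0, 1, 2, 3, 1, 2, 1, 1, 4, 0]))  # True
```

Such a check is handy for testing any re-numbering routine, since different correct routines may return different labels.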

Here is what I have tried so far:

value_map = []
for el in child:
    if el not in value_map:
        value_map.append(el)

for ii in range(0,len(child)):
    child[ii] = value_map.index(child[ii])

This approach works and returns a result similar to child_2'', but I can't imagine that it is very efficient, given that it iterates over the string twice, so I was wondering if anyone has any suggestions for how to make it better.

Thanks, and sorry for such a long post for such a simple question!

You will need to iterate over the list more than once; I don't think there's any way around this. After all, you first have to determine the number of different elements (first pass) before you can start changing elements (second pass). Note, however, that depending on the number of different elements, your code can take up to O(n^2) time due to the repeated calls to index and not in, each of which is O(n) on a list.

Alternatively, you could use a dict instead of a list for your value_map. A dictionary has much faster lookup than a list, so this way the complexity should indeed be on the order of O(n). You can do this using (1) a dictionary comprehension to determine the mapping of old to new values, and (2) a list comprehension for creating the updated child.

value_map = {el: i for i, el in enumerate(set(child))}
child2 = [value_map[el] for el in child]

Or change the child in-place using a for loop.

for i, el in enumerate(child):
    child[i] = value_map[el]
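
One caveat with the set-based mapping: the iteration order of a set is not guaranteed, so the specific new labels may vary. If order-preserving labels are wanted (the child_2' form in the question), sorting the distinct values first is one option. A minimal sketch follows; the function name `renumber_sorted` is made up for illustration, and the sort adds an O(k log k) term for k distinct values.

```python
def renumber_sorted(child):
    """Relabel each value as its rank among the distinct values present."""
    value_map = {el: i for i, el in enumerate(sorted(set(child)))}
    return [value_map[el] for el in child]


print(renumber_sorted([2, 0, 1, 3, 0, 1, 0, 0, 5, 2]))
# [2, 0, 1, 3, 0, 1, 0, 0, 4, 2]
```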

You can do it with a single loop like this:

value_map = []
result = []
for el in child:
    if el not in value_map:
        value_map.append(el)
    result.append(value_map.index(el))

One solution I can think of is:

  1. Determine the value of N and determine the unused integers (this forces you to iterate over the array once).
  2. Go through the array and each time you meet a number greater than N, map it to an unused integer.

This forces you to go through the array twice, but it should be faster than your example (which goes through value_map for every element of the array).

child = [2, 0, 1, 3, 0, 1, 0, 0, 5, 2]

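used = set(child)
N = len(used) - 1  # labels should end up being exactly 0..N
unused = set(range(N + 1)) - used  # labels in 0..N that are still free

value_map = dict()  # out-of-range label -> replacement label
for i, e in enumerate(child):
    if e <= N:
        continue  # already a valid label
    if e not in value_map:
        value_map[e] = unused.pop()
    child[i] = value_map[e]
print(child)  # [2, 0, 1, 3, 0, 1, 0, 0, 4, 2]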

I like @Selçuk Cihan's answer. It can also be done in place.

>>> child = [2, 0, 1, 3, 0, 1, 0, 0, 5, 2]
>>>
>>> value_map = []
>>> for i in range(len(child)):
...     el = child[i]
...     if el not in value_map:
...         value_map.append(el)
...     child[i] = value_map.index(el)
...
>>> child
[0, 1, 2, 3, 1, 2, 1, 1, 4, 0]

I believe that this works, although I didn't test it for more than the single case that is given in the question.

The only thing that bothers me is that value_map appears three times in the code...

def renumber(individual):
    """
    >>> renumber([2, 0, 1, 3, 0, 1, 0, 0, 4, 2])
    [0, 1, 2, 3, 1, 2, 1, 1, 4, 0]
    """
    value_map = {}
    return [value_map.setdefault(e, len(value_map)) for e in individual]

Here is a fast solution, which iterates over the list only once.

a = [2, 0, 1, 3, 0, 1, 0, 0, 5, 2]
b = [-1]*len(a)
j = 0
for i in range(len(a)):
    if b[a[i]] == -1:
        b[a[i]] = j
        a[i] = j
        j += 1
    else:
        a[i] = b[a[i]]

print(a) # [0, 1, 2, 3, 1, 2, 1, 1, 4, 0]
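
Note that the snippet above uses each value as an index into b, so it relies on every value being a non-negative integer smaller than len(a). That holds for the individuals described in the question, but as a hedged variation the lookup table can be sized from the largest value instead. The function name `renumber_table` is hypothetical, and the calls to min() and max() add extra passes over the list.

```python
def renumber_table(a):
    """Relabel a in place, in order of first appearance, via a lookup table."""
    if not a:
        return a
    if min(a) < 0:
        raise ValueError("labels must be non-negative integers")
    table = [-1] * (max(a) + 1)  # table[old_label] -> new_label, -1 = unseen
    next_label = 0
    for i, old in enumerate(a):
        if table[old] == -1:
            table[old] = next_label
            next_label += 1
        a[i] = table[old]
    return a


print(renumber_table([2, 0, 1, 3, 0, 1, 0, 0, 5, 2]))
# [0, 1, 2, 3, 1, 2, 1, 1, 4, 0]
```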
