Efficient way of re-numbering elements in an array
I am reasonably new to Python and am trying to implement a genetic algorithm, but need some assistance with the code for one of the operations.
I have formulated the problem this way:
- I is represented by a string of M integers
- each element e in I takes a value from 0 to N
- every number from 0 to N must appear in I at least once
- the value of e is not important, so long as each uniquely valued element takes the same unique value (think of them as class labels)
- e is less than or equal to N
- N can be different for each I
After applying the crossover operation I can potentially generate children which violate one or more of these constraints, so I need to find a way to re-number the elements so that they retain their properties, but fit with the constraints.
For example:
parent_1 (N=5): [1 3 5 4 2 1|0 0 5 2]
parent_2 (N=3): [2 0 1 3 0 1|0 2 1 3]
*** crossover applied at "|" ***
child_1: [1 3 5 4 2 1 0 2 1 3]
child_2: [2 0 1 3 0 1 0 0 5 2]
child_1 obviously still satisfies all of the constraints, as N = 5 and all values 0-5 appear at least once in the array.
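As a minimal sketch of how the constraints above could be checked in code: an array is valid exactly when its distinct values are 0 through N, with N taken as the largest value present. The helper name is_valid is hypothetical (it is not part of the original code), and the sketch assumes a non-empty list.

```python
def is_valid(individual):
    """Return True if every value from 0 to max(individual) appears at least once."""
    values = set(individual)
    # With N = largest value, validity means the distinct values are exactly 0..N.
    # Assumes a non-empty list; max() raises ValueError on an empty one.
    return values == set(range(max(values) + 1))


child_1 = [1, 3, 5, 4, 2, 1, 0, 2, 1, 3]
child_2 = [2, 0, 1, 3, 0, 1, 0, 0, 5, 2]
print(is_valid(child_1))  # True
print(is_valid(child_2))  # False: the value 4 never appears
```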
The problem lies with child_2: if we use max(child_2) to calculate N we get a value of 5, but if we count the number of unique values we get 5 of them (0, 1, 2, 3 and 5), so N should be 4. What I am asking (in a very long-winded way, granted) is: what is a good, Pythonic way of doing this?
child_2: [2 0 1 3 0 1 0 0 5 2]
*** some python magic ***
child_2': [2 0 1 3 0 1 0 0 4 2]
*or*
child_2'': [0 1 2 3 1 2 1 1 4 0]
child_2'' is there to illustrate that the values themselves don't matter: so long as each element of a unique value maps to the same value, the constraints are satisfied.
Here is what I have tried so far:
value_map = []
for el in child:
    if el not in value_map:
        value_map.append(el)
for ii in range(0,len(child)):
    child[ii] = value_map.index(child[ii])
This approach works and returns a result similar to child_2'', but I can't imagine that it is very efficient, given that it iterates over the string twice, so I was wondering if anyone has any suggestions for how to make it better.
Thanks, and sorry for such a long post for such a simple question!
You will need to iterate over the list more than once; I don't think there's any way around this. After all, you first have to determine the number of different elements (first pass) before you can start changing elements (second pass). Note, however, that depending on the number of different elements, your code can take up to O(n^2) time because of the repeated calls to index and not in, each of which is O(n) on a list.
Alternatively, you could use a dict instead of a list for your value_map. A dictionary has much faster lookup than a list (O(1) on average), so this way the complexity should indeed be on the order of O(n). You can do this using (1) a dictionary comprehension to determine the mapping of old to new values, and (2) a list comprehension for creating the updated child.
value_map = {el: i for i, el in enumerate(set(child))}
child2 = [value_map[el] for el in child]
Or change the child in place using a for loop.
for i, el in enumerate(child):
    child[i] = value_map[el]
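One caveat worth noting: the iteration order of a set is not specified, so enumerate(set(child)) may hand out the new labels in any order. That is still valid here, since the labels are arbitrary. If a deterministic result is preferred, one option is to sort the distinct values first. The sketch below is a variation on the code above, not part of it, and the function name renumber_sorted is made up for illustration. Because it preserves the relative order of the labels, the question's example comes out as child_2'.

```python
def renumber_sorted(child):
    """Relabel each value with its rank among the sorted distinct values."""
    # Sorting k distinct values costs O(k log k); each dict lookup is O(1) on average.
    value_map = {el: i for i, el in enumerate(sorted(set(child)))}
    return [value_map[el] for el in child]


print(renumber_sorted([2, 0, 1, 3, 0, 1, 0, 0, 5, 2]))
# [2, 0, 1, 3, 0, 1, 0, 0, 4, 2]
```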
You can do it with a single loop like this:
value_map = []
result = []
for el in child:
    if el not in value_map:
        value_map.append(el)
    result.append(value_map.index(el))
One solution I can think of is the code below. It makes you go through the array twice, but it should be faster than your example (which makes you search through value_map for every element of the array).
child = [2, 0, 1, 3, 0, 1, 0, 0, 5, 2]
used = set(child)
N = len(used) - 1  # largest label allowed once the gaps are closed
unused = set(range(N + 1)) - used  # labels in 0..N that never appear
value_map = dict()
for i, e in enumerate(child):
    if e <= N:
        continue  # already a valid label
    if e not in value_map:
        value_map[e] = unused.pop()
    child[i] = value_map[e]
print(child)  # [2, 0, 1, 3, 0, 1, 0, 0, 4, 2]
I like @Selçuk Cihan's answer. It can also be done in place.
>>> child = [2, 0, 1, 3, 0, 1, 0, 0, 5, 2]
>>>
>>> value_map = []
>>> for i in range(len(child)):
...     el = child[i]
...     if el not in value_map:
...         value_map.append(el)
...     child[i] = value_map.index(el)
...
>>> child
[0, 1, 2, 3, 1, 2, 1, 1, 4, 0]
I believe that this works, although I didn't test it beyond the single case given in the question.
The only thing that bothers me is that value_map appears three times in the code...
def renumber(individual):
    """
    >>> renumber([2, 0, 1, 3, 0, 1, 0, 0, 4, 2])
    [0, 1, 2, 3, 1, 2, 1, 1, 4, 0]
    """
    value_map = {}
    return [value_map.setdefault(e, len(value_map)) for e in individual]
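The one-liner relies on Python evaluating the arguments of setdefault before the call itself, so len(value_map) is computed before any new key is inserted. For readers who find that subtle, here is a sketch of the same logic as an explicit loop, with a couple of extra cases beyond the one in the question. The name renumber_explicit is made up for illustration.

```python
def renumber_explicit(individual):
    """Label values 0, 1, 2, ... in order of first appearance."""
    value_map = {}
    result = []
    for e in individual:
        if e not in value_map:
            # The next free label equals the number of labels handed out so far.
            value_map[e] = len(value_map)
        result.append(value_map[e])
    return result


print(renumber_explicit([2, 0, 1, 3, 0, 1, 0, 0, 5, 2]))  # [0, 1, 2, 3, 1, 2, 1, 1, 4, 0]
print(renumber_explicit([7, 7, 7]))  # [0, 0, 0]
print(renumber_explicit([]))  # []
```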
Here is a fast solution, which iterates over the list only once.
a = [2, 0, 1, 3, 0, 1, 0, 0, 5, 2]
b = [-1] * len(a)  # b[old] = new label; assumes every value is in 0..len(a)-1
j = 0  # next unused label
for i in range(len(a)):
    if b[a[i]] == -1:
        b[a[i]] = j
        a[i] = j
        j += 1
    else:
        a[i] = b[a[i]]
print(a)  # [0, 1, 2, 3, 1, 2, 1, 1, 4, 0]