Python: replace the elements in a list with unique numbers

I'm stuck on a problem that I haven't found a solution for yet.

I have a list that looks like this and actually has 60000 elements:

lst = [[0, [-24.75, -24.75, -25.0], 0.00001, 2],
[10, [-26, -26, -26], 0.00011, 4], 
[0, [-3, -200000, -25.0], 0.000009, 42], 
[0, [-4, -4.7, -5], 0.00801, 7], 
[1, [-3, -200000, -25.0], 0.00089, 8], 
[2, [-3, -200000, -25.0], 0.000899, 18]]

The elements [lst[i][1] for i in range(len(lst))] are meant to be Cartesian coordinates, and some of them occur more than once.

I would like to give every coordinate a unique number, so that the list becomes:

lst = [[0, 0, 0.00001, 2],
[10, 1, 0.00011, 4], 
[0, 2, 0.000009, 42], 
[0, 3, 0.00801, 7], 
[1, 2, 0.00089, 8], 
[2, 2, 0.000899, 18]]

This means that once a coordinate such as [-3, -200000, -25.0] has been replaced with a number (here 2), all duplicates of that coordinate must be replaced with the same number.

It is also fine if the elements of the original list lst get rearranged; the main point is that I need to replace all of the coordinate triples with numbers.

Thank you in advance.

We can first construct a defaultdict:

from collections import defaultdict
from itertools import count

dispatcher = defaultdict(lambda c=count(0): next(c))

Now we can replace each coordinate with a lookup. Since lists are not hashable, we cannot use them as dictionary keys directly, but we can convert them to tuples first.

So now we can use:

for lsti in lst:
    lsti[1] = dispatcher[tuple(lsti[1])]

For the given input this generates:

>>> lst
[[0, 0, 1e-05, 2], [10, 1, 0.00011, 4], [0, 2, 9e-06, 42], [0, 3, 0.00801, 7], [1, 2, 0.00089, 8], [2, 2, 0.000899, 18]]
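To see how the defaultdict hands out numbers and why the tuple conversion is needed, here is a small sketch. The sample coordinates come from the question; the names `ids`, `first`, `second` and `again` are only illustrative.

```python
from collections import defaultdict
from itertools import count

# A missing key triggers the factory, which pulls the next integer
# from one shared counter; existing keys keep their stored number.
ids = defaultdict(lambda c=count(0): next(c))

first = ids[(-3, -200000, -25.0)]   # new key, gets the first number
second = ids[(-26, -26, -26)]       # another new key, gets the next one
again = ids[(-3, -200000, -25.0)]   # repeated key, reuses its number

# A list cannot be used as a key: hashing fails before the factory runs,
# so the counter is not advanced by the failed lookup.
try:
    ids[[-3, -200000, -25.0]]
    list_key_worked = True
except TypeError:
    list_key_worked = False
```

Because the counter lives in the lambda's default argument, every lookup on the same `ids` object shares it, so a fresh defaultdict is needed to restart the numbering from 0.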

Make a dictionary of the coordinates you've already seen. For each item in the list, if its coordinates are in that dictionary, use the saved value. Otherwise, insert that key into the dictionary and update the running count:

coordmap = {}
count = 0
for item in lst:
    coords = tuple(item[1])
    try:
        new_value = coordmap[coords]
    except KeyError:
        new_value = coordmap[coords] = count
        count += 1
    item[1] = new_value
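The same bookkeeping can be written more compactly with `dict.setdefault`, using the current size of the dictionary as the next free number. This is an alternative sketch rather than the code above: `number_coordinates` is a hypothetical helper name, and unlike the loop above it returns a new list instead of modifying `lst` in place.

```python
def number_coordinates(rows):
    """Return a copy of rows with each coordinate list replaced by an integer."""
    coordmap = {}
    result = []
    for first, coords, *rest in rows:
        # len(coordmap) is evaluated before the call, so a new key receives
        # the next unused number; an existing key returns its stored number.
        idx = coordmap.setdefault(tuple(coords), len(coordmap))
        result.append([first, idx, *rest])
    return result


lst = [[0, [-24.75, -24.75, -25.0], 0.00001, 2],
       [10, [-26, -26, -26], 0.00011, 4],
       [0, [-3, -200000, -25.0], 0.000009, 42],
       [0, [-4, -4.7, -5], 0.00801, 7],
       [1, [-3, -200000, -25.0], 0.00089, 8],
       [2, [-3, -200000, -25.0], 0.000899, 18]]

numbered = number_coordinates(lst)
```

Returning a new list keeps the original coordinates available, which may be useful if the numbers later need to be mapped back to coordinates.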

Tuples are hashable. You can use this to maintain a dict of already visited coordinates quite easily.

seen = {}  # maps coordinate tuple -> assigned number
i = 0      # next unused number
for row in lst:
    coord = tuple(row[1])  # lists are unhashable, tuples are not
    if coord not in seen:
        seen[coord] = i
        i += 1

    row[1] = seen[coord]

Contents:

[[0, 0, 1e-05, 2],
 [10, 1, 0.00011, 4],
 [0, 2, 9e-06, 42],
 [0, 3, 0.00801, 7],
 [1, 2, 0.00089, 8],
 [2, 2, 0.000899, 18]]


 