Is there a quick way to label a list in Python?

Question

I have a list of 200k elements. Those elements are 7 different labels (it is actually a list of fruit). I need to assign a number to each fruit.

Is there a quick way to do this?

This is what I have written so far, and it is taking ages.

dic,i = {},0.0
for idx,el in enumerate(listFruit):
    if el not in dic:
        dic[el] = i
        i+=1.0
    listFruit[idx] = dic[el]

Answer 1

Use a collections.defaultdict() object with an itertools.count() object rigged up to produce the next value as the factory; this avoids having to test for each key yourself as well as having to increment manually.

Then use a list comprehension to put those numbers into the list:

from collections import defaultdict
from functools import partial
from itertools import count

unique_count = defaultdict(partial(next, count(1)))
listFruit[:] = [unique_count[el] for el in listFruit]

The functools.partial() callable creates a wrapper around the next() function, to ensure the code works in either Python 2 or Python 3.

I used an integer count here, starting at 1. You can replace count(1) with count(1.0) if you insist on having floating point values; you'll get 1.0, 2.0, 3.0, etc. instead.
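To see what the factory does on its own, here is a small sketch written for Python 3. The four-element list `sample` and the name `next_label` are made up for illustration and are not from the question. Each call to the partial object pulls the next number from the counter, and defaultdict only calls it for keys it has not seen yet.

```python
from collections import defaultdict
from functools import partial
from itertools import count

# The factory: every call returns the next integer from the counter.
next_label = partial(next, count(1))
first, second = next_label(), next_label()  # 1, then 2

# A fresh counter for the mapping; sample is illustrative data only.
sample = ['pear', 'apple', 'pear', 'kiwi']
unique_count = defaultdict(partial(next, count(1)))
labels = [unique_count[el] for el in sample]

# Same idea with a float counter, as described above.
float_count = defaultdict(partial(next, count(1.0)))
float_labels = [float_count[el] for el in sample]

print(labels)        # [1, 2, 1, 3]
print(float_labels)  # [1.0, 2.0, 1.0, 3.0]
```

Numbers are handed out in order of first appearance, so a repeated element such as 'pear' keeps the label it got the first time.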

Demo:

>>> from collections import defaultdict
>>> from functools import partial
>>> from itertools import count
>>> from random import choice
>>> fruits = ['apple', 'banana', 'pear', 'cherry', 'melon', 'kiwi', 'pineapple']
>>> listFruit = [choice(fruits) for _ in range(100)]
>>> unique_count = defaultdict(partial(next, count(1)))
>>> [unique_count[el] for el in listFruit]
[1, 2, 3, 2, 4, 5, 6, 7, 1, 2, 4, 6, 3, 7, 3, 4, 5, 2, 5, 7, 3, 5, 1, 3, 3, 5, 2, 2, 6, 4, 6, 2, 1, 1, 3, 6, 6, 4, 7, 2, 6, 4, 5, 2, 1, 7, 7, 7, 4, 3, 7, 3, 1, 1, 5, 3, 3, 6, 5, 6, 1, 4, 3, 7, 2, 7, 7, 4, 7, 1, 4, 3, 7, 3, 4, 5, 1, 5, 5, 1, 5, 6, 3, 4, 3, 1, 1, 1, 5, 7, 2, 2, 6, 3, 6, 1, 1, 6, 5, 4]
>>> unique_count
defaultdict(<functools.partial object at 0x1026c5788>, {'kiwi': 4, 'apple': 1, 'cherry': 5, 'pear': 2, 'pineapple': 6, 'melon': 7, 'banana': 3})

Answer 2

fruit_list = ['apple', 'banana', 'strawberry', 'watermelon','apple','watermelon']

# Collect each distinct fruit once, then number them by position.
unique_fruits = list(set(fruit_list))
fruit_dict = {fruit: i for i, fruit in enumerate(unique_fruits)}
# Pair every original element with its number.
result = [(x, fruit_dict[x]) for x in fruit_list]

Something like that?

Result: [('apple', 2), ('banana', 3), ('strawberry', 0), ('watermelon', 1), ('apple', 2), ('watermelon', 1)]

Or, to get only the numbers: result = [fruit_dict[x] for x in fruit_list]

Result: [2, 3, 0, 1, 2, 1]
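Because a set has no defined order, the numbers in the results above can differ between runs or Python versions. If stable labels matter, one option is to sort the unique values before numbering them. This is a sketch of that variation, not part of the original answer; it reuses the fruit_list from the example.

```python
fruit_list = ['apple', 'banana', 'strawberry', 'watermelon', 'apple', 'watermelon']

# Sort the unique fruits so each one always gets the same number.
fruit_dict = {fruit: i for i, fruit in enumerate(sorted(set(fruit_list)))}
result = [fruit_dict[x] for x in fruit_list]

print(result)  # [0, 1, 2, 3, 0, 3]
```

Sorting costs a little extra, but with only a handful of distinct labels that cost is negligible next to the pass over the full list.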


Answer 1
5, accepted, 2015-09-21 14:02:45

Answer 2
0, 2015-09-21 14:26:11
