Get the first two items from the first sublist if the first element of the sublist is unique, in Python

Question

I have a list:

df = [['apple', 'red', '0.2'], ['apple', 'green', '8.9'], ['apple', 'brown', '2.9'], 
      ['guava', 'green', '1.9'], ['guava', 'yellow', '4.9'], ['guava', 'light green', '2.3']]

From this I want only the first two items of the first sublist for each distinct value of the first element.

Expected output:

df = [['apple', 'red'], ['guava', 'green']]

Code so far:

dummy_list = []

for item in df:
    if item[0] not in dummy_list:        
        dummy_list.append(item[:2])

This does not work: it appends all the elements. Any help would be appreciated.

Answer 1

A smarter option: use a dict and setdefault so that the mapping is added only for the first occurrence of each key.

result = {}
for value in df:
    result.setdefault(value[0], value[:2])
result = list(result.values())

print(result)
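
As a small illustration of why this works: dict.setdefault stores the value only when the key is absent, and dicts preserve insertion order (guaranteed since Python 3.7), so the first sublist seen for each key wins. The sample values below are made up for the demonstration.

```python
d = {}
first = d.setdefault('apple', ['apple', 'red'])     # key absent: the value is stored
second = d.setdefault('apple', ['apple', 'green'])  # key present: the stored value is returned unchanged
d.setdefault('guava', ['guava', 'green'])

print(second)            # ['apple', 'red']
print(list(d.values()))  # [['apple', 'red'], ['guava', 'green']]
```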

Or you could keep track of the keys already added (in a separate set) to avoid repeating them:

keys = set()
result = []
for value in df:
    if value[0] not in keys:
        result.append(value[:2])
        keys.add(value[0])

print(result) # [['apple', 'red'], ['guava', 'green']]
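
The set-based loop above could be wrapped in a small reusable function. This is only a sketch: the name first_per_key and the parameter n are hypothetical and not part of the original answer.

```python
def first_per_key(rows, n=2):
    """Return the first n items of the first row seen for each distinct row[0]."""
    seen = set()   # first elements that have already been emitted
    result = []
    for row in rows:
        key = row[0]
        if key not in seen:
            seen.add(key)
            result.append(row[:n])
    return result


df = [['apple', 'red', '0.2'], ['apple', 'green', '8.9'], ['apple', 'brown', '2.9'],
      ['guava', 'green', '1.9'], ['guava', 'yellow', '4.9'], ['guava', 'light green', '2.3']]

print(first_per_key(df))  # [['apple', 'red'], ['guava', 'green']]
```

Because membership is checked against a set, each lookup takes constant time on average, and the function also handles input in which equal keys are not adjacent.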

Answer 2

You can use itertools.groupby, with operator.itemgetter as the key:

from itertools import groupby
from operator import itemgetter

df = [['apple', 'red', '0.2'], ['apple', 'green', '8.9'], ['apple', 'brown', '2.9'], 
      ['guava', 'green', '1.9'], ['guava', 'yellow', '4.9'], ['guava', 'light green', '2.3']]

df1 = [next(g)[:2] for k, g in groupby(df, key=itemgetter(0))]

FYI, itemgetter(0) is equivalent to lambda x: x[0], so you could use that too.
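
One caveat worth sketching: groupby only groups consecutive items with the same key. It works on the list in the question because equal first elements are adjacent. If they were not, sorting by the same key first would be one option, at the cost of returning the groups in sorted key order rather than in order of first appearance. The rows list below is made-up sample data.

```python
from itertools import groupby
from operator import itemgetter

rows = [['apple', 'red'], ['guava', 'green'], ['apple', 'brown']]  # 'apple' rows are not adjacent

# groupby starts a new group whenever the key changes, so 'apple' shows up twice
unsorted_result = [next(g)[:2] for _, g in groupby(rows, key=itemgetter(0))]
print(unsorted_result)  # [['apple', 'red'], ['guava', 'green'], ['apple', 'brown']]

# sorted() is stable, so within each key the original relative order is kept
sorted_rows = sorted(rows, key=itemgetter(0))
sorted_result = [next(g)[:2] for _, g in groupby(sorted_rows, key=itemgetter(0))]
print(sorted_result)  # [['apple', 'red'], ['guava', 'green']]
```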

Answer 3

When you say unique, do you mean that once a value has been selected, you don't want to select it again?

If that is the case, then pop might be useful:

import random as r
df = [['apple', 'red', '0.2'], ['apple', 'green', '8.9'], ['apple', 'brown', '2.9'], 
      ['guava', 'green', '1.9'], ['guava', 'yellow', '4.9'], ['guava', 'light green', '2.3']]

total = len(df)

targetdf = []

for value in range(2):
    position = r.randint(0,total-1)
    targetdf.append(df.pop(position)[:2])
    total-=1

print(targetdf)
# Example output (varies between runs):
# [['apple', 'green'], ['guava', 'yellow']]

This code selects a random position in the original list and pops the element at that position. The popped value is then saved to a new list.
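
If mutating the original list is not desired, a similar random pick can be sketched with random.sample, which selects without replacement and leaves df untouched. This is an alternative to the pop loop above, not the answerer's method, and like that loop it does not guarantee that the picked sublists have distinct first elements.

```python
import random

df = [['apple', 'red', '0.2'], ['apple', 'green', '8.9'], ['apple', 'brown', '2.9'],
      ['guava', 'green', '1.9'], ['guava', 'yellow', '4.9'], ['guava', 'light green', '2.3']]

# Pick two different rows at random and keep the first two items of each
picked = [row[:2] for row in random.sample(df, 2)]

print(picked)   # output varies between runs
print(len(df))  # 6, the original list is unchanged
```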

Answer 4

You can use defaultdict to store all the sublists under their first element as key, and then pick only the first sublist from each list.

from collections import defaultdict

df = [
    ["apple", "red", "0.2"],
    ["apple", "green", "8.9"],
    ["apple", "brown", "2.9"],
    ["guava", "green", "1.9"],
    ["guava", "yellow", "4.9"],
    ["guava", "light green", "2.3"],
]
temp = defaultdict(list)
for sub_list in df:
    temp[sub_list[0]].append(sub_list)

df = [value[0][:2] for _, value in temp.items()]

print(df)

Output:

[['apple', 'red'], ['guava', 'green']]

4 answers

Answer 1: 5 (accepted), 2020-07-13 21:02:46

Answer 2: 2, 2020-07-13 21:10:43

Answer 3: 0, 2020-07-13 21:10:26

Answer 4: 0, 2020-07-13 21:13:18
