How can I remove duplicates from a list of dataclass objects which each have a list as a field?

I have this code:

from dataclasses import dataclass
from typing import List

@dataclass(eq=True, frozen=True)
class TestClass:
    field1: str
    field_list: List[str]

duplicate_list = [TestClass("foo", ["bar", "cat"]), TestClass("foo", ["bar", "cat"]), TestClass("foo", ["bar", "caz"])]

def remove_duplicates(duplicate_list: List[TestClass]) -> List[TestClass]:
    return list(set(duplicate_list))

# Called after the definition; this is the line that raises the TypeError described below.
unique_list = remove_duplicates(duplicate_list)

Now I want to remove the duplicates from the list. I tried converting the list to a set, as shown above. I also tried using

return list(dict.fromkeys(duplicate_list))

Neither approach works, because my class contains a list. Because of this, the __hash__ function generated by the dataclass module does not work. It raises the error: TypeError: unhashable type: 'list'
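To illustrate the failure, here is a minimal sketch that reproduces it with a single object. It assumes the same TestClass definition as above; the names item and error_message are only illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass(eq=True, frozen=True)
class TestClass:
    field1: str
    field_list: List[str]


item = TestClass("foo", ["bar", "cat"])

# eq=True together with frozen=True makes the dataclass generate __hash__,
# which hashes a tuple of the field values. A list cannot be hashed, so the
# call fails as soon as a set or dict asks for the object's hash.
try:
    hash(item)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Equality is unaffected: __eq__ compares the fields and works with lists.
same = item == TestClass("foo", ["bar", "cat"])
```

So comparing two objects works, but anything that relies on hashing (set, dict keys) does not.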

What is the correct way to remove the duplicate dataclass elements? Do I need to write a custom __hash__ function? Or is it possible to replace the list with some form of immutable list?

You can replace the list with a tuple (an immutable sequence in Python):

from dataclasses import dataclass
from typing import List, Tuple


@dataclass(eq=True, frozen=True)
class TestClass:
    field1: str
    field_list: Tuple[str, ...]  # any number of strings, not exactly two


duplicate_list = [TestClass("foo", ("bar", "cat")), TestClass("foo", ("bar", "cat")), TestClass("foo", ("bar", "caz"))]

Then your original remove_duplicates implementation will work correctly.

def remove_duplicates(duplicate_list: List[TestClass]) -> List[TestClass]:
    return list(set(duplicate_list))
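Two details of the tuple approach may matter in practice: set() does not keep the original order, and callers may still pass lists. The following is a sketch, not a definitive implementation, that addresses both. The function name remove_duplicates_ordered and the __post_init__ conversion are assumptions added for illustration; a static type checker may also flag passing a list to a field annotated as a tuple.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(eq=True, frozen=True)
class TestClass:
    field1: str
    field_list: Tuple[str, ...]

    def __post_init__(self) -> None:
        # Frozen dataclasses block normal attribute assignment, so
        # object.__setattr__ is used to store an immutable copy of
        # whatever sequence was passed in.
        object.__setattr__(self, "field_list", tuple(self.field_list))


def remove_duplicates_ordered(items: List[TestClass]) -> List[TestClass]:
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so the first occurrence of each object is kept in place.
    return list(dict.fromkeys(items))


duplicate_list = [
    TestClass("foo", ["bar", "cat"]),
    TestClass("foo", ["bar", "cat"]),
    TestClass("foo", ["bar", "caz"]),
]
unique_list = remove_duplicates_ordered(duplicate_list)
```

With the fields stored as tuples, the dict.fromkeys call from the question works unchanged.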

Alternatively, convert each object to its string representation before deduplicating. Note that the result is then a list of strings rather than TestClass objects:

duplicate_list = [str(TestClass("foo", ["bar", "cat"])), str(TestClass("foo", ["bar", "cat"])), str(TestClass("foo", ["bar", "cat"]))]
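If the class has to keep its list field and the original objects are needed in the result, one possible variation is to deduplicate on a hashable key built from the fields. This is a minimal sketch; remove_duplicates_by_key and the choice of key are assumptions for illustration, not part of the answers above.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class TestClass:
    field1: str
    field_list: List[str]  # stays a mutable list


def remove_duplicates_by_key(items: List[TestClass]) -> List[TestClass]:
    seen: Set[Tuple[str, Tuple[str, ...]]] = set()
    unique: List[TestClass] = []
    for item in items:
        # Build a hashable stand-in for the object: the list becomes a tuple.
        key = (item.field1, tuple(item.field_list))
        if key not in seen:
            seen.add(key)
            unique.append(item)  # keep the first occurrence, in order
    return unique


duplicate_list = [
    TestClass("foo", ["bar", "cat"]),
    TestClass("foo", ["bar", "cat"]),
    TestClass("foo", ["bar", "caz"]),
]
unique_list = remove_duplicates_by_key(duplicate_list)
```

The objects themselves are never hashed here, so the class does not need frozen=True or a custom __hash__.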
