How do I remove unhashable duplicates from a list in Python?

Question

My data looks like this:

[{u'webpath': u'/etc/html', u'server_port': u'80'}, {u'webpath': [u'/www/web'], u'server_port': u'80'}, {u'webpath': [u'/www/web'], u'server_port': u'80'}, {u'webpath': [u'/www/shanghu'], u'server_port': u'80'}, {u'webpath': [u'/www/shanghu'], u'server_port': u'80'}, {u'webpath': [u'/www/www/html/falv'], u'server_port': u'80'}, {u'webpath': [u'/www/www/html/falv'], u'server_port': u'80'}, {u'webpath': [u'/www/www/html/falv'], u'server_port': u'80'}, {u'webpath': [u'/www/falvhezi'], u'server_port': u'80'}, {u'webpath': [u'/www/test10'], u'server_port': u'80'}, {u'webpath': u'/etc/html', u'server_port': u'80'}, {u'webpath': u'/etc/html', u'server_port': u'80'}, {u'webpath': u'/etc/html', u'server_port': u'80'}, {u'webpath': u'/etc/html', u'server_port': u'80'}, {u'webpath': u'/etc/html', u'server_port': u'80'}, {u'webpath': u'/etc/html', u'server_port': u'80'}, {u'webpath': [u'/www/400.ask.com'], u'server_port': u'80'}, {u'webpath': [u'/www/www'], u'server_port': u'80'}, {u'webpath': [u'/www/www'], u'server_port': u'80'}, {u'webpath': [u'/www/www'], u'server_port': u'80'}, {u'webpath': [u'/www/zhuanti'], u'server_port': u'80'}, {u'webpath': [u'/www/zhuanti'], u'server_port': u'80'}, {u'webpath': [u'/www/shanghu'], u'server_port': u'80'}]

My code looks like this:

    seen = set()
    new_webpath_list = []
    for webpath in nginxConfs:
        t = tuple(webpath.items())
        if t not in seen:
            seen.add(t)
            new_webpath_list.append(webpath)

But the script returns:

TypeError: "unhashable type: 'list'"

Answer 1

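You are creating tuples from the dictionaries to make them hashable, but those tuples can still contain unhashable lists! You also have to "tuplefy" the values.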

t = tuple(((k, tuple(v)) for (k, v) in webpath.items()))

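Note that this is a bit glitchy, because the first dict in your data holds a plain string under webpath, while the others hold lists of strings. Calling tuple() on a string splits it into single characters. The result is still hashable, so the deduplication works, but you can avoid the split with an if/else if you prefer: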

t = tuple(((k, tuple(v) if isinstance(v, list) else v) for (k, v) in webpath.items()))
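Dropped into the loop from the question, the conversion might look like the sketch below. The helper names `make_key` and `dedupe` and the shortened sample data are made up for illustration. Sorting the items is an extra step that is not part of the answer above; it is added on the assumption that two equal dicts may list their keys in a different order.

```python
def make_key(conf):
    """Build a hashable key from a flat dict whose values are strings or lists."""
    # Sorting by key makes the result independent of key order (added assumption).
    return tuple(sorted(
        (k, tuple(v) if isinstance(v, list) else v)
        for k, v in conf.items()
    ))


def dedupe(confs):
    """Return the dicts from confs in their original order, without duplicates."""
    seen = set()
    result = []
    for conf in confs:
        key = make_key(conf)
        if key not in seen:
            seen.add(key)
            result.append(conf)
    return result


# Shortened, hypothetical sample in the shape of the question's data.
nginx_confs = [
    {"webpath": "/etc/html", "server_port": "80"},
    {"webpath": ["/www/web"], "server_port": "80"},
    {"webpath": ["/www/web"], "server_port": "80"},
    {"webpath": "/etc/html", "server_port": "80"},
]
unique = dedupe(nginx_confs)
print(unique)
```

This only handles one level of lists inside the dict; deeper nesting would need a recursive conversion.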

Alternatively, you could just remember the string representation of each dictionary...

t = repr(webpath)
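One caveat with the string representation: on Python 3.7 and later, repr() follows the insertion order of the keys, so two equal dicts built in a different order produce different strings. The sketch below swaps in `json.dumps` with `sort_keys=True` as one possible way around that. This is a substitution, not what the answer above proposes, and it assumes string keys and JSON-serializable values. The function name `dedupe_by_json` is made up for illustration.

```python
import json


def dedupe_by_json(items):
    """Deduplicate dicts using a key-order-independent JSON string as the key."""
    seen = set()
    result = []
    for item in items:
        key = json.dumps(item, sort_keys=True)
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


a = {"webpath": ["/www/web"], "server_port": "80"}
b = {"server_port": "80", "webpath": ["/www/web"]}  # same content, other key order

print(a == b)              # the dicts compare equal
print(repr(a) == repr(b))  # but their repr strings differ on Python 3.7+
unique = dedupe_by_json([a, b])
print(unique)
```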

Answer 2

The most straightforward way is to test membership directly against the new list you are building.

new_webpath_list = []
for webpath in nginxConfs:
    if webpath not in new_webpath_list:
        new_webpath_list.append(webpath)

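This handles arbitrary (not known in advance) levels of nesting of unhashable types. It also keeps the code simpler and easier to understand, and it avoids creating extra data you don't need (no seen set, no conversion of elements to tuples). Keep in mind that each `in` test on a list is a linear scan, so the total work grows quadratically with the length of the list; for short lists like this one that is unlikely to matter.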
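To illustrate the nesting claim, the sketch below runs the membership test on made-up nested data and compares it with a set-based variant for larger inputs. The `freeze` helper is hypothetical and assumes the data contains only dicts, lists, and hashable scalars.

```python
def freeze(obj):
    """Recursively turn dicts and lists into hashable equivalents.

    Assumes the data holds only dicts, lists, and hashable scalars.
    """
    if isinstance(obj, dict):
        # A frozenset of pairs ignores key order, just like dict equality.
        return frozenset((k, freeze(v)) for k, v in obj.items())
    if isinstance(obj, list):
        return tuple(freeze(v) for v in obj)
    return obj


# Hypothetical nested sample; the first two entries are equal.
nested = [
    {"server": {"ports": [80, 443], "paths": [["/www", "/web"]]}},
    {"server": {"paths": [["/www", "/web"]], "ports": [80, 443]}},
    {"server": {"ports": [80], "paths": []}},
]

# Plain list membership: relies only on ==, so nesting needs no special care.
by_membership = []
for item in nested:
    if item not in by_membership:
        by_membership.append(item)

# Set-based variant: one hash lookup per item instead of a scan of the list.
seen = set()
by_freeze = []
for item in nested:
    key = freeze(item)
    if key not in seen:
        seen.add(key)
        by_freeze.append(item)

print(by_membership == by_freeze)
```

The membership version is the simpler choice when the list is short. The set-based version trades that simplicity for faster lookups, at the cost of building a frozen copy of every item.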

Answer 3

A late answer, but I was able to remove duplicate dicts from a list using:

old_list = [{"x": 1}, {"x": 1}, {"x": 2}]
new_list = []
# The comprehension is used only for its side effect of appending to new_list.
[new_list.append(x) for x in old_list if x not in new_list]
print(new_list)  # [{'x': 1}, {'x': 2}]

Answer 1: 2, accepted, 2015-07-23 13:55:56
Answer 2: 0, 2015-07-23 14:03:14
Answer 3: 0, 2020-06-05 01:44:43