How to remove unhashable duplicates from a list in Python?

My data is this:

[{u'webpath': u'/etc/html', u'server_port': u'80'}, {u'webpath': [u'/www/web'], u'server_port': u'80'}, {u'webpath': [u'/www/web'], u'server_port': u'80'}, {u'webpath': [u'/www/shanghu'], u'server_port': u'80'}, {u'webpath': [u'/www/shanghu'], u'server_port': u'80'}, {u'webpath': [u'/www/www/html/falv'], u'server_port': u'80'}, {u'webpath': [u'/www/www/html/falv'], u'server_port': u'80'}, {u'webpath': [u'/www/www/html/falv'], u'server_port': u'80'}, {u'webpath': [u'/www/falvhezi'], u'server_port': u'80'}, {u'webpath': [u'/www/test10'], u'server_port': u'80'}, {u'webpath': u'/etc/html', u'server_port': u'80'}, {u'webpath': u'/etc/html', u'server_port': u'80'}, {u'webpath': u'/etc/html', u'server_port': u'80'}, {u'webpath': u'/etc/html', u'server_port': u'80'}, {u'webpath': u'/etc/html', u'server_port': u'80'}, {u'webpath': u'/etc/html', u'server_port': u'80'}, {u'webpath': [u'/www/400.ask.com'], u'server_port': u'80'}, {u'webpath': [u'/www/www'], u'server_port': u'80'}, {u'webpath': [u'/www/www'], u'server_port': u'80'}, {u'webpath': [u'/www/www'], u'server_port': u'80'}, {u'webpath': [u'/www/zhuanti'], u'server_port': u'80'}, {u'webpath': [u'/www/zhuanti'], u'server_port': u'80'}, {u'webpath': [u'/www/shanghu'], u'server_port': u'80'}]

My code is this:

    seen = set()
    new_webpath_list = []
    for webpath in nginxConfs:
        t = tuple(webpath.items())
        if t not in seen:
            seen.add(t)
            new_webpath_list.append(webpath)

But the script returns:

TypeError: "unhashable type: 'list'"

You are creating tuples from the dictionaries to make them hashable, but those tuples can still contain unhashable lists as values! So you also have to "tuplify" the values.

    t = tuple((k, tuple(v)) for (k, v) in webpath.items())

Note that this is a bit glitchy: in some of the dicts webpath is a plain string rather than a list of strings, and tuple(v) would split such a string into a tuple of characters. You can guard against that with a conditional, although it is not strictly necessary for the deduplication to work.

    t = tuple((k, tuple(v) if isinstance(v, list) else v) for (k, v) in webpath.items())
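For completeness, here is how that key might slot into your original loop. This is a minimal sketch, assuming nginxConfs is the list of dicts shown in the question:

    seen = set()
    new_webpath_list = []
    for webpath in nginxConfs:
        # Turn list values into tuples so the whole key is hashable.
        t = tuple((k, tuple(v) if isinstance(v, list) else v) for (k, v) in webpath.items())
        if t not in seen:
            seen.add(t)
            new_webpath_list.append(webpath)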

Alternatively, you could also just remember the string representations of the dictionaries...

    t = repr(webpath)
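Used as the key in the same kind of seen-based loop, that could look like the sketch below. Note that this relies on the dicts listing their keys in the same order, which holds for the data shown, where every dict has the same two keys:

    seen = set()
    new_webpath_list = []
    for webpath in nginxConfs:
        t = repr(webpath)  # string form of the dict, used only as a dedup key
        if t not in seen:
            seen.add(t)
            new_webpath_list.append(webpath)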

The most straightforward way to do this is to just test membership directly using the new list you are building.

    new_webpath_list = []
    for webpath in nginxConfs:
        if webpath not in new_webpath_list:
            new_webpath_list.append(webpath)

This handles the case where there is an arbitrary (unknown beforehand) level of nesting of unhashable types. It also keeps the code simpler and easier to understand, because you are not creating extra data you don't need (no seen set, no conversion of elements to tuples). The trade-off is that each membership test scans the list built so far, so for very large inputs the set-based approaches above will be faster.
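If the input ever grows large enough for those repeated list scans to matter, the two ideas can be combined: a small recursive helper makes arbitrarily nested values hashable, so the O(1) set lookups still work. This is only a sketch; the helper name freeze is illustrative, and it assumes the values are built from dicts, lists, tuples and scalars as in the data above:

    def freeze(value):
        # Recursively convert unhashable containers into hashable equivalents.
        if isinstance(value, dict):
            return tuple(sorted((k, freeze(v)) for k, v in value.items()))
        if isinstance(value, (list, tuple)):
            return tuple(freeze(v) for v in value)
        return value

    seen = set()
    new_webpath_list = []
    for webpath in nginxConfs:
        key = freeze(webpath)
        if key not in seen:
            seen.add(key)
            new_webpath_list.append(webpath)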

Late answer, but I was able to remove duplicate dicts from a list using:

old_list = [{"x": 1}, {"x": 1}, {"x": 2}]
new_list = []
[new_list.append(x) for x in old_list if x not in new_list]
# [{'x': 1}, {'x': 2}]

