
Why do I get 'unhashable type: dict' error when recursively cleaning json object?

I am trying to clean a json object by removing keys if their value is 'N/A', '-', or '' and likewise removing any of these values from any lists. Example of object to be cleaned:

dirty = {
    'name': {'first': 'Robert', 'middle': '', 'last': 'Smith'},
    'age': 25,
    'DOB': '-',
    'hobbies': ['running', 'coding', '-'],
    'education': {'highschool': 'N/A', 'college': 'Yale'}
}

I found a similar problem and modified the solution, giving this function:

def clean_data(value):
    """
    Recursively remove all values of 'N/A', '-', and '' 
    from dictionaries and lists, and return
    the result as a new dictionary or list.
    """
    missing_indicators = set(['N/A', '-', ''])
    if isinstance(value, list):
        return [clean_data(x) for x in value if x not in missing_indicators]
    elif isinstance(value, dict):
        return {
            key: clean_data(val)
            for key, val in value.items()
            if val not in missing_indicators
        }
    else:
        return value

But I get the unhashable type: dict error from the dictionary comprehension:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-79-d42b5f1acaff> in <module>
----> 1 clean_data(dirty)

<ipython-input-72-dde33dbf1804> in clean_data(value)
     11         return {
     12             key: clean_data(val)
---> 13             for key, val in value.items()
     14             if val not in missing_indicators
     15         }

<ipython-input-72-dde33dbf1804> in <dictcomp>(.0)
     12             key: clean_data(val)
     13             for key, val in value.items()
---> 14             if val not in missing_indicators
     15         }
     16     else:

TypeError: unhashable type: 'dict'

Obviously something about the way I do the set comparison doesn't work the way I think it should when val is a dict. Can anyone enlighten me?

The problem is this line:

if val not in missing_indicators

When you use in on a set, Python checks whether the value you're asking about is one of the set's members. To be a key in a dict or a member of a set, a value must be hashable, and looking a value up in a set requires hashing it as well. You can check whether a value is hashable by calling hash on it:

>>> hash(1)
1
>>> hash("hello")
7917781502247088526
>>> hash({"1":"2"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'

In your snippet, val is sometimes a dict (for example, the value stored under 'name'), and you are asking Python whether that val is one of the values in the set. To answer, Python tries to hash val, and that fails.
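To see the failure in isolation, here is a small sketch. The helper name try_membership is made up for illustration; it wraps the same membership test used in the question and reports whether it succeeded or raised.

```python
# The same set of indicators as in the question.
missing_indicators = {'N/A', '-', ''}

def try_membership(value):
    """Hypothetical helper: run `value in missing_indicators` and
    report the result, or the string 'TypeError' if it cannot be hashed."""
    try:
        return value in missing_indicators
    except TypeError:
        return 'TypeError'

print(try_membership('-'))                  # str is hashable and is a member
print(try_membership(25))                   # int is hashable but not a member
print(try_membership({'first': 'Robert'}))  # dict cannot be hashed
print(try_membership(['running']))          # list cannot be hashed
```

Strings and ints go through fine; only the container values trip the hash step.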

The hurdle you have to overcome is that some of the values in your outer dict are themselves dicts, whereas others are lists, strs or ints. You need a different strategy in each case: check what type val is and then act accordingly. The same applies in the list branch, where x may also be a dict or a list.
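One possible way to apply that advice is sketched below. It assumes that only strings can ever count as "missing", so the type is checked before the set lookup; the helper is_missing and the constant MISSING_INDICATORS are names introduced here, not part of the original function.

```python
# Assumption: only str values can be missing indicators.
MISSING_INDICATORS = frozenset({'N/A', '-', ''})

def is_missing(value):
    """Return True only for strings that are missing indicators.
    Checking the type first means dicts and lists are never hashed."""
    return isinstance(value, str) and value in MISSING_INDICATORS

def clean_data(value):
    """Recursively drop missing indicators from dicts and lists,
    returning new containers and leaving the input untouched."""
    if isinstance(value, list):
        return [clean_data(x) for x in value if not is_missing(x)]
    if isinstance(value, dict):
        return {
            key: clean_data(val)
            for key, val in value.items()
            if not is_missing(val)
        }
    return value

dirty = {
    'name': {'first': 'Robert', 'middle': '', 'last': 'Smith'},
    'age': 25,
    'DOB': '-',
    'hobbies': ['running', 'coding', '-'],
    'education': {'highschool': 'N/A', 'college': 'Yale'}
}

cleaned = clean_data(dirty)
print(cleaned)
```

Note that a nested dict or list that becomes empty after cleaning is kept as an empty container; whether to drop those as well depends on what you want the output to look like.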
