
Remove Keys in Dictionary and Store Values in a Dictionary of Lists

I have a dictionary whose "items" key holds a list of dictionaries. I'm trying to restructure it so that "items" holds a list of lists instead, where each inner list contains the values of the corresponding original dictionary.

Original:

data = { 
   "items": [ 
           { "A": 0.00, "B": 33.27, "C": "string", "D": "16122 " }, 
           { "A": 0.00, "B": 5176.66, "C": "string", "D": "21216 " } 
            ] 
       }

What I want to get:

data = { 
    "items": [ 
           [ 0.00, 33.27, "string", "16122 " ], 
           [ 0.00, 5176.66, "string", "21216 " ] 
             ] 
         }

It seems like operator.itemgetter is almost what you want:

import operator
getter = operator.itemgetter('A', 'B', 'C', 'D')
data = {'items': [getter(dct) for dct in data['items']]}

In this case you end up with a list of tuples, not a list of lists, but in many applications that's probably OK.
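If a list of lists really is required, each tuple can be passed through list(). A minimal sketch, reusing the sample data from the question (the name result is only for illustration):

```python
import operator

data = {
    "items": [
        {"A": 0.00, "B": 33.27, "C": "string", "D": "16122 "},
        {"A": 0.00, "B": 5176.66, "C": "string", "D": "21216 "},
    ]
}

getter = operator.itemgetter("A", "B", "C", "D")
# list() converts each tuple returned by the getter into a list
result = {"items": [list(getter(dct)) for dct in data["items"]]}
print(result["items"][0])
```

One caveat: when itemgetter is given a single key it returns the bare value rather than a one-element tuple, so this pattern only fits the case of two or more keys.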

Demo:

>>> data = { 
...    "items": [ 
...            { "A": 0.00, "B": 2184.83, "C": "string", "D": "16122 " }, 
...            { "A": 0.00, "B": 5176.66, "C": "string", "D": "21216 " } 
...             ] 
...        }
>>> import operator
>>> getter = operator.itemgetter('A', 'B', 'C', 'D')
>>> data = {'items': [getter(dct) for dct in data['items']]}
>>> data['items'][0]
(0.0, 2184.83, 'string', '16122 ')
>>> data['items'][1]
(0.0, 5176.66, 'string', '21216 ')

Here is one way to get exactly the structure you asked for.

# Get the column names from the first record
colNames = data['items'][0].keys()
# Collect the values for those keys from every record
# (raises KeyError if a record lacks one of them)
newData = {'items': [[record[colName] for colName in colNames]
                     for record in data['items']]}
print(newData)

Output (from Python 2, where dict key order is arbitrary, so the columns come out as A, C, B, D):

{'items': [[0.0, 'string', 33.27, '16122 '], [0.0, 'string', 5176.66, '21216 ']]}

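Keep in mind that dicts are unordered before Python 3.7 (from 3.7 on they preserve insertion order, which can still differ from record to record). Therefore, you need to specify the order of the keys explicitly to get a consistent order of values when mapping to a list. On older versions the order of the keys will not necessarily be the order in which they were declared, the order they had the last time you looked, and so on.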

So a more realistic set of example data is:

data = { 
   "items": [ 
           { "D": "16122 ", "A": 0.00, "B": 33.27, "C": "string" }, 
           { "B": 5176.66, "A": 0.00,  "D": "21216 ", "C": "string" } 
            ] 
       }

To map unordered keys into an ordered list, you need to pick which order to use. Suppose you settle on ASCIIbetical order:

ordered_keys=("A", "B", "C", "D")    

Then you can convert to your structure with a simple loop:

for k, LoD in data.items():      # consider .iteritems() on Python 2 for larger dicts
    data[k]=[[di[sk] for sk in ordered_keys] for di in LoD]

>>> data
{'items': [[0.0, 33.27, 'string', '16122 '], [0.0, 5176.66, 'string', '21216 ']]}
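The loop above rewrites data in place. If the original structure should stay untouched, a dict comprehension can build a new dictionary instead. A minimal sketch, assuming every record contains all of the keys in ordered_keys (the name converted is introduced here only for illustration):

```python
ordered_keys = ("A", "B", "C", "D")

data = {
    "items": [
        {"D": "16122 ", "A": 0.00, "B": 33.27, "C": "string"},
        {"B": 5176.66, "A": 0.00, "D": "21216 ", "C": "string"},
    ]
}

# Build a new dict; each record becomes a list of values in ordered_keys order
converted = {
    name: [[record[key] for key in ordered_keys] for record in records]
    for name, records in data.items()
}
print(converted)
```

Because nothing is assigned back into data, the original list of dicts remains available afterwards.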

Now you need to decide what to do with keys that may be missing in the list of dicts. Unless each dict has exactly the same keys, you need a default value.

Here is a way you could do that:

data = { 
   "items": [ 
           { "D": "16122 ", "A": 0.00, "B": 33.27, "C": "string" }, 
           { "B": 5176.66, "A": 0.00,  "D": "21216 ", "C": "string" }, 
           {  "E": "New Key ", "C": "'A' and 'B' are missing in this dict" } 
            ] 
       }

for k, LoD in data.items():     
    keys=sorted({e for sk in LoD for e in sk})
    data[k]=[keys]+[[di.get(sk, None) for sk in keys] for di in LoD]

In this case, all the keys found in the list of dicts are gathered and sorted, then placed as the first element of the list of lists. That header row tells you which column is which, which matters because other keys of data may hold dicts with a different set of keys:

data = { 
   "items": [ 
           { "D": "16122 ", "A": 0.00, "B": 33.27, "C": "string" }, 
           { "B": 5176.66, "A": 0.00,  "D": "21216 ", "C": "string" }, 
           {  "E": "New Key ", "C": "'A' and 'B' are missing in this dict" } 
            ],
    "More": [
           { "D": "16122 ", "A": 0.00, "B": 33.27, "C": "string" }
            ]             
       }

for k, LoD in data.items():     
    keys=sorted({e for sk in LoD for e in sk})
    data[k]=[keys]+[[di.get(sk, None) for sk in keys] for di in LoD]

Result:

>>> for k in data:
...     print(k+':'+'\n\t'+'\n\t'.join(repr(e) for e in data[k]))
items:
    ['A', 'B', 'C', 'D', 'E']
    [0.0, 33.27, 'string', '16122 ', None]
    [0.0, 5176.66, 'string', '21216 ', None]
    [None, None, "'A' and 'B' are missing in this dict", None, 'New Key ']
More:
    ['A', 'B', 'C', 'D']
    [0.0, 33.27, 'string', '16122 ']
