Converting a tuple of dicts into a tuple of tuples of dicts in Python

I have query result-set data in the form of a tuple of dicts. I want to group the data into a tuple of tuples of dicts based on a specific condition.

Actual output:

({'col1': 2014},
 {'col1': 2013},
 {'col1': 2014},
 {'col1': 2013},
 {'col1': 2015},
 {'col2': '24'})

Expected output (here we are grouping by year):

(({'col1': 2014}, {'col1': 2014}),
 ({'col1': 2013}, {'col1': 2013}),
 ({'col1': 2015}, {'col2': '24'}))

Please advise how to get the data in this shape while the query is being processed, instead of going through the records one by one and converting them into the target format.

You can sort the dicts by year and then use groupby with the year as the key:

>>> from itertools import groupby
>>> t = ({'col1':2014},{'col1':2013},{'col1':2014},{'col1':2013},{'col1':2015})
>>> key = lambda x: x['col1']
>>> tuple(tuple(g) for k, g in groupby(sorted(t, key=key), key))
(({'col1': 2013}, {'col1': 2013}), ({'col1': 2014}, {'col1': 2014}), ({'col1': 2015},))

groupby groups consecutive elements that share the same key and yields (key, iterable) pairs. Each iterable is then converted to a tuple inside the generator expression, which is passed as the argument to tuple.
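Because groupby only merges adjacent items, the sort step is what makes the grouping complete. A minimal sketch of the difference, using the same sample data (the names `rows`, `year`, `fragmented` and `grouped` are only illustrative):

```python
from itertools import groupby

rows = ({'col1': 2014}, {'col1': 2013}, {'col1': 2014}, {'col1': 2013}, {'col1': 2015})

def year(row):
    """Key function: the value to group on."""
    return row['col1']

# Without sorting, equal years that are not adjacent end up in separate groups.
fragmented = tuple(tuple(g) for _, g in groupby(rows, key=year))

# Sorting first makes equal years adjacent, so each year yields exactly one group.
grouped = tuple(tuple(g) for _, g in groupby(sorted(rows, key=year), key=year))

print(len(fragmented))  # 5
print(len(grouped))     # 3
```

The key function must be the same for sorted and groupby, otherwise the adjacency that groupby relies on is not guaranteed.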

Update: The one-liner above has O(n log n) time complexity because it sorts the data. With a couple more lines the task can be completed in O(n) time by using defaultdict:

>>> from collections import defaultdict
>>> t = ({'col1':2014},{'col1':2013},{'col1':2014},{'col1':2013},{'col1':2015})
>>> dd = defaultdict(list)
>>> for d in t:
...     dd[d['col1']].append(d)
...
>>> tuple(tuple(v) for k, v in dd.items())
(({'col1': 2013}, {'col1': 2013}), ({'col1': 2014}, {'col1': 2014}), ({'col1': 2015},))

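Note that on Python versions before 3.7 this returns the groups in arbitrary order, since dict was an unordered collection there. From Python 3.7 on, dicts preserve insertion order, so the groups come out in the order in which each year is first seen (2014, 2013, 2015 for this input). If you need to process the data in "full" groups (only one group per year) and you can't get the DB to return the data in sorted order, this is the best you can do.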
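If the groups should come out in a predictable order regardless of Python version, one option is to sort only the distinct keys after bucketing. This is a sketch, not part of the original answer; the function name `group_rows_by` is made up for illustration:

```python
from collections import defaultdict

def group_rows_by(rows, column):
    """Bucket rows by row[column], then emit the buckets in sorted key order."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[column]].append(row)
    # Only the k distinct keys are sorted (O(k log k)), not all n rows.
    return tuple(tuple(buckets[key]) for key in sorted(buckets))

rows = ({'col1': 2014}, {'col1': 2013}, {'col1': 2014}, {'col1': 2013}, {'col1': 2015})
result = group_rows_by(rows, 'col1')
print(result)
```

With few distinct years relative to the number of rows, this stays close to the O(n) cost of the plain defaultdict version.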

If you can get the data from the DB in batches and in sorted order, you can still use groupby without pulling everything into memory first:

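from itertools import groupby

# Stand-in for a DB cursor that returns rows already sorted by year.
cursor = iter([2013, 2013, 2014, 2014, 2014, 2015, 2015])

def get_batch():
    # Fetch up to 3 rows; an empty list means the cursor is exhausted.
    batch = []
    try:
        for _ in range(3):
            batch.append({'col1': next(cursor)})
    except StopIteration:
        pass

    print('Got batch')
    return batch

def fetch():
    # Flatten the batches into one lazy stream of rows.
    while True:
        batch = get_batch()
        if not batch:
            break

        yield from batch

# groupby pulls rows on demand, so batches are fetched only as needed.
for k, g in groupby(fetch(), lambda x: x['col1']):
    print('Group: {}'.format(tuple(g)))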

Output:

Got batch
Group: ({'col1': 2013}, {'col1': 2013})
Got batch
Group: ({'col1': 2014}, {'col1': 2014}, {'col1': 2014})
Got batch
Got batch
Group: ({'col1': 2015}, {'col1': 2015})
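The same idea can be sketched against a real DB-API cursor. The example below uses the standard-library sqlite3 module with an in-memory database; the table name `t`, the helper `iter_rows` and the batch size are assumptions made for illustration, not something from the original question:

```python
import sqlite3
from itertools import groupby

def iter_rows(cursor, batch_size=3):
    """Yield rows as dicts, pulling batch_size rows at a time from the cursor."""
    columns = [desc[0] for desc in cursor.description]
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield dict(zip(columns, row))

# Hypothetical table holding the sample data from the question.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE t (col1 INTEGER)')
conn.executemany('INSERT INTO t (col1) VALUES (?)',
                 [(2014,), (2013,), (2014,), (2013,), (2015,)])

# ORDER BY makes equal years adjacent, which is what groupby requires.
cursor = conn.execute('SELECT col1 FROM t ORDER BY col1')
groups = [tuple(g) for _, g in groupby(iter_rows(cursor), key=lambda r: r['col1'])]
conn.close()

print(groups)
```

Each group has to be consumed (here with tuple(g)) before moving on to the next one, because groupby shares a single underlying iterator across all groups.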
