
How to optimize code for iterating over dict and storing values in lists?

I have this dictionary

example = {

'view_id_ga_standard_111': {'view_id': '111',
  'request_type': 'ga_standard',
  'start_date': '2019-07-01',
  'end_date': '2019-09-01',
  'status': 'New'},

'view_id_ga_standard_333': {'view_id': '333',
  'request_type': 'ga_standard',
  'start_date': '2019-07-01',
  'end_date': '2019-09-01',
  'status': 'New'},

'view_id_ga_corporate_222': {'view_id': '222',
  'request_type': 'ga_corporate',
  'start_date': '2018-07-01',
  'end_date': '2018-09-01',
  'status': 'New'}
}


I need to build a pandas DataFrame from it that looks like this:

    id  request_type start_date  end_date   request_id  status

2   111 ga_standard  2019-07-01 2019-09-01  1           New
3   333 ga_standard  2019-07-01 2019-09-01  2           New
5   222 ga_corporate 2018-07-01 2018-09-01  3           New

I have ended up with this function

import pandas as pd

def ga_make_request_types(params):
    vids = []
    rtypes = []
    sdates = []
    edates = []
    js = []
    statuses = []
    j = 0

    # Iterate over the dict passed in as `params`, not a global variable
    for k, v in params.items():
        j = j + 1
        view_id = v['view_id']
        vids.append(view_id)
        request_type = v['request_type']
        rtypes.append(request_type)
        start_date = v['start_date']
        sdates.append(start_date)
        end_date = v['end_date']
        edates.append(end_date)
        status = v['status']
        statuses.append(status)
        js.append(j)

    df = pd.DataFrame(zip(vids, rtypes, sdates, edates, js, statuses), columns=['id', 'request_type', 'start_date', 'end_date', 'request_id', 'status'])

    return df

But it is quite ugly. Is it possible to shorten the code with list comprehensions?

I have tried like this

for k,v in data_for_config.items():
   vids = [v['view_id'] for v in k]

and like this

for k,v in data_for_config.items():
   vids = [v['view_id'] for v in data_for_config[k]]

But it throws an error

TypeError: string indices must be integers
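
The error most likely comes from what the inner comprehension iterates over. In the first attempt `k` is a string key, so `for v in k` yields single characters. In the second attempt `data_for_config[k]` is the inner dict, and iterating a dict yields its keys, which are also strings. In both cases `v['view_id']` ends up indexing a string. A minimal sketch of a comprehension that works, assuming `data_for_config` has the same shape as `example` (the two-entry dict below is a trimmed stand-in, not the real data):

```python
# Trimmed stand-in for the real dict; same nested shape as `example`.
data_for_config = {
    'view_id_ga_standard_111': {'view_id': '111', 'request_type': 'ga_standard'},
    'view_id_ga_corporate_222': {'view_id': '222', 'request_type': 'ga_corporate'},
}

# Iterate over the inner dicts directly; no outer for loop is needed.
vids = [v['view_id'] for v in data_for_config.values()]
rtypes = [v['request_type'] for v in data_for_config.values()]

print(vids)
print(rtypes)
```

Each comprehension replaces one of the `append` calls in the original loop; the outer `for k, v in ...` loop is no longer needed.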

How about:

df = pd.DataFrame(example).T

Output:

                            end_date  request_type  start_date status view_id
view_id_ga_corporate_222  2018-09-01  ga_corporate  2018-07-01    New     222
view_id_ga_standard_111   2019-09-01   ga_standard  2019-07-01    New     111
view_id_ga_standard_333   2019-09-01   ga_standard  2019-07-01    New     333
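
As a possible alternative to transposing, `pd.DataFrame.from_dict` with `orient='index'` treats the outer keys as row labels directly. This is only a sketch on a trimmed two-entry copy of `example`, and column order may differ between pandas versions:

```python
import pandas as pd

# Trimmed copy of `example` to keep the sketch short.
example = {
    'view_id_ga_standard_111': {'view_id': '111', 'request_type': 'ga_standard', 'status': 'New'},
    'view_id_ga_corporate_222': {'view_id': '222', 'request_type': 'ga_corporate', 'status': 'New'},
}

# Outer keys become the index, inner keys become the columns.
df = pd.DataFrame.from_dict(example, orient='index')
print(df)
```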

Edit:

I'm not sure how request_id is determined, but you could do something like:

df['request_id'] = range(1, df.shape[0] + 1)

Load the dict into a DataFrame, transpose it, drop the index and rename the column:

pd.DataFrame(example).T.reset_index(drop=True).rename(columns={'view_id':'id'})

    id  request_type  start_date    end_date status
0  111   ga_standard  2019-07-01  2019-09-01    New
1  333   ga_standard  2019-07-01  2019-09-01    New
2  222  ga_corporate  2018-07-01  2018-09-01    New
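
Putting the suggestions together, here is one possible sketch that reproduces the column layout asked for in the question, including `request_id`. The function name `make_request_frame` is hypothetical, and numbering `request_id` from 1 in dict order is an assumption, since the question does not say how it is determined:

```python
import pandas as pd

def make_request_frame(params):
    """Build the requested DataFrame from a dict of dicts (hypothetical helper)."""
    # A list of dicts gives one row per inner dict, in insertion order.
    df = pd.DataFrame(list(params.values()))
    df = df.rename(columns={'view_id': 'id'})
    # Assumption: request_id is a simple 1-based counter.
    df['request_id'] = range(1, len(df) + 1)
    # Fix the column order explicitly rather than relying on key order.
    return df[['id', 'request_type', 'start_date', 'end_date', 'request_id', 'status']]

example = {
    'view_id_ga_standard_111': {'view_id': '111', 'request_type': 'ga_standard',
                                'start_date': '2019-07-01', 'end_date': '2019-09-01',
                                'status': 'New'},
    'view_id_ga_standard_333': {'view_id': '333', 'request_type': 'ga_standard',
                                'start_date': '2019-07-01', 'end_date': '2019-09-01',
                                'status': 'New'},
    'view_id_ga_corporate_222': {'view_id': '222', 'request_type': 'ga_corporate',
                                 'start_date': '2018-07-01', 'end_date': '2018-09-01',
                                 'status': 'New'},
}

df = make_request_frame(example)
print(df)
```

Building from `params.values()` avoids the transpose and the `reset_index` step, because the outer keys never become the index in the first place.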
