Create a dataframe from a dictionary with two pairs of keys and values

My dictionary looks like this:

dic = {
'symbol': 'IFF',
'annualReports': [
{'Date': '2019', 'Currency': 'USD', 'Revenue': '514'},
{'Date': '2018', 'Currency': 'USD', 'Revenue': '256'},
{'Date': '2017', 'Currency': 'USD', 'Revenue': '256'}
]}

I would like to convert it into a dataframe that has symbol and the first row of annualReports. The result should look like this:

    symbol    Date     Currency     Revenue
0   IFF       2019     USD          514

I know how to create a dataframe from a single dictionary, with code like this:

import pandas as pd

data = dic['annualReports'][0]
# Wrap the dict in a list: a dict of scalars alone raises a ValueError
df = pd.DataFrame([data])

And the output is like this:

    Date     Currency     Revenue
0   2019     USD          514

So how can I add symbol to the dataframe?

How about:

pd.DataFrame(dic['annualReports']).assign(symbol=dic['symbol'])

Output:

   Date Currency Revenue symbol
0  2019      USD     514    IFF
1  2018      USD     256    IFF
2  2017      USD     256    IFF
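The question asks for only the first report, with symbol as the leading column. One way to get exactly that shape is to merge the symbol with the first report before building the frame. This is a minimal sketch, assuming the dictionary is named dic as in the question; the names first_row and df_first are made up for illustration.

```python
import pandas as pd

dic = {
    'symbol': 'IFF',
    'annualReports': [
        {'Date': '2019', 'Currency': 'USD', 'Revenue': '514'},
        {'Date': '2018', 'Currency': 'USD', 'Revenue': '256'},
        {'Date': '2017', 'Currency': 'USD', 'Revenue': '256'},
    ],
}

# Put symbol first, then unpack the keys of the first annual report.
first_row = {'symbol': dic['symbol'], **dic['annualReports'][0]}

# Wrapping the dict in a list makes pandas treat it as a single row.
df_first = pd.DataFrame([first_row])
print(df_first)
```

Because the frame is built from a list holding one dict, the column order follows the key order of that dict, so symbol ends up first without a separate reordering step.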

You can try json_normalize:

pd.json_normalize(dic, 'annualReports', ['symbol'])

   Date Currency Revenue symbol
0  2019      USD     514    IFF
1  2018      USD     256    IFF
2  2017      USD     256    IFF
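The two positional arguments are easier to read when spelled out by keyword: record_path names the key holding the list of rows, and meta lists the top-level keys to repeat on every row. The sketch below assumes pandas 1.0 or later (where json_normalize is available as pd.json_normalize) and the dic from the question; the column reordering at the end is an optional extra, not part of the original answer.

```python
import pandas as pd

dic = {
    'symbol': 'IFF',
    'annualReports': [
        {'Date': '2019', 'Currency': 'USD', 'Revenue': '514'},
        {'Date': '2018', 'Currency': 'USD', 'Revenue': '256'},
        {'Date': '2017', 'Currency': 'USD', 'Revenue': '256'},
    ],
}

# record_path: the key whose list becomes the rows.
# meta: top-level keys copied onto each of those rows.
df = pd.json_normalize(dic, record_path='annualReports', meta=['symbol'])

# Optional: move the metadata column to the front.
df = df[['symbol', 'Date', 'Currency', 'Revenue']]
print(df)
```

Adding .head(1) to the result would keep only the 2019 row, which matches the layout requested in the question.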

I did it explicitly with a loop:

import pandas as pd

dic = {'symbol': 'IFF',
       'annualReports': 
           [{'Date': '2019', 'Currency': 'USD', 'Revenue': '514'},
            {'Date': '2018', 'Currency': 'USD', 'Revenue': '256'},
            {'Date': '2017', 'Currency': 'USD', 'Revenue': '256'}
]}

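# Copy each report dict; list.copy() alone is shallow and the loop would modify dic
ar_list = [ar_dic.copy() for ar_dic in dic['annualReports']]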
for ar_dic in ar_list:
    ar_dic['symbol'] = dic['symbol']

df = pd.DataFrame(ar_list)
print(df)

This gives:

   Date Currency Revenue symbol
0  2019      USD     514    IFF
1  2018      USD     256    IFF
2  2017      USD     256    IFF

For this I would use a list comprehension and DataFrame.from_records(). I'm also assuming you have a list of dicts like the one you describe, dict_list = [d1, d2, ...], each with a variable number of annualReports but only one symbol per dict. The following should work:

record_list = [
  {
    'symbol': symbol_dict['symbol'],
    'Date': report['Date'],
    'Currency': report['Currency'],
    'Revenue': report['Revenue']
  }
  for symbol_dict in dict_list
  for report in symbol_dict['annualReports']
]
df = pd.DataFrame.from_records(record_list)

Also note you may want to do some type conversion / casting for the 'Revenue' and 'Date' fields. That's easily inserted into the list comprehension step.
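As a rough illustration of that casting step, the version below converts the fields inside the comprehension. It assumes Date holds a plain year that can be parsed as an integer and Revenue is a numeric string. Neither is confirmed by the question, so adjust the conversions to the real data. dict_list here contains only the single dictionary from the question.

```python
import pandas as pd

# Assumption: dict_list is a list of dicts shaped like the one in the question.
dict_list = [
    {
        'symbol': 'IFF',
        'annualReports': [
            {'Date': '2019', 'Currency': 'USD', 'Revenue': '514'},
            {'Date': '2018', 'Currency': 'USD', 'Revenue': '256'},
            {'Date': '2017', 'Currency': 'USD', 'Revenue': '256'},
        ],
    },
]

record_list = [
    {
        'symbol': symbol_dict['symbol'],
        'Date': int(report['Date']),           # assumed to be a year
        'Currency': report['Currency'],
        'Revenue': float(report['Revenue']),   # assumed to be numeric
    }
    for symbol_dict in dict_list
    for report in symbol_dict['annualReports']
]
df = pd.DataFrame.from_records(record_list)
print(df.dtypes)
```

Casting while building the records means the dataframe gets numeric columns from the start. An alternative is to build the frame from strings and convert afterwards with pd.to_numeric, which can be more forgiving when some values are malformed.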
