How to turn a nested dictionary into a pandas DataFrame?

I have this nested dictionary:

{'attrs': ('LA', 'E', 'Can', 'AP', 'ME', 'A', 'M', 'Car', 'US'),
 'self': {'ac': {'AP', 'Can', 'Car', 'E', 'LA', 'M', 'ME', 'US'},
  'anz': {'AP', 'E', 'US'},
  'ana': {'AP', 'E', 'US'},
  'aa': {'AP'},
  'taag': {'A', 'AP', 'Can', 'E', 'ME', 'US'},
  'bm': {'E'},
  'l': {'A', 'AP', 'Can', 'E', 'LA', 'M', 'ME', 'US'},
  'm': {'Can', 'Car', 'LA', 'M', 'US'},
  'sca': {'A', 'AP', 'E', 'LA', 'US'},
  'sia': {'A', 'AP', 'Can', 'E', 'ME', 'US'},
  'tai': {'AP', 'Car', 'E', 'LA', 'US'},
  'ua': {'AP', 'Can', 'Car', 'E', 'LA', 'M', 'US'},
  'v': {'A', 'AP', 'E', 'LA', 'M', 'US'}}}

This is how I build it:

build_context = lambda objects, attributes, table : {'attrs' : tuple(attributes), 'self' : {object : {attributes[i] for i in range(len(row)) if row[i]} for (object, row) in zip(objects, table)}}


context = build_context(
    objects=('ac', 'anz', 'ana', 'aa', 'taag', 'bm', 'l', 'm', 'sca', 'sia',
             'tai', 'ua', 'v'),
    attributes=('LA', 'E', 'Can', 'AP', 'ME', 'A', 'M', 'Car', 'US'),
    table=((True, True, True, True, True, False, True, True, True),
           (False, True, False, True, False, False, False, False, True),
           (False, True, False, True, False, False, False, False, True),
           (False, False, False, True, False, False, False, False, False),
           (False, True, True, True, True, True, False, False, True),
           (False, True, False, False, False, False, False, False, False),
           (True, True, True, True, True, True, True, False, True),
           (True, False, True, False, False, False, True, True, True),
           (True, True, False, True, False, True, False, False, True),
           (False, True, True, True, True, True, False, False, True),
           (True, True, False, True, False, False, False, True, True),
           (True, True, True, True, False, False, True, True, True),
           (True, True, False, True, False, True, True, False, True)))
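
Since the boolean `table` passed to `build_context` already has one row per object and one column per attribute, one option is to build the DataFrame from it directly instead of going through the nested dictionary. The following is only a sketch under that assumption: it uses a shortened three-row sample of the inputs above, and the column name `company` is an assumption, not something the question specifies.

```python
import pandas as pd

# shortened sample of the inputs from the question (first three objects only)
objects = ('ac', 'anz', 'ana')
attributes = ('LA', 'E', 'Can', 'AP', 'ME', 'A', 'M', 'Car', 'US')
table = ((True, True, True, True, True, False, True, True, True),
         (False, True, False, True, False, False, False, False, True),
         (False, True, False, True, False, False, False, False, True))

# rows are objects, columns are attributes; cast True/False to 1/0
df = pd.DataFrame(list(table), index=list(objects), columns=list(attributes)).astype(int)

# move the index into a regular column; the name 'company' is an assumption
df = df.rename_axis('company').reset_index()
print(df)
```

This only applies when the original tuples are still around; if only the nested dictionary is available, the set membership has to be turned back into 0/1 values first.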

How can I turn it into a pandas DataFrame? It should look like this, except that I used abbreviations in my code:

[image of the expected table, not reproduced here]

You can use explode followed by crosstab:

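import pandas as pd
# d is the nested dictionary from the question (named context there)
# one row per (company, attribute) pair; sets become lists so explode can expand them
s = pd.Series(d['self']).apply(list).explode()
# count the pairs and restore the attribute order given in d['attrs']
out = pd.crosstab(s.index, s).reindex(columns=d['attrs'], fill_value=0)
out = out.rename_axis(None).rename_axis(None, axis=1).reset_index().rename(columns={'index': 'company'})
print(out)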
   company  LA  E  Can  AP  ME  A  M  Car  US
0       aa   0  0    0   1   0  0  0    0   0
1       ac   1  1    1   1   1  0  1    1   1
2      ana   0  1    0   1   0  0  0    0   1
3      anz   0  1    0   1   0  0  0    0   1
4       bm   0  1    0   0   0  0  0    0   0
5        l   1  1    1   1   1  1  1    0   1
6        m   1  0    1   0   0  0  1    1   1
7      sca   1  1    0   1   0  1  0    0   1
8      sia   0  1    1   1   1  1  0    0   1
9     taag   0  1    1   1   1  1  0    0   1
10     tai   1  1    0   1   0  0  0    1   1
11      ua   1  1    1   1   0  0  1    1   1
12       v   1  1    0   1   0  1  1    0   1
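
To see what the two steps do, here is a small sketch on a toy dictionary with the same shape as the one in the question (the three companies and three attributes are picked for illustration only). It uses `sorted` instead of `list` so the exploded order is deterministic, and passes the values to `crosstab` as a plain array; both are choices made for this sketch rather than part of the answer above.

```python
import pandas as pd

# toy dictionary with the same shape as the one in the question
d = {'attrs': ('LA', 'E', 'AP'),
     'self': {'aa': {'AP'}, 'bm': {'E'}, 'tai': {'AP', 'E', 'LA'}}}

# sorted() turns each set into a list with a deterministic order;
# explode() then emits one row per (company, attribute) pair
s = pd.Series(d['self']).apply(sorted).explode()
print(s)

# crosstab counts how often each (company, attribute) pair occurs
counts = pd.crosstab(s.index, s.to_numpy())

# reindex restores the attribute order from d['attrs'];
# an attribute that no company has would be added as a column of zeros
counts = counts.reindex(columns=list(d['attrs']), fill_value=0)
print(counts)
```

Note that `crosstab` sorts the row labels, which is why the companies in the output above appear in alphabetical order rather than in dictionary order.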

Here is my solution:

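import pandas as pd

# d is the nested dictionary from the question
attributes = ('LA', 'E', 'Can', 'AP', 'ME', 'A', 'M', 'Car', 'US')
data = d['self']

# build one {attribute: 0 or 1} row per company
new_data = []
for company in data:
    row = {}
    for k in attributes:
        row[k] = 1 if k in data[company] else 0
    new_data.append(row)

# a list of dicts becomes one DataFrame row per dict
res = pd.DataFrame(new_data)
res['company'] = list(data.keys())

# move the company column to the front
res = res[['company', 'LA', 'E', 'Can', 'AP', 'ME', 'A', 'M', 'Car', 'US']]
print(res)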

Output:

   company  LA  E  Can  AP  ME  A  M  Car  US
0       ac   1  1    1   1   1  0  1    1   1
1      anz   0  1    0   1   0  0  0    0   1
2      ana   0  1    0   1   0  0  0    0   1
3       aa   0  0    0   1   0  0  0    0   0
4     taag   0  1    1   1   1  1  0    0   1
5       bm   0  1    0   0   0  0  0    0   0
6        l   1  1    1   1   1  1  1    0   1
7        m   1  0    1   0   0  0  1    1   1
8      sca   1  1    0   1   0  1  0    0   1
9      sia   0  1    1   1   1  1  0    0   1
10     tai   1  1    0   1   0  0  0    1   1
11      ua   1  1    1   1   0  0  1    1   1
12       v   1  1    0   1   0  1  1    0   1
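
The same loop can be written more compactly with a nested list comprehension. This is a sketch on a toy dictionary with the same shape as the one in the question (the sample data is made up for illustration); it keeps the companies in dictionary order, as the loop version does.

```python
import pandas as pd

# toy dictionary with the same shape as the one in the question
d = {'attrs': ('LA', 'E', 'AP'),
     'self': {'aa': {'AP'}, 'bm': {'E'}, 'tai': {'AP', 'E', 'LA'}}}

attrs = list(d['attrs'])

# one 0/1 row per company, in the column order given by attrs
rows = [[int(a in members) for a in attrs] for members in d['self'].values()]
res = pd.DataFrame(rows, columns=attrs)

# put the company names in the first column, in dictionary order
res.insert(0, 'company', list(d['self']))
print(res)
```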
