How to make the value component of a dictionary a numpy array

I am a new Python 2.7 user. I recently learned about numpy arrays, and I am just now learning about dictionaries. Please excuse me if my syntax is not correct.

Let's say we have a dictionary:

dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
         'Bob': {'dogs': '5', 'cats': '6'},
         'Chris': {'dogs': '7', 'cats': '8'},
         'Dan': {'dogs': '9', 'cats': '10'}}

The outer keys are the names Ann, Bob, Chris, and Dan. The inner keys are 'dogs' and 'cats', and the values are how many of each that person has.

I want to invert the value component of my dictionary, that is, replace each number with its reciprocal. I know I can convert to a list by using dict1.values(), then convert to an array, and then convert back to a dictionary, but this seems tedious. Is there a way to make my value component a numpy array and leave the key component the way it is?

If you just need the values as arrays, you can use pandas to help convert to a numpy array. Alternatively, you can do the whole job in pandas. Pandas is a data analysis library (think programmatic spreadsheet) that is built on top of numpy.

To convert to a numpy array for further processing:

>>> import pandas as pd
>>> import numpy as np
>>> pd.DataFrame(dict1).T
      cats dogs
Ann      4    3
Bob      6    5
Chris    8    7
Dan     10    9
>>> pd.DataFrame(dict1).T.values  # as_matrix() was removed in later pandas releases
array([['4', '3'],
       ['6', '5'],
       ['8', '7'],
       ['10', '9']], dtype=object)
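
The array above has dtype=object because the dictionary stores the counts as strings. A minimal sketch of getting a float array instead, assuming a pandas version that provides DataFrame.to_numpy(); the names df, arr and inv are only illustrative:

```python
import pandas as pd

dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
         'Bob': {'dogs': '5', 'cats': '6'},
         'Chris': {'dogs': '7', 'cats': '8'},
         'Dan': {'dogs': '9', 'cats': '10'}}

# Rows are people; pick the columns explicitly so their order is predictable,
# then convert the numeric strings to floats.
df = pd.DataFrame(dict1).T[['dogs', 'cats']].astype(float)

arr = df.to_numpy()   # plain 2-D float array, one row per person
inv = 1.0 / arr       # element-wise reciprocal
```

The row labels stay available as df.index, so the names are not lost when the numbers move into the array.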

Updated based on comments, to invert all the values using pandas:

>>> pd.DataFrame(dict1).applymap(lambda x: 1/float(x))
           Ann       Bob     Chris       Dan
cats  0.250000  0.166667  0.125000  0.100000
dogs  0.333333  0.200000  0.142857  0.111111

Or, to get the result as a dictionary:

>>> pd.DataFrame(dict1).applymap(lambda x: 1/float(x)).to_dict()
{'Ann': {'cats': 0.25, 'dogs': 0.33333333333333331},
 'Bob': {'cats': 0.16666666666666666, 'dogs': 0.20000000000000001},
 'Chris': {'cats': 0.125, 'dogs': 0.14285714285714285},
 'Dan': {'cats': 0.10000000000000001, 'dogs': 0.1111111111111111}}

Based on your question and comments I think you just want the same dictionary structure, but with the numbers inverted:

dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
         'Bob': {'dogs': '5', 'cats': '6'},
         'Chris': {'dogs': '7', 'cats': '8'},
         'Dan': {'dogs': '9', 'cats': '10'}}

for k in dict1.keys():
    value = dict1[k]
    for k1 in value.keys():
        value[k1] = 1/float(value[k1])

dict1
Out[64]: 
{'Ann': {'cats': 0.25, 'dogs': 0.3333333333333333},
 'Bob': {'cats': 0.16666666666666666, 'dogs': 0.2},
 'Chris': {'cats': 0.125, 'dogs': 0.14285714285714285},
 'Dan': {'cats': 0.1, 'dogs': 0.1111111111111111}}

I modified the dictionary in place, just replacing the numeric strings with their inverse, e.g. '4' with 0.25.

Iterating over two levels of keys() is, in a sense, tedious, but it's the straightforward thing to do when working with nested dictionaries. I wrote the for loop in one try, with no errors. I am experienced, but I still usually have to try several things before getting something that works. I iterated on keys so I could easily change the values in place. If I wanted to make a copy, I probably could have written it as a nested dict comprehension, but it would be more obscure.

Provided it does the right thing, it's faster than anything involving numpy or pandas . Creating the arrays takes time.
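
One way to check that speed claim on your own machine is the standard timeit module. This is only a sketch: the function names invert_with_loops and invert_with_numpy are made up for the comparison, and the timings depend on your setup, so no numbers are shown here.

```python
import timeit
import numpy as np

dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
         'Bob': {'dogs': '5', 'cats': '6'},
         'Chris': {'dogs': '7', 'cats': '8'},
         'Dan': {'dogs': '9', 'cats': '10'}}

def invert_with_loops(d):
    # Plain Python: build a new nested dict of reciprocals.
    return {name: {pet: 1 / float(count) for pet, count in pets.items()}
            for name, pets in d.items()}

def invert_with_numpy(d):
    # Same result, but routed through a 2-D float array and back.
    names = list(d)
    pets = ['dogs', 'cats']
    arr = np.array([[float(d[n][p]) for p in pets] for n in names])
    inv = 1 / arr
    return {n: dict(zip(pets, row.tolist())) for n, row in zip(names, inv)}

loops_time = timeit.timeit(lambda: invert_with_loops(dict1), number=2000)
numpy_time = timeit.timeit(lambda: invert_with_numpy(dict1), number=2000)
print('loops:', loops_time, 'numpy:', numpy_time)
```

For a dictionary this small, most of the numpy time is likely to go into building the array and unpacking it again, which is the cost the answer refers to.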

================

A numpy approach - much more advanced coding (display from an IPython session):

In [65]: dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
    ...:          'Bob': {'dogs': '5', 'cats': '6'},
    ...:          'Chris': {'dogs': '7', 'cats': '8'},
    ...:          'Dan': {'dogs': '9', 'cats': '10'}}

In [66]: dt = np.dtype([('name','U5'),('dogs',float),('cats',float)])
# define a structured array dtype.

In [67]: def foo(k,v):
    ...:     return (k, v['dogs'], v['cats'])
# define a helper function - just helps organize my thoughts better 

In [68]: alist=[foo(k,v) for k,v in dict1.items()]

In [69]: alist
Out[69]: [('Chris', '7', '8'), ('Bob', '5', '6'), ('Dan', '9', '10'), ('Ann', '3', '4')]
# this is a list of tuples - a critical format for the next step    

In [70]: arr = np.array(alist, dtype=dt)

In [71]: arr
Out[71]: 
array([('Chris', 7.0, 8.0), 
       ('Bob', 5.0, 6.0), 
       ('Dan', 9.0, 10.0),
       ('Ann', 3.0, 4.0)], 
      dtype=[('name', '<U5'), ('dogs', '<f8'), ('cats', '<f8')])

I've converted the dictionary to a structured array, with 3 fields. This is similar to what I'd get from reading a csv file like:

name, dogs, cats
Ann, 3, 4
Bob, 5, 6
....

The dogs and cats fields are numeric, so I can invert their values

In [72]: arr['dogs']=1/arr['dogs']
In [73]: arr['cats']=1/arr['cats']

In [74]: arr
Out[74]: 
array([('Chris', 0.14285714285714285, 0.125),
       ('Bob', 0.2, 0.16666666666666666), 
       ('Dan', 0.1111111111111111, 0.1),
       ('Ann', 0.3333333333333333, 0.25)], 
      dtype=[('name', '<U5'), ('dogs', '<f8'), ('cats', '<f8')])

The result is the same numbers as in the dictionary case, but in a table layout.
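
If the nested dictionary layout is needed again afterwards, the structured array can be unpacked row by row. A minimal sketch, assuming the same dtype as above; here the strings are converted with float() explicitly, and the name result is only illustrative:

```python
import numpy as np

dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
         'Bob': {'dogs': '5', 'cats': '6'},
         'Chris': {'dogs': '7', 'cats': '8'},
         'Dan': {'dogs': '9', 'cats': '10'}}

dt = np.dtype([('name', 'U5'), ('dogs', float), ('cats', float)])

# One tuple per person, with the numeric strings converted up front.
rows = [(name, float(pets['dogs']), float(pets['cats']))
        for name, pets in dict1.items()]
arr = np.array(rows, dtype=dt)

# Invert each numeric field in place.
arr['dogs'] = 1 / arr['dogs']
arr['cats'] = 1 / arr['cats']

# Rebuild the nested dictionary from the rows of the structured array.
result = {str(row['name']): {'dogs': float(row['dogs']),
                             'cats': float(row['cats'])}
          for row in arr}
```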

======================

A dictionary comprehension version - same double dictionary iteration as the first solution, but building a new dictionary rather than making changes in place:

In [78]: {k1:{k2:1/float(v2) for k2,v2 in v1.items()} for k1,v1 in dict1.items()}
Out[78]: 
{'Ann': {'cats': 0.25, 'dogs': 0.3333333333333333},
 'Bob': {'cats': 0.16666666666666666, 'dogs': 0.2},
 'Chris': {'cats': 0.125, 'dogs': 0.14285714285714285},
 'Dan': {'cats': 0.1, 'dogs': 0.1111111111111111}}

===================

When the numeric values are in an array, it is possible to take the numeric inverse of all the values at once. That's the beauty of numpy. But getting there can require some advanced numpy coding.

For example I could take the 2 numeric fields of arr , and view them as a 2d array:

In [80]: arr[['dogs','cats']].view('(2,)float')
Out[80]: 
array([[ 0.14285714,  0.125     ],
       [ 0.2       ,  0.16666667],
       [ 0.11111111,  0.1       ],
       [ 0.33333333,  0.25      ]])

In [81]: 1/arr[['dogs','cats']].view('(2,)float')
Out[81]: 
array([[  7.,   8.],
       [  5.,   6.],
       [  9.,  10.],
       [  3.,   4.]])

This gets back the original numbers (without the name labels).
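
The .view('(2,)float') trick depends on how multi-field indexing lays out memory, and it may raise an error on newer NumPy releases. A hedged alternative, assuming a NumPy version that ships numpy.lib.recfunctions.structured_to_unstructured; the array here is rebuilt with the original counts so the sketch runs on its own:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

dt = np.dtype([('name', 'U5'), ('dogs', float), ('cats', float)])
arr = np.array([('Ann', 3.0, 4.0),
                ('Bob', 5.0, 6.0),
                ('Chris', 7.0, 8.0),
                ('Dan', 9.0, 10.0)], dtype=dt)

# Copy the two numeric fields into an ordinary 2-D float array.
numeric = rfn.structured_to_unstructured(arr[['dogs', 'cats']])

# Invert every value at once.
inverse = 1 / numeric
```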

Inverting values in the dictionary

"I want each of the values for dogs and cats to be the inverse meaning 1/3, 1/5, 1/7, 1/9, etc."

>>> {name:{key:1./float(value) for key,value in d.items()} for name,d in dict1.items()} 
{'Ann': {'cats': 0.25, 'dogs': 0.3333},
 'Bob': {'cats': 0.1667, 'dogs': 0.2},
 'Chris': {'cats': 0.125, 'dogs': 0.1429},
 'Dan': {'cats': 0.1, 'dogs': 0.1111}}

Or, keeping the values as strings:

>>> {name:{key:'1/' + value for key,value in d.items()} for name,d in dict1.items()}
{'Ann': {'cats': '1/4', 'dogs': '1/3'},
 'Bob': {'cats': '1/6', 'dogs': '1/5'},
 'Chris': {'cats': '1/8', 'dogs': '1/7'},
 'Dan': {'cats': '1/10', 'dogs': '1/9'}}
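
A middle ground between floats and strings is the standard library's fractions.Fraction, which keeps each inverse exact and still prints as '1/3'. This is a sketch and assumes the counts are always whole numbers; the name exact is only illustrative:

```python
from fractions import Fraction

dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
         'Bob': {'dogs': '5', 'cats': '6'},
         'Chris': {'dogs': '7', 'cats': '8'},
         'Dan': {'dogs': '9', 'cats': '10'}}

# Fraction(1, n) is the exact reciprocal of the integer n.
exact = {name: {key: Fraction(1, int(value)) for key, value in d.items()}
         for name, d in dict1.items()}

print(exact['Ann']['dogs'])         # prints 1/3
print(float(exact['Ann']['cats']))  # prints 0.25
```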

Converting dict1 to a numpy array

Let's import numpy and define your dictionary:

>>> import numpy as np
>>> dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
...   'Bob': {'dogs': '5', 'cats': '6'},
...   'Chris': {'dogs': '7', 'cats': '8'},
...   'Dan': {'dogs': '9', 'cats': '10'},}

Now, let's convert your dictionary to a numpy array:

>>> np.array([[name]+[dict1[name][k] for k in ('dogs', 'cats')] for name in dict1])
array([['Chris', '7', '8'],
       ['Ann', '3', '4'],
       ['Dan', '9', '10'],
       ['Bob', '5', '6']], 
      dtype='|S5')

Here, the first column is the name, the second is the number of dogs and the third is the number of cats.
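
Because the names share the array with the counts, every element ends up as a string. A minimal sketch of splitting off the numeric columns so they can be inverted; the names table, counts and inverse are illustrative, and the rows are sorted by name only to make the order predictable:

```python
import numpy as np

dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
         'Bob': {'dogs': '5', 'cats': '6'},
         'Chris': {'dogs': '7', 'cats': '8'},
         'Dan': {'dogs': '9', 'cats': '10'}}

# Same layout as above: name, dogs, cats (all stored as strings).
table = np.array([[name] + [dict1[name][k] for k in ('dogs', 'cats')]
                  for name in sorted(dict1)])

names = table[:, 0]                   # first column: the labels
counts = table[:, 1:].astype(float)   # remaining columns as floats
inverse = 1 / counts                  # element-wise reciprocal
```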

I know you said in the comments you weren't ready to start learning pandas, but it would be quite a nice way to work with this data rather than a dictionary of dictionaries.

Pandas has some nice built-in functionality for constructing data frames from dictionaries. Once in a Pandas DataFrame, it's quite easy to convert the string values to integers and then do the arithmetic.

In [1]: import pandas as pd

In [2]: dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
   ...:          'Bob': {'dogs': '5', 'cats': '6'},
   ...:          'Chris': {'dogs': '7', 'cats': '8'},
   ...:          'Dan': {'dogs': '9', 'cats': '10'}}

In [3]: df = pd.DataFrame(dict1)

In [4]: df
Out[4]: 
     Ann Bob Chris Dan
cats   4   6     8  10
dogs   3   5     7   9

In [5]: df.values
Out[5]: 
array([['4', '6', '8', '10'],
       ['3', '5', '7', '9']], dtype=object)

In [6]: df.applymap(int)
Out[6]: 
      Ann  Bob  Chris  Dan
cats    4    6      8   10
dogs    3    5      7    9

In [7]: df = 1.0/df.applymap(int)

In [8]: df
Out[8]: 
           Ann       Bob     Chris       Dan
cats  0.250000  0.166667  0.125000  0.100000
dogs  0.333333  0.200000  0.142857  0.111111

In [10]: df.to_dict()
Out[10]: 
{'Ann': {'cats': 0.25, 'dogs': 0.33333333333333331},
 'Bob': {'cats': 0.16666666666666666, 'dogs': 0.20000000000000001},
 'Chris': {'cats': 0.125, 'dogs': 0.14285714285714285},
 'Dan': {'cats': 0.10000000000000001, 'dogs': 0.1111111111111111}}
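
Newer pandas releases deprecate DataFrame.applymap, so the session above may emit a warning there. A hedged variant that converts the whole frame with astype instead; the names df, inverted and result are only illustrative:

```python
import pandas as pd

dict1 = {'Ann': {'dogs': '3', 'cats': '4'},
         'Bob': {'dogs': '5', 'cats': '6'},
         'Chris': {'dogs': '7', 'cats': '8'},
         'Dan': {'dogs': '9', 'cats': '10'}}

# Convert every numeric string in the frame to a float in one call.
df = pd.DataFrame(dict1).astype(float)

# Element-wise reciprocal, then back to a nested dictionary.
inverted = 1.0 / df
result = inverted.to_dict()
```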


 