
Make a DataFrame from a dict

I have data like this:

>>> cc
defaultdict(<class 'dict'>, {1540272960: {'max': 1.14614, 'to': 1540273020, 'close': 1.14606, 'from': 1540272960, 'open': 1.145935, 'volume': 96, 'id': 366597, 'min': 1.14593, 'at': 1540273020040554921}, 1540273020: {'active_id': 1, 'to': 1540273080, 'ask': 1.14622, 'open': 1.14606, 'max_at': 1540273034, 'size': 60, 'max': 1.146135, 'at': 1540273040013821491, 'min_at': 1540273020, 'close': 1.146095, 'from': 1540273020, 'volume': 42, 'bid': 1.14597, 'id': 366598, 'min': 1.14606}})

I tried to convert it into a row-and-column format using pandas:

>>> df = pd.DataFrame(cc)
>>> df
             1540273080    1540273140
active_id  1.000000e+00  1.000000e+00
ask        1.146160e+00  1.146160e+00
at         1.540273e+18  1.540273e+18
bid        1.145910e+00  1.145910e+00
close      1.146035e+00  1.146035e+00
from       1.540273e+09  1.540273e+09
id         3.665990e+05  3.666000e+05
max        1.146100e+00  1.146055e+00
max_at     1.540273e+09  1.540273e+09
min        1.146030e+00  1.146035e+00
min_at     1.540273e+09  1.540273e+09
open       1.146080e+00  1.146040e+00
size       6.000000e+01  6.000000e+01
to         1.540273e+09  1.540273e+09
volume     9.500000e+01  9.000000e+00

The field names end up in the index:

>>> df.index
Index(['active_id', 'ask', 'at', 'bid', 'close', 'from', 'id', 'max', 'max_at',
       'min', 'min_at', 'open', 'size', 'to', 'volume'],
      dtype='object')

and selecting a field as a column fails:

>>> df["volume"]
Traceback (most recent call last):
  File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
TypeError: an integer is required

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python35\lib\site-packages\pandas\core\indexes\base.py", line 3078, in get_loc
    return self._engine.get_loc(key)
  File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 164, in pandas._libs.index.IndexEngine.get_loc
KeyError: 'volume'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
TypeError: an integer is required

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python35\lib\site-packages\pandas\core\frame.py", line 2688, in __getitem__
    return self._getitem_column(key)
  File "C:\Python35\lib\site-packages\pandas\core\frame.py", line 2695, in _getitem_column
    return self._get_item_cache(key)
  File "C:\Python35\lib\site-packages\pandas\core\generic.py", line 2489, in _get_item_cache
    values = self._data.get(item)
  File "C:\Python35\lib\site-packages\pandas\core\internals.py", line 4115, in get
    loc = self.items.get_loc(item)
  File "C:\Python35\lib\site-packages\pandas\core\indexes\base.py", line 3080, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 164, in pandas._libs.index.IndexEngine.get_loc
KeyError: 'volume'

But the values come out as a vertical DataFrame. I want the outer keys to be the index and the values to be placed in their respective columns. How can I do that?

Use DataFrame.from_dict with orient='index':

df = pd.DataFrame.from_dict(cc, orient='index')
print(df)
                 max          to     close        from      open  volume  \
1540272960  1.146140  1540273020  1.146060  1540272960  1.145935      96   
1540273020  1.146135  1540273080  1.146095  1540273020  1.146060      42   

                id      min                   at  active_id      ask  \
1540272960  366597  1.14593  1540273020040554921        NaN      NaN   
1540273020  366598  1.14606  1540273040013821491        1.0  1.14622   

                  max_at  size        min_at      bid  
1540272960           NaN   NaN           NaN      NaN  
1540273020  1.540273e+09  60.0  1.540273e+09  1.14597  
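The following is a minimal, self-contained sketch of the same call, assuming a trimmed copy of the question's data (only a few fields are kept for brevity; `df_wide` and `volume_row` are names introduced here for illustration). It also shows why `df["volume"]` raised a KeyError on the original frame: with the default constructor the field names are row labels, not columns.

```python
from collections import defaultdict

import pandas as pd

# Trimmed copy of the question's data: outer keys are timestamps,
# inner dicts hold the fields (the second record has one extra field).
cc = defaultdict(dict, {
    1540272960: {"open": 1.145935, "close": 1.14606, "volume": 96},
    1540273020: {"open": 1.14606, "close": 1.146095, "volume": 42, "size": 60},
})

# Default constructor: outer keys become columns and field names become
# the index, so "volume" is a row label and must be selected with .loc.
df_wide = pd.DataFrame(cc)
volume_row = df_wide.loc["volume"]

# orient="index": outer keys become the index, field names become columns,
# so column selection by field name works.
df = pd.DataFrame.from_dict(cc, orient="index")
print(df["volume"])
```

Fields missing from a record (here `size` for the first timestamp) are filled with NaN, which matches the NaN cells in the output above.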

Another idea, from @Anton vBR, is to transpose with T:

df = pd.DataFrame(cc).T

Or, similar to @jezrael's second one, but using transpose:

df = pd.DataFrame(cc).transpose()

Then:

print(df)

Output:

                 max          to     close        from      open  volume  \
1540272960  1.146140  1540273020  1.146060  1540272960  1.145935      96   
1540273020  1.146135  1540273080  1.146095  1540273020  1.146060      42   

                id      min                   at  active_id      ask  \
1540272960  366597  1.14593  1540273020040554921        NaN      NaN   
1540273020  366598  1.14606  1540273040013821491        1.0  1.14622   

                  max_at  size        min_at      bid  
1540272960           NaN   NaN           NaN      NaN  
1540273020  1.540273e+09  60.0  1.540273e+09  1.14597  

This is the expected layout.
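One difference between the two approaches is worth checking on your own data. The sketch below assumes a trimmed copy of the question's data, and the names `by_transpose` and `by_from_dict` are introduced here for illustration. `pd.DataFrame(cc)` builds one column per record, so integers and floats are mixed within each column and are typically upcast to float64 before the transpose. `from_dict(..., orient='index')` builds one column per field, so a field that holds only integers can stay an integer column. Large integers, such as the nanosecond `at` values, may lose precision on the float path.

```python
import pandas as pd

# Trimmed copy of the question's data (three fields per record).
cc = {
    1540272960: {"close": 1.14606, "volume": 96, "at": 1540273020040554921},
    1540273020: {"close": 1.146095, "volume": 42, "at": 1540273040013821491},
}

# Same shape and labels either way; only the column dtypes may differ.
by_transpose = pd.DataFrame(cc).T
by_from_dict = pd.DataFrame.from_dict(cc, orient="index")

# Compare the dtypes side by side rather than relying on the printed frame.
print(pd.DataFrame({
    "transpose": by_transpose.dtypes,
    "from_dict": by_from_dict.dtypes,
}))

# With from_dict, the integer-only fields keep their exact values.
exact_at = by_from_dict["at"].tolist()
print(exact_at)
```

If exact integer values matter (ids, nanosecond timestamps), the `from_dict` route is the safer default; otherwise the two results can be used interchangeably.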
