I'm using numpy/pandas on a 64-bit Fedora box. In production the code was pushed to a 32-bit CentOS box and hit an error with json.dumps: it was throwing TypeError: key 0 is not a string.
I tried testing on 64-bit CentOS and it runs absolutely fine, but on 32-bit (CentOS 6.8, to be precise) it throws the error. I was wondering if anyone has hit this issue before.
Below is the session on 64-bit Fedora:
Python 2.6.6 (r266:84292, Jun 30 2016, 09:54:10)
[GCC 5.3.1 20160406 (Red Hat 5.3.1-6)] on linux4
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> a = pd.DataFrame([{'a': 1}])
>>>
>>> a
   a
0  1
>>> a.to_dict()
{'a': {0: 1}}
>>> import json
>>> json.dumps(a.to_dict())
'{"a": {"0": 1}}'
Below is the same code on 32-bit CentOS (run as sample.py):
import json
import pandas as pd
a = pd.DataFrame( [ {'a': 1} ] )
json.dumps(a.to_dict())
Traceback (most recent call last):
File "sample.py", line 5, in <module>
json.dumps(a.to_dict())
File "/usr/lib/python2.6/json/__init__.py", line 230, in dumps
return _default_encoder.encode(obj)
File "/usr/lib/python2.6/json/encoder.py", line 367, in encode
chunks = list(self.iterencode(o))
File "/usr/lib/python2.6/json/encoder.py", line 309, in _iterencode
for chunk in self._iterencode_dict(o, markers):
File "/usr/lib/python2.6/json/encoder.py", line 275, in _iterencode_dict
for chunk in self._iterencode(value, markers):
File "/usr/lib/python2.6/json/encoder.py", line 309, in _iterencode
for chunk in self._iterencode_dict(o, markers):
File "/usr/lib/python2.6/json/encoder.py", line 268, in _iterencode_dict
raise TypeError("key {0!r} is not a string".format(key))
TypeError: key 0 is not a string
What is the usual workaround for this issue? I cannot use a custom JSON encoder, because the library I'm using to push this data expects a dictionary and internally uses the json module to serialize it and push it over the wire.
Update: the Python version is 2.6.6 on both boxes, and pandas is 0.16.1 on both.
I believe this happens because the index holds numpy.intNN values whose width differs from the platform's native Python int, and such values are not automatically converted from one type to the other.
For example, on my 64-bit Python 2.7 and NumPy:
>>> isinstance(numpy.int64(5), int)
True
>>> isinstance(numpy.int32(5), int)
False
Then:
>>> json.dumps({numpy.int64(5): '5'})
'{"5": "5"}'
>>> json.dumps({numpy.int32(5): '5'})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/json/__init__.py", line 243, in dumps
return _default_encoder.encode(obj)
File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
return _iterencode(o, 0)
TypeError: keys must be a string
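The reason is that json only stringifies mapping keys of the types it recognizes (strings, ints, floats, bools, None), and it checks them with isinstance; a numpy scalar that is not an int subclass is therefore rejected. As a quick sanity check (a sketch, not from the original post), coercing the offending key with plain int() makes the same dict serializable:

```python
import json

import numpy as np

# np.int32 is not an int subclass on a 64-bit build, so json rejects it
# as a key; wrapping it in int() restores the expected behaviour.
print(json.dumps({int(np.int32(5)): '5'}))  # → '{"5": "5"}'
```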
On my 64-bit machine I can reproduce the problem by casting the index to the non-native width, numpy.int32 in my case:
>>> df = pd.DataFrame( [ {'a': 1}, {'a': 2} ] )
>>> df.index = df.index.astype(numpy.int32)  # perhaps your index was of this type?
>>> json.dumps(df.to_dict())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/json/__init__.py", line 243, in dumps
return _default_encoder.encode(obj)
File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
return _iterencode(o, 0)
TypeError: keys must be a string
So the workaround is to cast the index to the native-width numpy integer (numpy.int64 on my box) or to plain Python int:
>>> df.index = df.index.astype(numpy.int64)
>>> json.dumps(df.to_dict())
'{"a": {"0": 1, "1": 2}}'
>>> df.index = df.index.astype(int)
>>> json.dumps(df.to_dict())
'{"a": {"0": 1, "1": 2}}'