I understand that NaN is not allowed in JSON files. I usually use
import pandas as pd
pd.read_json('file.json')
to read JSON into Python. Looking through the documentation, I do not see an option to handle that value.
I have a JSON file, data.json, that looks like
[{"city": "Los Angeles","job":"chef","age":30},
{"city": "New York","job":"driver","age":35},
{"city": "San Jose","job":"pilot","age":NaN}]
How can I read this into Python/pandas and handle the NaN values?
EDIT:
Amazing answer below!! Thanks fixxxer!! Just so it's documented, here is how to read it in from a separate file:
import pandas as pd
import json
# the with block closes the file; json.load parses it directly
with open('data.json') as f:
    y = json.load(f)
data=pd.DataFrame(y)
data.head()
Read the contents of the JSON file into a string variable:
x = '''[{"city": "Los Angeles","job":"chef","age":30}, {"city": "New York","job":"driver","age":35}, {"city": "San Jose","job":"pilot","age":NaN}]'''
Now load it with json.loads, which accepts the NaN token by default:
In [41]: import json
In [42]: y = json.loads(x)
In [43]: y
Out[43]:
[{u'age': 30, u'city': u'Los Angeles', u'job': u'chef'},
{u'age': 35, u'city': u'New York', u'job': u'driver'},
{u'age': nan, u'city': u'San Jose', u'job': u'pilot'}]
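This works because Python's standard json module accepts the tokens NaN, Infinity and -Infinity as an extension to strict JSON. The sketch below illustrates that default, plus the parse_constant hook, which lets you map NaN to something else. The helper name nan_to_none and the shortened one-row sample are assumptions made for illustration only.

```python
import json
import math

# One row of the sample data, including the non-standard NaN token.
raw = '[{"city": "San Jose", "job": "pilot", "age": NaN}]'

# Default behaviour: NaN is parsed as float('nan').
rows = json.loads(raw)
print(math.isnan(rows[0]["age"]))


def nan_to_none(token):
    """Hypothetical helper: map NaN to None, keep the infinities as floats.

    json calls this with one of 'NaN', 'Infinity' or '-Infinity'.
    """
    return None if token == "NaN" else float(token)


# Same input, but NaN now becomes None instead of float('nan').
rows_none = json.loads(raw, parse_constant=nan_to_none)
print(rows_none[0]["age"])
```

Mapping to None is only one option; pandas turns None in a numeric column back into NaN anyway, so the default is usually enough when the goal is a DataFrame.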
Then build the DataFrame (with pandas imported as pd):
In [44]: pd.DataFrame(y)
Out[44]:
   age         city     job
0   30  Los Angeles    chef
1   35     New York  driver
2  NaN     San Jose   pilot
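Once the data is in a DataFrame, the NaN can be handled with the usual pandas tools. The following is a minimal sketch, not the only way to do it: it writes the sample to a temporary file so it runs on its own, and the choice to fill the missing age with the column mean is an assumption made for illustration.

```python
import json
import os
import tempfile

import pandas as pd

sample = '''[{"city": "Los Angeles","job":"chef","age":30},
{"city": "New York","job":"driver","age":35},
{"city": "San Jose","job":"pilot","age":NaN}]'''

# Write the sample to a temporary file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

try:
    # json.load reads straight from the file object and accepts NaN.
    with open(path) as fh:
        records = json.load(fh)
finally:
    os.remove(path)

data = pd.DataFrame(records)

# Three common ways to deal with the missing value:
missing = data["age"].isna()                        # flag rows with NaN
dropped = data.dropna(subset=["age"])               # drop those rows
filled = data.fillna({"age": data["age"].mean()})   # or fill with the mean

print(filled)
```

Whether to drop or fill depends on what the age column is used for downstream; isna() is handy for checking how many rows are affected before deciding.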