This question may be complicated :-) and I have not been able to find an answer after hours of searching.
I have JSON-like data in one of the columns of a dataframe:
   population  postcode                                    salesGrowthList
0        3507      2250  [{'medianSoldPrice': 300000.0, 'annualGrowth':...
1        3507      2250  [{'medianSoldPrice': 353000.0, 'annualGrowth':...
2        3507      2250  [{'medianSoldPrice': 0.0, 'annualGrowth': 0.0,...
3        3507      2250  [{'medianSoldPrice': 0.0, 'annualGrowth': 0.0,...
A sample value from 'salesGrowthList' is shown below. It is stored as a string, but the string has a JSON-like structure:
"[{'medianSoldPrice': 300000.0, 'annualGrowth': 0.0, 'numberSold': 19, 'year': 2014}, {'medianSoldPrice': 347000.0, 'annualGrowth': 0.15666666666666668, 'numberSold': 27, 'year': 2015}, {'medianSoldPrice': 371000.0, 'annualGrowth': 0.069164265129683, 'numberSold': 12, 'year': 2016}, {'medianSoldPrice': 410000.0, 'annualGrowth': 0.10512129380053908, 'numberSold': 15, 'year': 2017}, {'medianSoldPrice': 0.0, 'annualGrowth': 0.0, 'numberSold': 6, 'year': 2018}, {'medianSoldPrice': 411000.0, 'annualGrowth': 0.0, 'numberSold': 10, 'year': 2019}]"
Now I would like to build a new dataframe out of this output. How can this be done?
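You could parse the string with the json module and then pass the result to pandas.DataFrame. JSON does not allow single-quoted strings, so replace the single quotes with double quotes first, like this: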
>>> import json
>>> import pandas as pd
>>> x
"[{'medianSoldPrice': 300000.0, 'annualGrowth': 0.0, 'numberSold': 19, 'year': 2014}, {'medianSoldPrice': 347000.0, 'annualGrowth': 0.15666666666666668, 'numberSold': 27, 'year': 2015}, {'medianSoldPrice': 371000.0, 'annualGrowth': 0.069164265129683, 'numberSold': 12, 'year': 2016}, {'medianSoldPrice': 410000.0, 'annualGrowth': 0.10512129380053908, 'numberSold': 15, 'year': 2017}, {'medianSoldPrice': 0.0, 'annualGrowth': 0.0, 'numberSold': 6, 'year': 2018}, {'medianSoldPrice': 411000.0, 'annualGrowth': 0.0, 'numberSold': 10, 'year': 2019}]"
>>> d = json.loads(x.replace("'", '"'))
>>> df = pd.DataFrame(d)
>>> df
   medianSoldPrice  annualGrowth  numberSold  year
0         300000.0      0.000000          19  2014
1         347000.0      0.156667          27  2015
2         371000.0      0.069164          12  2016
3         410000.0      0.105121          15  2017
4              0.0      0.000000           6  2018
5         411000.0      0.000000          10  2019
>>>
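The quote replacement works for this sample, but the string is really a Python literal rather than strict JSON, so it would break if any value contained an apostrophe. A minimal sketch of an alternative, assuming every string is a valid Python literal, uses ast.literal_eval from the standard library (the sample string here is shortened to two entries):

```python
import ast

import pandas as pd

# Shortened sample in the same shape as one 'salesGrowthList' cell.
x = (
    "[{'medianSoldPrice': 300000.0, 'annualGrowth': 0.0, "
    "'numberSold': 19, 'year': 2014}, "
    "{'medianSoldPrice': 347000.0, 'annualGrowth': 0.15666666666666668, "
    "'numberSold': 27, 'year': 2015}]"
)

# literal_eval safely evaluates the Python literal into a list of dicts,
# with no need to rewrite the quotes.
records = ast.literal_eval(x)
df = pd.DataFrame(records)
print(df)
```

Unlike eval, ast.literal_eval only accepts literals (strings, numbers, lists, dicts and so on), so it does not execute arbitrary code from the column.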
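and then, to keep the parsed data in the original dataframe, convert each row's string into a list of dicts in place (a multi-column dataframe cannot be assigned to a single column), like,
>>> orig_df['salesGrowthList'] = orig_df['salesGrowthList'].apply(lambda s: json.loads(s.replace("'", '"')))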
>>>
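If the goal is one new dataframe covering every row, a possible approach is to parse each string, give each dict its own row with explode, and then spread the dict keys into columns. This is only a sketch: the orig_df below is a hypothetical stand-in with made-up values in the same shape as the question's data, and it assumes every list is non-empty and that pandas 0.25 or later is installed (for DataFrame.explode).

```python
import ast

import pandas as pd

# Hypothetical stand-in for the original dataframe (values are made up).
orig_df = pd.DataFrame({
    "population": [3507, 3507],
    "postcode": [2250, 2250],
    "salesGrowthList": [
        "[{'medianSoldPrice': 300000.0, 'annualGrowth': 0.0, "
        "'numberSold': 19, 'year': 2014}]",
        "[{'medianSoldPrice': 353000.0, 'annualGrowth': 0.0, "
        "'numberSold': 8, 'year': 2014}, "
        "{'medianSoldPrice': 0.0, 'annualGrowth': 0.0, "
        "'numberSold': 6, 'year': 2018}]",
    ],
})

# Parse each string into a list of dicts.
parsed = orig_df["salesGrowthList"].apply(ast.literal_eval)

# One row per dict; the other columns are repeated for each list entry.
long_df = (
    orig_df.drop(columns="salesGrowthList")
    .assign(sale=parsed)
    .explode("sale")
    .reset_index(drop=True)
)

# Spread the dict keys into columns and join them back side by side.
sales = pd.DataFrame(long_df["sale"].tolist())
result = pd.concat([long_df.drop(columns="sale"), sales], axis=1)
print(result)
```

Each row of result then carries the population and postcode alongside one year's sales figures, which is usually easier to filter and group than a column of lists.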