I have data coming from an API in the format below and I'd like to convert it to a tidy pandas DataFrame.
sample = '''{"rowHeaders":["Target","Month","Brand (TVEye)"],
"colHeaders":["Units","Values"],
"items":[["Adult",
"2019m1",
"1&1",
["1+ (Item Reach)","8,8"],
["2+ (Item Reach)","6,8"],
["3+ (Item Reach)","2,6"],
["4+ (Item Reach)","1,6"],
["5+ (Item Reach)","0,9"],
["6+ (Item Reach)","0,9"],
["7+ (Item Reach)","0,1"],
["8+ (Item Reach)","0,1"],
["9+ (Item Reach)","0,0"],
["10+ (Item Reach)","0,0"],
["TVR (U/W)","21,8"]],
["Adult",
"2019m2",
"1&1",
["1+ (Item Reach)","11,1"],
["2+ (Item Reach)","1,7"],
["3+ (Item Reach)","0,4"],
["4+ (Item Reach)","0,0"],
["5+ (Item Reach)","0,0"],
["6+ (Item Reach)","0,0"],
["7+ (Item Reach)","0,0"],
["8+ (Item Reach)","0,0"],
["9+ (Item Reach)","0,0"],
["10+ (Item Reach)","0,0"],
["TVR (U/W)","13,2"]],
["Adult",
"2019m3",
"1&1",
["1+ (Item Reach)","5,3"],
["2+ (Item Reach)","2,0"],
["3+ (Item Reach)","0,0"],
["4+ (Item Reach)","0,0"],
["5+ (Item Reach)","0,0"],
["6+ (Item Reach)","0,0"],
["7+ (Item Reach)","0,0"],
["8+ (Item Reach)","0,0"],
["9+ (Item Reach)","0,0"],
["10+ (Item Reach)","0,0"],
["TVR (U/W)","7,3"]]]}'''
However, because of its unusual nesting (each item mixes plain row-header values with [unit, value] pairs), none of the standard functions work, and I haven't been able to make any real progress.
How can I convert this JSON to a tidy pandas DataFrame that looks something like the table below (sorry that the numbers don't line up properly)?
Target Month Brand (TVEye) 1+ (Item Reach) 2+ (Item Reach) 3+ (Item Reach) 4+ (Item Reach) 5+ (Item Reach) 6+ (Item Reach) 7+ (Item Reach) 8+ (Item Reach) 9+ (Item Reach) 10+ (Item Reach) TVR (U/W)
Adult 2019m1 1&1 8,8 6,8 2,6 1,6 0,9 0,9 0,1 0,1 0,0 0,0 21,8
Adult 2019m2 1&1 11,1 1,7 0,4 0,0 0,0 0,0 0,0 0,0 0,0 0,0 13,2
Adult 2019m3 1&1 5,3 2,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 7,3
import pandas as pd
import json
sample = '''{"rowHeaders":["Target","Month","Brand (TVEye)"],
"colHeaders":["Units","Values"],
"items":[["Adult",
"2019m1",
"1&1",
["1+ (Item Reach)","8,8"],
["2+ (Item Reach)","6,8"],
["3+ (Item Reach)","2,6"],
["4+ (Item Reach)","1,6"],
["5+ (Item Reach)","0,9"],
["6+ (Item Reach)","0,9"],
["7+ (Item Reach)","0,1"],
["8+ (Item Reach)","0,1"],
["9+ (Item Reach)","0,0"],
["10+ (Item Reach)","0,0"],
["TVR (U/W)","21,8"]],
["Adult",
"2019m2",
"1&1",
["1+ (Item Reach)","11,1"],
["2+ (Item Reach)","1,7"],
["3+ (Item Reach)","0,4"],
["4+ (Item Reach)","0,0"],
["5+ (Item Reach)","0,0"],
["6+ (Item Reach)","0,0"],
["7+ (Item Reach)","0,0"],
["8+ (Item Reach)","0,0"],
["9+ (Item Reach)","0,0"],
["10+ (Item Reach)","0,0"],
["TVR (U/W)","13,2"]],
["Adult",
"2019m3",
"1&1",
["1+ (Item Reach)","5,3"],
["2+ (Item Reach)","2,0"],
["3+ (Item Reach)","0,0"],
["4+ (Item Reach)","0,0"],
["5+ (Item Reach)","0,0"],
["6+ (Item Reach)","0,0"],
["7+ (Item Reach)","0,0"],
["8+ (Item Reach)","0,0"],
["9+ (Item Reach)","0,0"],
["10+ (Item Reach)","0,0"],
["TVR (U/W)","7,3"]]]}'''
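# Parse the JSON string into a Python dict
jsample = json.loads(sample)
# Each item becomes one row: three row-header values followed by [unit, value] pairs
df = pd.DataFrame(jsample['items'])
# Take the unit names from the first row's pairs and use them as column names
df.columns = jsample['rowHeaders'] + df.iloc[0, 3:].map(lambda x: x[0]).to_list()
# Keep only the value from each [unit, value] pair
# (on pandas >= 2.1, DataFrame.map replaces the deprecated applymap)
df.iloc[:, 3:] = df.iloc[:, 3:].applymap(lambda x: x[1])
print(df)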
Output:
Target Month Brand (TVEye) 1+ (Item Reach) 2+ (Item Reach) 3+ (Item Reach) 4+ (Item Reach) 5+ (Item Reach) 6+ (Item Reach) 7+ (Item Reach) 8+ (Item Reach) 9+ (Item Reach) 10+ (Item Reach) TVR (U/W)
0 Adult 2019m1 1&1 8,8 6,8 2,6 1,6 0,9 0,9 0,1 0,1 0,0 0,0 21,8
1 Adult 2019m2 1&1 11,1 1,7 0,4 0,0 0,0 0,0 0,0 0,0 0,0 0,0 13,2
2 Adult 2019m3 1&1 5,3 2,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 0,0 7,3
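The values above are still strings with a decimal comma. If numeric columns are wanted, a variation along the following lines might help. This is only a sketch: the helper name items_to_frame is made up for illustration, the payload is a shortened stand-in for the data in the question, and it assumes that every item starts with exactly as many plain values as there are rowHeaders and that "8,8" means 8.8.

```python
import json
import pandas as pd

# Shortened, hypothetical payload in the same shape as the question's data
sample = '''{"rowHeaders":["Target","Month","Brand (TVEye)"],
"colHeaders":["Units","Values"],
"items":[["Adult","2019m1","1&1",["1+ (Item Reach)","8,8"],["TVR (U/W)","21,8"]],
["Adult","2019m2","1&1",["1+ (Item Reach)","11,1"],["TVR (U/W)","13,2"]]]}'''


def items_to_frame(payload):
    """Build one row per item; assumes the leading entries match rowHeaders."""
    headers = payload["rowHeaders"]
    n = len(headers)
    rows = []
    for item in payload["items"]:
        # The first n entries are the row-header values
        row = dict(zip(headers, item[:n]))
        # The remaining entries are [unit, value] pairs
        row.update({unit: value for unit, value in item[n:]})
        rows.append(row)
    df = pd.DataFrame(rows)
    # Assumption: values use a decimal comma, e.g. "8,8" means 8.8
    value_cols = df.columns[n:]
    df[value_cols] = df[value_cols].apply(
        lambda col: col.str.replace(",", ".", regex=False).astype(float)
    )
    return df


df = items_to_frame(json.loads(sample))
print(df)
```

Building each row as a dict keyed by unit name means the column names come from every item rather than only the first one, so an item that lacks a unit would end up with NaN in that column instead of shifting values into the wrong place.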