
Convert dictionary with named lists to pandas DataFrame

I have data coming from an API in the format below and I'd like to convert it to a tidy pandas DataFrame.

sample = '''{"rowHeaders":["Target","Month","Brand (TVEye)"],
                "colHeaders":["Units","Values"],
                "items":[["Adult",
                          "2019m1",
                          "1&1",
                          ["1+ (Item Reach)","8,8"],
                          ["2+ (Item Reach)","6,8"],
                          ["3+ (Item Reach)","2,6"],
                          ["4+ (Item Reach)","1,6"],
                          ["5+ (Item Reach)","0,9"],
                          ["6+ (Item Reach)","0,9"],
                          ["7+ (Item Reach)","0,1"],
                          ["8+ (Item Reach)","0,1"],
                          ["9+ (Item Reach)","0,0"],
                          ["10+ (Item Reach)","0,0"],
                          ["TVR (U/W)","21,8"]],
                        ["Adult",
                        "2019m2",
                        "1&1",
                        ["1+ (Item Reach)","11,1"],
                        ["2+ (Item Reach)","1,7"],
                        ["3+ (Item Reach)","0,4"],
                        ["4+ (Item Reach)","0,0"],
                        ["5+ (Item Reach)","0,0"],
                        ["6+ (Item Reach)","0,0"],
                        ["7+ (Item Reach)","0,0"],
                        ["8+ (Item Reach)","0,0"],
                        ["9+ (Item Reach)","0,0"],
                        ["10+ (Item Reach)","0,0"],
                        ["TVR (U/W)","13,2"]],
                        ["Adult",
                        "2019m3",
                        "1&1",
                        ["1+ (Item Reach)","5,3"],
                        ["2+ (Item Reach)","2,0"],
                        ["3+ (Item Reach)","0,0"],
                        ["4+ (Item Reach)","0,0"],
                        ["5+ (Item Reach)","0,0"],
                        ["6+ (Item Reach)","0,0"],
                        ["7+ (Item Reach)","0,0"],
                        ["8+ (Item Reach)","0,0"],
                        ["9+ (Item Reach)","0,0"],
                        ["10+ (Item Reach)","0,0"],
                        ["TVR (U/W)","7,3"]]]}'''

However, because of its unusual nested format, none of the standard conversion functions work, and I have made almost no progress.

How can I convert this dictionary to a tidy pandas DataFrame that looks something like the table below?

Target  Month   Brand (TVEye)  1+ (Item Reach)  2+ (Item Reach)  3+ (Item Reach)  4+ (Item Reach)  5+ (Item Reach)  6+ (Item Reach)  7+ (Item Reach)  8+ (Item Reach)  9+ (Item Reach)  10+ (Item Reach)  TVR (U/W)
Adult   2019m1  1&1            8,8              6,8              2,6              1,6              0,9              0,9              0,1              0,1              0,0              0,0               21,8
Adult   2019m2  1&1            11,1             1,7              0,4              0,0              0,0              0,0              0,0              0,0              0,0              0,0               13,2
Adult   2019m3  1&1            5,3              2,0              0,0              0,0              0,0              0,0              0,0              0,0              0,0              0,0               7,3
import pandas as pd
import json

sample = '''{"rowHeaders":["Target","Month","Brand (TVEye)"],
                "colHeaders":["Units","Values"],
                "items":[["Adult",
                          "2019m1",
                          "1&1",
                          ["1+ (Item Reach)","8,8"],
                          ["2+ (Item Reach)","6,8"],
                          ["3+ (Item Reach)","2,6"],
                          ["4+ (Item Reach)","1,6"],
                          ["5+ (Item Reach)","0,9"],
                          ["6+ (Item Reach)","0,9"],
                          ["7+ (Item Reach)","0,1"],
                          ["8+ (Item Reach)","0,1"],
                          ["9+ (Item Reach)","0,0"],
                          ["10+ (Item Reach)","0,0"],
                          ["TVR (U/W)","21,8"]],
                        ["Adult",
                        "2019m2",
                        "1&1",
                        ["1+ (Item Reach)","11,1"],
                        ["2+ (Item Reach)","1,7"],
                        ["3+ (Item Reach)","0,4"],
                        ["4+ (Item Reach)","0,0"],
                        ["5+ (Item Reach)","0,0"],
                        ["6+ (Item Reach)","0,0"],
                        ["7+ (Item Reach)","0,0"],
                        ["8+ (Item Reach)","0,0"],
                        ["9+ (Item Reach)","0,0"],
                        ["10+ (Item Reach)","0,0"],
                        ["TVR (U/W)","13,2"]],
                        ["Adult",
                        "2019m3",
                        "1&1",
                        ["1+ (Item Reach)","5,3"],
                        ["2+ (Item Reach)","2,0"],
                        ["3+ (Item Reach)","0,0"],
                        ["4+ (Item Reach)","0,0"],
                        ["5+ (Item Reach)","0,0"],
                        ["6+ (Item Reach)","0,0"],
                        ["7+ (Item Reach)","0,0"],
                        ["8+ (Item Reach)","0,0"],
                        ["9+ (Item Reach)","0,0"],
                        ["10+ (Item Reach)","0,0"],
                        ["TVR (U/W)","7,3"]]]}'''

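jsample = json.loads(sample)
# Each item becomes one row: three header values followed by [unit, value] pairs.
df = pd.DataFrame(jsample['items'])
# Column names: the row headers, then the unit label of each pair in the first row.
df.columns = jsample['rowHeaders'] + [pair[0] for pair in df.iloc[0, 3:]]
# Keep only the value from each [unit, value] pair (Series.map avoids the deprecated applymap).
for col in df.columns[3:]:
    df[col] = df[col].map(lambda pair: pair[1])
print(df)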

Output:

  Target   Month Brand (TVEye) 1+ (Item Reach) 2+ (Item Reach) 3+ (Item Reach) 4+ (Item Reach) 5+ (Item Reach) 6+ (Item Reach) 7+ (Item Reach) 8+ (Item Reach) 9+ (Item Reach) 10+ (Item Reach) TVR (U/W)
0  Adult  2019m1           1&1             8,8             6,8             2,6             1,6             0,9             0,9             0,1             0,1             0,0              0,0      21,8
1  Adult  2019m2           1&1            11,1             1,7             0,4             0,0             0,0             0,0             0,0             0,0             0,0              0,0      13,2
2  Adult  2019m3           1&1             5,3             2,0             0,0             0,0             0,0             0,0             0,0             0,0             0,0              0,0       7,3
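
As an alternative, here is a minimal sketch that builds one dictionary per row before creating the DataFrame, so each column name comes from the pairs themselves and not from the first row only. It uses a shortened stand-in for the payload with the same structure. It also assumes that the values use a decimal comma ("8,8" meaning 8.8) and converts them to floats; that conversion is an assumption about the data, not something the question states. The names payload, records, and value_cols are illustrative only.

```python
import json
import pandas as pd

# Shortened stand-in for the API payload in the question (same structure).
sample = '''{"rowHeaders":["Target","Month","Brand (TVEye)"],
             "colHeaders":["Units","Values"],
             "items":[["Adult","2019m1","1&1",
                       ["1+ (Item Reach)","8,8"],["TVR (U/W)","21,8"]],
                      ["Adult","2019m2","1&1",
                       ["1+ (Item Reach)","11,1"],["TVR (U/W)","13,2"]]]}'''

payload = json.loads(sample)
headers = payload["rowHeaders"]
n = len(headers)

records = []
for item in payload["items"]:
    # The first n entries are the row-header values.
    record = dict(zip(headers, item[:n]))
    # The remaining entries are [unit, value] pairs; each unit becomes a column.
    record.update({unit: value for unit, value in item[n:]})
    records.append(record)

df = pd.DataFrame(records)

# Assumption: values use a decimal comma, so "8,8" is parsed as 8.8.
value_cols = list(df.columns[n:])
df[value_cols] = df[value_cols].apply(
    lambda col: pd.to_numeric(col.str.replace(",", ".", regex=False))
)
print(df)
```

If the values should stay as strings, as in the output above, the final conversion step can simply be dropped.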
