Restructuring JSON data with Python

I have some JSON data that I want to reformat so I can pass it to pandas/Dash and create a stacked bar chart.

The JSON data can be seen here: https://ghostbin.co/paste/zzm4w

The structure I'm after is below:

[
   [
      "2020-11-30",
      [
         "2",
         "Jira Server"
      ],
      [
         "1",
         "Jira DataCenter"
      ],
      [
         "7",
         "Jira Cloud"
      ],
      [
         "0",
         "Confluence Server"
      ],
      [
         "0",
         "Confluence DataCenter"
      ],
      [
         "3",
         "Confluence Cloud"
      ],
      [
         "0",
         "Bitbucket Cloud"
      ],
      [
         "0",
         "Bitbucket Server"
      ],
      [
         "0",
         "Bamboo"
      ]
   ],
   [
      "2020-12-01",
      [
         "0",
         "Jira Server"
      ],
      [
         "2",
         "Jira DataCenter"
      ],
      [
         "6",
         "Jira Cloud"
      ],
      [
         "1",
         "Confluence Server"
      ],
      [
         "0",
         "Confluence DataCenter"
      ],
      [
         "0",
         "Confluence Cloud"
      ],
      [
         "0",
         "Bitbucket Cloud"
      ],
      [
         "0",
         "Bitbucket Server"
      ],
      [
         "0",
         "Bamboo"
      ]
   ]
]

I've written a function that takes in the JSON and attempts to structure it this way, but I end up with the following, which has extra nested layers:

[
   ["2020-11-30",
   [
      [
         [
            "2",
            "Jira Server"
         ]
      ],
      [
         [
            "1",
            "Jira DataCenter"
         ]
      ],
      [
         [
            "7",
            "Jira Cloud"
         ]
      ],
      [
         [
            "0",
            "Confluence Server"
         ]
      ],
      [
         [
            "0",
            "Confluence DataCenter"
         ]
      ],
      [
         [
            "3",
            "Confluence Cloud"
         ]
      ],
      [
         [
            "0",
            "Bitbucket Cloud"
         ]
      ],
      [
         [
            "0",
            "Bitbucket Server"
         ]
      ],
      [
         [
            "0",
            "Bamboo"
         ]
      ]
   ]],
   ["2020-12-01",
   [
      [
         [
            "0",
            "Jira Server"
         ]
      ],
      [
         [
            "2",
            "Jira DataCenter"
         ]
      ],
      [
         [
            "6",
            "Jira Cloud"
         ]
      ],
      [
         [
            "1",
            "Confluence Server"
         ]
      ],
      [
         [
            "0",
            "Confluence DataCenter"
         ]
      ],
      [
         [
            "0",
            "Confluence Cloud"
         ]
      ],
      [
         [
            "0",
            "Bitbucket Cloud"
         ]
      ],
      [
         [
            "0",
            "Bitbucket Server"
         ]
      ],
      [
         [
            "0",
            "Bamboo"
         ]
      ]
   ]]
]

Here's the code. It's messy because of the nested for loops; I'm only just starting to learn list comprehensions, which I feel would be a much tidier way of doing it. What would be the best approach to restructuring this data into the desired format?

import json

with open("data.json") as file:
    data = json.load(file)

def builder():
    final_list = []
    dates = [i[0] for i in data[0][1]]
    
    for date in dates:
        date_block = []
        for entry in data:
            hold = []
            for block in entry[1]:
                if date == block[0]:                  
                    obj = [block[1], entry[0]]
                    hold.append(obj)
            date_block.append(hold)
        final_obj = [date, date_block]
        final_list.append(final_obj)        
    print(final_list)
    return final_list   


builder()

First of all, the separate list obj is unnecessary: wrapping it in hold only adds an extra level of nesting around each pair, so the pair should be built once and appended directly.
Secondly, final_obj is built as [date, date_block], and date_block has the form [[count, name], ...], so final_obj == [date, [[count, name], ...]].
To remove the unwanted nesting, concatenate [date] with date_block, so that [date, [[count, name], ...]] becomes [date, [count, name], ...]:

import json
from pprint import pprint

with open("data.json") as file:
    data = json.load(file)

def builder():
    final_list = []
    # The dates are taken from the first product's series.
    dates = [i[0] for i in data[0][1]]

    for date in dates:
        date_block = []
        for entry in data:
            for block in entry[1]:
                if date == block[0]:
                    # Build the [count, name] pair once and append it only
                    # when this product actually has a block for the date.
                    hold = [block[1], entry[0]]
                    date_block.append(hold)
        # Concatenate so the pairs sit next to the date instead of inside
        # a nested list.
        final_obj = [date] + date_block
        final_list.append(final_obj)
    pprint(final_list)
    return final_list

builder()
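
Since the question mentions list comprehensions, the same restructuring can also be sketched as a nested comprehension. This is only a sketch under assumptions: build_rows and sample are hypothetical names, the sample values are copied from the desired output in the question, and the input shape [[name, [[date, count], ...]], ...] is inferred from the indexing in the code above rather than taken from the linked paste.

```python
from pprint import pprint

# Hypothetical sample in the shape the loops above appear to assume:
# [[product_name, [[date, count], ...]], ...]
sample = [
    ["Jira Server", [["2020-11-30", "2"], ["2020-12-01", "0"]]],
    ["Jira DataCenter", [["2020-11-30", "1"], ["2020-12-01", "2"]]],
    ["Jira Cloud", [["2020-11-30", "7"], ["2020-12-01", "6"]]],
]


def build_rows(data):
    """Return [[date, [count, name], ...], ...], one row per date."""
    # Dates come from the first product's series, as in builder() above.
    dates = [day for day, _ in data[0][1]]
    return [
        # [date] + [...] keeps the pairs as siblings of the date.
        [date] + [
            [count, name]
            for name, series in data
            for day, count in series
            if day == date
        ]
        for date in dates
    ]


rows = build_rows(sample)
pprint(rows)
```

Taking the data as a parameter instead of reading a module-level variable makes the function easier to try on a small sample before running it on the full file.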
