
Python: calculate the percentage difference between two JSON files

I have some local JSON files. For example:

JSON 1

{
"events":{
  "id1":{
     "name":"Marcus",
     "surname":"Redwhite",
     "age":"22",
     "text":{
        "description":"Some description ...",
        "title":"title of description"
     }
   },
  "id2":{
     "name":"Fred",
     "surname":"Rose",
     "age":"30",
     "text":{
        "description":"Some description ...",
        "title":"title of description"
     }
   }
}
}

JSON 1 Modified

{
"events":{
  "id1":{
     "name":"Marcus Modified",
     "surname":"Redwhite Modified",
     "age":"22",
     "text":{
        "description":"Some description ...",
        "title":"title of description Modified"
     }
   },
  "id2":{
     "name":"Fred",
     "surname":"Rose Modified",
     "age":"50",
     "text":{
        "description":"Some description ... Modified",
        "title":"title of description"
     }
   }
}
}

I have to compare these JSON files (in this example the name, surname, age and text fields were modified) and calculate the percentage of difference between them, so that I can draw a pie chart or some other graph. Is there a way to do it?

import json
import glob

# Get the lists of all JSON files in the two folders, sorted so that
# (assuming both folders use the same file names) matching files pair up
originalJsonFilesList = sorted(glob.glob("C:/Python/OriginalJson/*.json"))
modifiedJsonFilesList = sorted(glob.glob("C:/Python/ModifiedJson/*.json"))

# Loop over both lists in parallel
for originalFile, modifiedFile in zip(originalJsonFilesList, modifiedJsonFilesList):

    # Open the JSON files (original and modified) and load them as dictionaries;
    # the with-statement closes both files automatically
    with open(originalFile) as originalJson, open(modifiedFile) as modifiedJson:
        data1 = json.load(originalJson)
        data2 = json.load(modifiedJson)

    #############################################
    # Something for calculating the percentage
    # of difference between data1 and data2
    #############################################

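First, convert your JSON strings to dicts:

dict_json1 = json.loads(json_1)

dict_json_modified = json.loads(json_modified)

Second, convert them to sets of (key, value) pairs and take the symmetric difference, which contains every pair that appears in only one of the two:

items_json1 = set(dict_json1.items())

items_json_modified = set(dict_json_modified.items())

diff = items_json1 ^ items_json_modified

print(diff)

Note that set() needs hashable values, so this only works for flat dicts. With nested JSON like the example in the question, the values are themselves dicts and set() raises "TypeError: unhashable type: 'dict'", so the nested dicts have to be flattened into single-level key paths first.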
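
For nested JSON, one possible approach is to flatten each document into dotted key paths and count how many leaf fields differ. The following is a minimal sketch, not the answerer's method: the helper names flatten and percent_changed are hypothetical, and "percentage of difference" is assumed here to mean changed leaf fields divided by all leaf fields seen in either file.

```python
def flatten(obj, prefix=""):
    """Map each leaf value of a nested dict to a dotted key path."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def percent_changed(original, modified):
    """Percentage of leaf fields that were added, removed, or altered."""
    flat_a, flat_b = flatten(original), flatten(modified)
    all_keys = set(flat_a) | set(flat_b)
    if not all_keys:
        return 0.0
    missing = object()  # sentinel so an absent key never equals a real value
    changed = sum(
        1 for k in all_keys
        if flat_a.get(k, missing) != flat_b.get(k, missing)
    )
    return 100.0 * changed / len(all_keys)


# Cut-down version of the question's data: 2 of 4 leaf fields differ
original = {"events": {"id1": {"name": "Marcus", "surname": "Redwhite",
                               "age": "22", "text": {"title": "title"}}}}
modified = {"events": {"id1": {"name": "Marcus Modified", "surname": "Redwhite",
                               "age": "22", "text": {"title": "title Modified"}}}}
print(percent_changed(original, modified))  # 50.0
```

The two numbers (changed and unchanged field counts) could then be fed to whatever charting library you prefer for the pie chart.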

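One option is the Levenshtein distance (edit distance), which counts the minimum number of single-character insertions, deletions and substitutions needed to turn one string into the other.

There's another question similar to your situation with an alternative solution; you can take a look at it.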
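
To illustrate the Levenshtein idea, here is a rough sketch of the standard dynamic-programming version in plain Python. The function names are hypothetical, and dividing by the longer string's length to get a percentage is an assumption; other normalizations are possible.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits that turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def levenshtein_percent(a, b):
    """Edit distance as a percentage of the longer string's length."""
    longest = max(len(a), len(b))
    return 0.0 if longest == 0 else 100.0 * levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"))     # 3
print(levenshtein_percent("abcd", "abcf"))  # 25.0
```

This could be applied either per field value or to the whole serialized JSON text, depending on how fine-grained the comparison needs to be.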

You can also check SequenceMatcher from difflib, which is part of the standard library and returns a similarity ratio between 0 and 1.
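
A minimal sketch of the SequenceMatcher route, assuming it is acceptable to compare the two documents as serialized text. The function name sequence_percent is hypothetical; sort_keys=True is used so that key order alone does not count as a difference, and autojunk=False turns off difflib's heuristic that ignores frequently repeated characters in long inputs.

```python
import json
from difflib import SequenceMatcher


def sequence_percent(original, modified):
    """Percentage difference between two JSON-compatible objects,
    measured on their canonical text form."""
    text_a = json.dumps(original, sort_keys=True)
    text_b = json.dumps(modified, sort_keys=True)
    ratio = SequenceMatcher(None, text_a, text_b, autojunk=False).ratio()
    return 100.0 * (1.0 - ratio)


same = {"name": "Fred", "age": "30"}
print(sequence_percent(same, same))  # 0.0
print(sequence_percent({"name": "Fred"}, {"name": "Fred Modified"}))
```

Because this measures character-level similarity, a long unchanged description will dilute a small change elsewhere; counting changed fields gives a different kind of percentage.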

