
Python: calculate the percentage of difference between two JSON files

I have some local JSON files. For example:

JSON 1

{
"events":{
  "id1":{
     "name":"Marcus",
     "surname":"Redwhite",
     "age":"22",
     "text":{
        "description":"Some description ...",
        "title":"title of description"
     }
   },
  "id2":{
     "name":"Fred",
     "surname":"Rose",
     "age":"30",
     "text":{
        "description":"Some description ...",
        "title":"title of description"
     }
   }
  }
}

JSON 1 modified

{
"events":{
  "id1":{
     "name":"Marcus Modified",
     "surname":"Redwhite Modified",
     "age":"22",
     "text":{
        "description":"Some description ...",
        "title":"title of description Modified"
     }
   },
  "id2":{
     "name":"Fred",
     "surname":"Rose Modified",
     "age":"50",
     "text":{
        "description":"Some description ... Modified",
        "title":"title of description"
     }
   }
  }
}

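I have to compare these JSON files (in this example, the name, surname, age and text fields were modified) and calculate the percentage of difference between them, so that I can draw a pie chart or any other chart. Is there a way to do this?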

import json
import glob

# Get the list of all JSON files in the two folders.
# Sorting is an assumption: it pairs files by name order.
originalJsonFilesList = sorted(glob.glob("C:/Python/OriginalJson/*.json"))
modifiedJsonFilesList = sorted(glob.glob("C:/Python/ModifiedJson/*.json"))

# Loop over both lists in parallel (zip pairs the i-th original with the i-th modified file)
for originalFile, modifiedFile in zip(originalJsonFilesList, modifiedJsonFilesList):

    # Open the JSON files (original and modified) and load them as dictionaries;
    # the with block closes both files automatically
    with open(originalFile) as originalJson, open(modifiedFile) as modifiedJson:
        data1 = json.load(originalJson)
        data2 = json.load(modifiedJson)

    #############################################
    # Something for calculating the percentage
    # of difference between data1 and data2
    #############################################

First:

Convert your JSON to a dict:

dict_json1 = json.loads(json_1)

dict_json_modified = json.loads(json_modified)

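Second:

Convert them to sets and take the symmetric difference:

# Note: set() needs hashable values, so this only works for flat dicts
# (a nested dict value raises TypeError)
dict_json1 = set(dict_json1.items())
dict_json_modified = set(dict_json_modified.items())

difference = dict_json1 ^ dict_json_modified

print(difference)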
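
For nested JSON such as the example in the question, one possible workaround is to flatten each document into a mapping of path to leaf value before comparing. The sketch below assumes that "percentage of difference" means the share of leaf fields whose value changed or that exist in only one document; that definition, and the helper names `flatten` and `percent_changed`, are illustrative assumptions and not part of the original answer.

```python
def flatten(node, prefix=()):
    """Map every leaf value in nested dicts/lists to a tuple path.

    Empty dicts and lists contribute no paths (a simplification).
    """
    if isinstance(node, dict):
        items = node.items()
    elif isinstance(node, list):
        items = enumerate(node)
    else:
        return {prefix: node}
    flat = {}
    for key, value in items:
        flat.update(flatten(value, prefix + (key,)))
    return flat


def percent_changed(original, modified):
    """Percentage of leaf paths that differ or exist on one side only."""
    flat1, flat2 = flatten(original), flatten(modified)
    all_paths = set(flat1) | set(flat2)
    if not all_paths:
        return 0.0
    missing = object()  # sentinel for a path absent from one document
    changed = sum(
        1 for path in all_paths
        if flat1.get(path, missing) != flat2.get(path, missing)
    )
    return 100.0 * changed / len(all_paths)


# Reduced version of the question's data (only "id1")
original = {"events": {"id1": {
    "name": "Marcus", "surname": "Redwhite", "age": "22",
    "text": {"description": "Some description ...",
             "title": "title of description"}}}}
modified = {"events": {"id1": {
    "name": "Marcus Modified", "surname": "Redwhite Modified", "age": "22",
    "text": {"description": "Some description ...",
             "title": "title of description Modified"}}}}

print(percent_changed(original, modified))  # 3 of 5 leaves changed -> 60.0
```

This counts a field as either changed or unchanged; it does not measure how much a value changed. The resulting number (and its complement) can be fed directly into a pie chart.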

You need to implement an algorithm called the Levenshtein distance metric.
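
The answer names the algorithm without showing code, so here is a minimal sketch of the classic dynamic-programming version for two strings. Turning the distance into a percentage by dividing by the length of the longer string is an assumption made for illustration, as are the function names `levenshtein` and `levenshtein_percent`; the strings could be individual field values or whole serialized documents.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b
    (single-character insertions, deletions and substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the stored rows as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def levenshtein_percent(a, b):
    """Distance as a percentage of the longer string's length (assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return 100.0 * levenshtein(a, b) / longest


print(levenshtein("Rose", "Rose Modified"))  # 9 characters were appended -> 9
print(levenshtein_percent("abcd", "abcf"))   # 1 edit out of 4 -> 25.0
```

The running time is proportional to the product of the two string lengths, so this pure-Python version is best suited to field values or small files.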

There is also a question similar to your case with a different solution, which you can take a look at.

You can also check out SequenceMatcher from difflib.
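
As a rough sketch of the `difflib` route: serialize both documents in a canonical form and let `SequenceMatcher.ratio()` score their similarity. Serializing with `json.dumps(..., sort_keys=True)`, disabling `autojunk`, and reporting `100 * (1 - ratio)` as the difference are choices made here for illustration; the function name `sequence_percent` is hypothetical.

```python
import json
from difflib import SequenceMatcher


def sequence_percent(obj1, obj2):
    """Difference between two JSON-like objects as 100 * (1 - similarity ratio)."""
    # sort_keys makes the text independent of key order in the source files
    text1 = json.dumps(obj1, sort_keys=True)
    text2 = json.dumps(obj2, sort_keys=True)
    # autojunk=False turns off the popularity heuristic that can skew
    # results on sequences of 200 or more items
    ratio = SequenceMatcher(None, text1, text2, autojunk=False).ratio()
    return 100.0 * (1.0 - ratio)


data1 = {"name": "Fred", "surname": "Rose", "age": "30"}
data2 = {"name": "Fred", "surname": "Rose Modified", "age": "50"}

print(round(sequence_percent(data1, data2), 1))
```

Unlike a field-by-field count, this score grows with the amount of text that changed, so a long edit in one field weighs more than a one-character edit in another.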

