
difflib to compare two python dictionaries

I need help using difflib to compare two dicts. My program takes two JSON files and converts them to Python dicts. I would then like to use difflib on the two dicts to display the differences between them.

What's the correct way to use difflib for this?

#!/usr/bin/env python2

import json
import collections
import difflib
import pprint

def get_json():
    file_name = raw_input("Enter name of JSON File: ")
    with open(file_name) as json_file:
        json_data = json.load(json_file)
        return json_data

def convert(data):
    if isinstance(data, basestring):
        return str(data)
    elif isinstance(data, collections.Mapping):
        return dict(map(convert, data.iteritems()))
    elif isinstance(data, collections.Iterable):
        return type(data)(map(convert, data))
    else:
        return data

def main():
    json1 = get_json()
    json2 = get_json()
    json1_dict = convert(json1)
    json2_dict = convert(json2)
    result = list(difflib.Differ.compare(json1_dict, json2_dict))
    pprint.pprint(result)

if __name__ == "__main__":
    main()

JSON example:

{
    "glossary": {
        "title": "example glossary",
        "GlossDiv": {
            "title": "S",
            "GlossList": {
                "GlossEntry": {
                    "ID": "SGML",
                    "SortAs": "SGML",
                    "GlossTerm": "Standard Generalized Markup Language",
                    "Acronym": "SGML",
                    "Abbrev": "ISO 8879:1986",
                    "GlossDef": {
                        "para": "A meta-markup language, used to create markup languages such as DocBook.",
                        "GlossSeeAlso": [
                            "GML",
                            "XML"
                        ]
                    },
                    "GlossSee": "markup"
                }
            }
        }
    }
}

Then change the value of ID to "1234" in a second file.

I want to compare the two files and get output something like this:

{
    "glossary": {
        "title": "example glossary",
        "GlossDiv": {
            "title": "S",
            "GlossList": {
                "GlossEntry": {
-                   "ID": "SGML",
+                   "ID": "1234",
                    "SortAs": "SGML",
                    "GlossTerm": "Standard Generalized Markup Language",
                    "Acronym": "SGML",
                    "Abbrev": "ISO 8879:1986",
                    "GlossDef": {
                        "para": "A meta-markup language, used to create markup languages such as DocBook.",
                        "GlossSeeAlso": [
                            "GML",
                            "XML"
                        ]
                    },
                    "GlossSee": "markup"
                }
            }
        }
    }
}

You have a few issues here. First off, you're trying to use the method difflib.Differ.compare, but you're calling it as a plain function: you have not actually created a difflib.Differ object.
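A minimal sketch of that first point, written for Python 3 (the question's script targets Python 2, but the difflib call is the same). The two short input lists are made up for illustration; they are not the asker's data.

```python
import difflib

# Hypothetical inputs: each one is already a sequence of strings.
old_lines = ["one", "two", "three"]
new_lines = ["one", "2", "three"]

# Calling compare on the class fails with TypeError, because no
# Differ instance is supplied for `self`.
try:
    difflib.Differ.compare(old_lines, new_lines)
    class_call_failed = False
except TypeError:
    class_call_failed = True

# Create a Differ object first, then call compare on that object.
differ = difflib.Differ()
result = list(differ.compare(old_lines, new_lines))
for line in result:
    print(line)
```

Each output line starts with a two-character code: two spaces for a line common to both inputs, "- " for a line only in the first, and "+ " for a line only in the second.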

Second, this compare method expects you to operate upon a sequence of strings (for each of the two things being compared). Your convert function is sometimes returning strings, sometimes dicts, sometimes other stuff... in general, you're not getting back sequences of strings.
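To see why a dict is the wrong shape for compare, here is a small check, assuming Python 3 (where the abstract base classes live in collections.abc). The one-entry document is a cut-down, illustrative version of the glossary example.

```python
import json
from collections.abc import Sequence

# Cut-down stand-in for the glossary document in the question.
data = json.loads('{"glossary": {"title": "example glossary"}}')

# A dict is a mapping, not a sequence.
is_sequence = isinstance(data, Sequence)

# Iterating a dict yields only its top-level keys; the nested
# values are never visited.
top_level = list(data)

print(is_sequence)
print(top_level)
```

So even with a proper Differ object, passing the dicts themselves would never hand the nested content to difflib as lines of text.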

The natural way to get what you want is to just compare the actual JSON data, because that's a string. However, there are two issues there:

  • you want a sequence of strings (line-by-line) instead of a single string with the whole JSON document, but that's trivial: just split it up into lines with the string .splitlines method.

  • your input might have differences in whitespace that you want to ignore. The simple way around this is to, after loading each JSON document into an object, re-create a string for it with dumps. The idea is that for both documents that you're comparing, you will dump with the same whitespace settings. You need to read the documentation and decide what settings you want to use.
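Putting those steps together, one possible sketch in Python 3 follows. The helper name json_lines, the indent=4 and sort_keys=True settings, and the two inline documents are all assumptions chosen for illustration. In the asker's program the objects would come from json.load on the two files.

```python
import difflib
import json


def json_lines(obj):
    """Dump obj with fixed formatting settings and split it into lines.

    Hypothetical helper: using the same indent and key order for both
    documents keeps whitespace or ordering differences out of the diff.
    """
    text = json.dumps(obj, indent=4, sort_keys=True)
    return text.splitlines()


# Inline stand-ins for the two parsed JSON files.
old = json.loads('{"GlossEntry": {"ID": "SGML", "SortAs": "SGML"}}')
new = json.loads('{"GlossEntry": {"ID": "1234", "SortAs": "SGML"}}')

# Differ needs an instance, and compare needs two sequences of strings.
diff = list(difflib.Differ().compare(json_lines(old), json_lines(new)))
print("\n".join(diff))
```

The changed "ID" line shows up once with a "- " prefix and once with a "+ " prefix, while unchanged lines are prefixed with two spaces. Differ may also emit lines starting with "? " that point at the characters that changed within a line. Filter those out if only the "-"/"+" view is wanted.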

