“diff -u -B -w” in python?

Using Python, I'd like to output the difference between two strings as a unified diff (-u) while, optionally, ignoring blank lines (-B) and spaces (-w).

Since the strings were generated internally, I'd prefer not to deal with the complexity of writing one or both strings to a file, running GNU diff, fixing up the output, and finally cleaning up.

While difflib.unified_diff generates unified diffs, it doesn't seem to let me tweak how spaces and blank lines are handled. I've looked at its implementation and I suspect the only solution is to copy/hack that function's body.

Is there anything better?

For the moment I'm stripping the pad characters using something like:

import difflib
import re
import sys

l = "line 1\nline 2\nline 3\n"
r = "\nline 1\n\nline 2\nline3\n"
strip_spaces = True
strip_blank_lines = True

if strip_spaces:
    l = re.sub(r"[ \t]+", r"", l)
    r = re.sub(r"[ \t]+", r"", r)
if strip_blank_lines:
    l = re.sub(r"^\n", r"", re.sub(r"\n+", r"\n", l))
    r = re.sub(r"^\n", r"", re.sub(r"\n+", r"\n", r))
# run diff
diff = difflib.unified_diff(l.splitlines(keepends=True), r.splitlines(keepends=True))
sys.stdout.writelines(list(diff))

which, of course, produces a diff of something other than the original input. For instance, pass the above text to GNU diff 3.3 run as "diff -u -w" and "line 3" is displayed as part of the context, whereas the code above would display "line3".

Make your own SequenceMatcher, copy the body of unified_diff, and replace SequenceMatcher with your own matcher.
