
Levenshtein Distance-like algorithm for diffing in-memory objects?

Java 8 here, though the answer should apply to any language.

I have a problem where I need to compare two objects, say, Widgets, and produce a "diff" between them: that is, a set of steps that, if followed, will transform one Widget (the source) into the other (the target).

class Widget {
    // Properties and such.
}

class WidgetDiffer extends Differ<Widget> {
    List<Transformation> diff(Widget source, Widget target) {
        // The produced list will convert source to target, if executed
        // by some runtime.
    }
}

class WidgetTransformer extends Transformer<Widget> {
    @Override
    Widget transformSourceToTarget(Widget source, List<Transformation> transforms) {
        // Somehow, run 'transforms' on 'source', which *should*
        // produce an object with the same state/properties as
        // the original target.
    }
}

I am aware of the Levenshtein Distance algorithm for string transformations, but:

  • That's just for strings, not Widgets; and
  • It only gives you an integer (the number of transformations required to turn the source into the target), whereas I need a List<Transformation> that, when executed by some engine, turns the source into the target

I'm wondering if there are any known algorithms for this type of operation. Is there any chance these algorithms live in a library somewhere?

I see it as a search problem. Construct a graph where the start node is the widget to be transformed and the destination node is the desired widget. Each edge in the graph represents one possible transformation of the widget (adding or removing a property). Once the graph is constructed, run a breadth-first search with path extraction on it and you will get the steps needed to transform the starting widget into the desired one. Because every edge counts as one step, the BFS path is also the minimum number of steps; a plain DFS would find a path, but not necessarily the shortest one.
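To make the search idea concrete, here is a minimal sketch under a few assumptions that are not in the question: a widget is modelled as a Map<String, Integer> with non-null values, and the only edges are "remove a property the target lacks" and "set a property to the target's value". The names WidgetSearch and shortestPath are made up for illustration. Note that the state space grows exponentially with the number of differing properties, so this is only practical for small widgets.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

class WidgetSearch {

    // Breadth-first search from 'source' to 'target'; returns a shortest list of steps.
    // Assumption: property values are never null.
    static List<String> shortestPath(Map<String, Integer> source, Map<String, Integer> target) {
        Map<Map<String, Integer>, Map<String, Integer>> parent = new HashMap<>();
        Map<Map<String, Integer>, String> stepInto = new HashMap<>();
        Deque<Map<String, Integer>> queue = new ArrayDeque<>();

        Map<String, Integer> start = new HashMap<>(source);
        parent.put(start, null); // the start state has no parent
        queue.add(start);

        while (!queue.isEmpty()) {
            Map<String, Integer> current = queue.remove();
            if (current.equals(target)) {
                // Path extraction: walk the parent links back to the start.
                LinkedList<String> path = new LinkedList<>();
                for (Map<String, Integer> s = current; parent.get(s) != null; s = parent.get(s)) {
                    path.addFirst(stepInto.get(s));
                }
                return path;
            }

            // Collect neighbouring states together with the step that produces them.
            Map<Map<String, Integer>, String> neighbours = new LinkedHashMap<>();
            for (String key : current.keySet()) {
                if (!target.containsKey(key)) {
                    Map<String, Integer> next = new HashMap<>(current);
                    next.remove(key);
                    neighbours.put(next, "remove " + key);
                }
            }
            for (Map.Entry<String, Integer> e : target.entrySet()) {
                if (!e.getValue().equals(current.get(e.getKey()))) {
                    Map<String, Integer> next = new HashMap<>(current);
                    next.put(e.getKey(), e.getValue());
                    neighbours.put(next, "set " + e.getKey() + "=" + e.getValue());
                }
            }

            // Enqueue every state that has not been seen before.
            for (Map.Entry<Map<String, Integer>, String> n : neighbours.entrySet()) {
                if (!parent.containsKey(n.getKey())) {
                    parent.put(n.getKey(), current);
                    stepInto.put(n.getKey(), n.getValue());
                    queue.add(n.getKey());
                }
            }
        }
        throw new IllegalStateException("target not reachable");
    }

    public static void main(String[] args) {
        Map<String, Integer> source = new HashMap<>();
        source.put("a", 1);
        source.put("b", 2);
        Map<String, Integer> target = new HashMap<>();
        target.put("b", 20);
        target.put("c", 30);
        System.out.println(shortestPath(source, target));
    }
}
```

The maps used as graph nodes are copied before every change and never mutated afterwards, which is what makes them safe to use as HashMap keys here.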

If the widgets are just key->value bags, then the problem is pretty simple.

Here's a JavaScript version (you could use it as pseudocode for the Java implementation).

function diff(src, target) {
  var result = [];
  for(var key in src) {
    if(key in target) { 
      if(src[key] !== target[key]) {
        result.push({op:"update", name:key, value:target[key]});
      }
    } else {
      result.push({op:"delete", name:key});
    }
  }
  for(var key in target) {
    if(!(key in src)) {
      result.push({op:"add", name:key, value:target[key]});
    }
  }
  return result;
}

console.log(JSON.stringify(diff({}, {a:1, b:2, c:3})));
console.log(JSON.stringify(diff({a:1, b:2, c:3}, {})));
console.log(JSON.stringify(diff({a:1, b:2, c:3}, {b:20, c:30, d:40})));

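The running time is O(srcPropCount * lookupTargetProp + targetPropCount * lookupSrcProp). If property lookup is hash-based, that works out to O(srcPropCount + targetPropCount) on average.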

The only operations are Add a new property, Update an existing property, and Delete a property.
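For the Java side of the question, the same idea might look roughly like the sketch below. It assumes a widget can be viewed as a Map<String, Object>; MapDiff, Op, Transformation and apply are hypothetical names chosen for this example, not part of any library. Unlike the JavaScript `!==` comparison, it compares values with Objects.equals.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class MapDiff {

    enum Op { ADD, UPDATE, DELETE }

    // One step of the diff; 'value' is unused (null) for DELETE.
    static final class Transformation {
        final Op op;
        final String name;
        final Object value;

        Transformation(Op op, String name, Object value) {
            this.op = op;
            this.name = name;
            this.value = value;
        }

        @Override
        public String toString() {
            return op + " " + name + (op == Op.DELETE ? "" : "=" + value);
        }
    }

    // Produces the steps that turn 'source' into 'target'.
    static List<Transformation> diff(Map<String, Object> source, Map<String, Object> target) {
        List<Transformation> result = new ArrayList<>();
        for (Map.Entry<String, Object> e : source.entrySet()) {
            String key = e.getKey();
            if (!target.containsKey(key)) {
                result.add(new Transformation(Op.DELETE, key, null));
            } else if (!Objects.equals(e.getValue(), target.get(key))) {
                result.add(new Transformation(Op.UPDATE, key, target.get(key)));
            }
        }
        for (Map.Entry<String, Object> e : target.entrySet()) {
            if (!source.containsKey(e.getKey())) {
                result.add(new Transformation(Op.ADD, e.getKey(), e.getValue()));
            }
        }
        return result;
    }

    // Runs the steps on a copy of 'source' and returns the transformed copy.
    static Map<String, Object> apply(Map<String, Object> source, List<Transformation> steps) {
        Map<String, Object> result = new HashMap<>(source);
        for (Transformation t : steps) {
            if (t.op == Op.DELETE) {
                result.remove(t.name);
            } else {
                result.put(t.name, t.value); // ADD and UPDATE both store the new value
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new HashMap<>();
        source.put("a", 1);
        source.put("b", 2);
        source.put("c", 3);
        Map<String, Object> target = new HashMap<>();
        target.put("b", 20);
        target.put("c", 30);
        target.put("d", 40);

        List<Transformation> steps = diff(source, target);
        System.out.println(steps);
        System.out.println(apply(source, steps).equals(target));
    }
}
```

The apply method plays the role of the WidgetTransformer from the question: running the produced list against the source should give a map equal to the target.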

