
Implementation of set reconciliation algorithm

I'm looking for implementations of a set reconciliation algorithm. The problem is as follows: two sets, whose elements are identified by some relatively compact value (e.g. a UUID or an MD5/SHA1/whatever hash), sit on different machines. The sets differ in relatively few elements, and I want to synchronize them while transferring a minimal amount of data. Most googling leads here. This is a GPL'd implementation of what seems to be the state-of-the-art approach to the task. The problem is that I can't use GPL'd code in my app. Most likely I'll have to reimplement it myself using something like nzmath, but maybe there are other implementations (preferably Python or C/C++), or maybe there are other, nicer algorithms?

Not being able to use GPL code is often a matter of abstraction, that is, if it is the license you have a problem with. If you create a small application released under the GPL, you can call it from your non-GPL application. Why reinvent the wheel?

Especially if you can use a Python script which already exists: why not leverage it? Of course, things are different if you cannot expose the element reconciliation algorithms.
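One way to read "call it from your non-GPL application" is to run the GPL'd program as a separate process and exchange data over stdin/stdout. The following is only a rough sketch of that idea: `run_external_tool` and `STAND_IN` are hypothetical names, the real tool's command line is not given in this thread, and a child Python process that merely sorts its input stands in for it so the sketch can run anywhere.

```python
import subprocess
import sys


def run_external_tool(command, local_ids):
    """Pipe one ID per line to a separate process and return its output lines.

    `command` is a hypothetical argv list for a standalone program; the
    real tool's name and flags are not known from this thread.
    """
    result = subprocess.run(
        command,
        input="\n".join(local_ids) + "\n",
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout.splitlines()


# Stand-in for the external program (an assumption for illustration only):
# a child Python process that echoes the IDs back in sorted order.
STAND_IN = [
    sys.executable,
    "-c",
    "import sys; print('\\n'.join(sorted(l.strip() for l in sys.stdin if l.strip())))",
]

if __name__ == "__main__":
    print(run_external_tool(STAND_IN, ["b2", "a1", "c3"]))
```

Whether process separation of this kind satisfies a particular license is a legal question that the sketch does not settle.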

This code is off the top of my head, and is thus covered by whatever license applies to code samples on this site.

# Given two finite sequences of unique, hashable items,
# yield the opcodes and data needed for reconciliation.

def set_reconcile(src_seq, dst_seq):
    "Return required operations to mutate src_seq into dst_seq"
    src_set = set(src_seq)  # builds a new set even if src_seq is already one
    dst_set = set(dst_seq)  # ditto

    # Items only in the source must be removed.
    for item in src_set - dst_set:
        yield 'delete', item

    # Items only in the destination must be added.
    for item in dst_set - src_set:
        yield 'create', item

Use as follows:

for opcode, datum in set_reconcile(machine1_stuff, machine2_stuff):
    if opcode == 'create':
        pass  # act accordingly
    elif opcode == 'delete':
        pass  # likewise
    else:
        raise RuntimeError('unexpected opcode')
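As a quick sanity check of the opcode idea, here is a self-contained sketch that applies the yielded operations to a copy of the source set and confirms that it ends up equal to the destination. `apply_ops` and the sample data are illustrative additions, not part of the answer above.

```python
def set_reconcile(src_seq, dst_seq):
    """Yield the operations needed to turn src_seq into dst_seq."""
    src_set = set(src_seq)
    dst_set = set(dst_seq)
    for item in src_set - dst_set:
        yield 'delete', item
    for item in dst_set - src_set:
        yield 'create', item


def apply_ops(items, ops):
    """Return a new set with the given (opcode, datum) pairs applied."""
    result = set(items)
    for opcode, datum in ops:
        if opcode == 'create':
            result.add(datum)
        elif opcode == 'delete':
            result.discard(datum)
        else:
            raise RuntimeError('unexpected opcode')
    return result


# Hypothetical sample data standing in for the two machines' ID sets.
machine1_stuff = {'a', 'b', 'c'}
machine2_stuff = {'b', 'c', 'd'}

ops = list(set_reconcile(machine1_stuff, machine2_stuff))
assert apply_ops(machine1_stuff, ops) == machine2_stuff
```

Note that computing the two set differences this way requires both collections to be available in one place, so on its own it does not reduce the amount of data transferred between machines.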

The Synchronizing Keyserver project implements efficient set reconciliation in OCaml.
