簡體   English   中英

如何在Python映射的值中比較列表中的元素,並檢查是否至少有n個元素匹配?

[英]How to compare elements of a list in a Python map's value, and check to see if at least n number of elements match?

我想遍歷地圖的值並比較列表中的元素,以查看至少3個元素是否以相同順序匹配,然后返回一個列表,並返回與條件匹配的鍵。

prefs = {
        's1': ["a", "b", "c", "d", "e"],
        's2': ["c", "d", "e", "a", "b"],
        's3': ["a", "b", "c", "d", "e"],
        's4': ["c", "d", "e", "b", "e"],
        's5': ["c", "d", "e", "a", "b"]
    }

這是示例地圖。 在此示例中,鍵s1和s3在列表值中具有至少三個與“ a”,“ b”,“ c”匹配的元素。 因此s1和s3應該像s1-s3這樣返回。 類似地,s2和s4匹配,因此也應返回,但s2具有多個匹配項,因為它也與s5匹配,因此應返回s2-s5。 我想為列表中的每個鍵值對返回所有可能的匹配項。 返回輸出應類似於:

[[s1--s3], [s2--s4], [s2--s5], [s4--s5]]

我無法弄清楚如何遍歷映射中的每個值,但這是按元素進行比較的片段。 我想知道是否可以設置一個計數器,並檢查match_cnt> 3,然后返回列表中的鍵。

a = ["a", "b", "c", "d", "e"]
b = ["a", "c", "b", "d", "e"]
match_cnt = 0

if len(a) == len(b):
    for i in range(len(a)):
        if a[i] == b[i]:
            print(a[i], b[i])

此外,還需要一些有關此算法運行時的知識。 完整的代碼解決方案將不勝感激。 有人建議我在這里打開一個新問題

您可以使用.items()遍歷地圖,然后使用切片將其與前三個列表項匹配:

prefs = {
    's1': ["a", "b", "c", "d", "e"],
    's2': ["c", "d", "e", "a", "b"],
    's3': ["a", "b", "c", "d", "e"],
    's4': ["c", "d", "e", "b", "e"],
    's5': ["c", "d", "e", "a", "b"]
}

results = []
for ki, vi in prefs.items():
    for kj, vj in prefs.items():
        if ki == kj:  # skip checking same values on same keys !
            continue

        if vi[:3] == vj[:3]:  # slice the lists to test first 3 characters
            match = tuple(sorted([ki, kj]))  # sort results to eliminate duplicates
            results.append(match)

print (set(results))  # print a unique set

返回值:

set([('s1', 's3'), ('s4', 's5'), ('s2', 's5'), ('s2', 's4')])

編輯:
要檢查所有可能的組合,可以使用itertools中的combinations()。 iCombinations / jCombinations保留長度為3個列表項的訂單:

from itertools import combinations

prefs = {
    's1': ["a", "b", "c", "d", "e"],
    's2': ["c", "d", "e", "a", "b"],
    's3': ["a", "b", "c", "d", "e"],
    's4': ["c", "d", "e", "b", "e"],
    's5': ["c", "d", "e", "a", "b"]
}

results = []
for ki, vi in prefs.items():
    for kj, vj in prefs.items():
        if ki == kj:  # skip checking same values on same keys !
            continue

        # match pairs from start
        iCombinations = [vi[n:n+3] for n in range(len(vi)-2)]
        jCombinations = [vj[n:n+3] for n in range(len(vj)-2)]

        # match all possible combinations
        import itertools
        iCombinations = itertools.combinations(vi, 3)
        jCombinations = itertools.combinations(vj, 3)

        if any([ic in jCombinations for ic in iCombinations]):  # checking all combinations
            match = tuple(sorted([ki, kj]))
            results.append(match)

print (set(results))  # print a unique set

返回:

set([('s1', 's3'), ('s2', 's5'), ('s3', 's5'), ('s2', 's3'), ('s2', 's4'), ('s1', 's4'), ('s1', 's5'), ('s3', 's4'), ('s4', 's5'), ('s1', 's2')])

我試圖盡可能詳細。 這應該是一個示例,您可以通過插入大量print消息來創建正在發生的情況的日志,從而經常解決該問題。

prefs = {
    's1': ["a", "b", "c", "d", "e"],
    's2': ["c", "d", "e", "a", "b"],
    's3': ["a", "b", "c", "d", "e"],
    's4': ["c", "d", "e", "b", "e"],
    's5': ["c", "d", "e", "a", "b"]
}

# Get all items of prefs and sort them by key. (Sorting might not be
# necessary, that's something you'll have to decide.)
items_a = sorted(prefs.items(), key=lambda item: item[0])

# Make a copy of the items where we can delete the processed items.
items_b = items_a.copy()

# Set the length for each compared slice.
slice_length = 3

# Calculate how many comparisons will be necessary per item.
max_shift = len(items_a[0][1]) - slice_length

# Create an empty result list for all matches.
matches = []

# Loop all items
print("Comparisons:")
for key_a, value_a in items_a:
    # We don't want to check items against themselves, so we have to
    # delete the first item of items_b every loop pass (which would be
    # the same as key_a, value_a).
    del items_b[0]
    # Loop remaining other items
    for key_b, value_b in items_b:
        print("- Compare {} to {}".format(key_a, key_b))
        # We have to shift the compared slice
        for shift in range(max_shift + 1):
            # Start the slice at 0, then shift it
            start = 0 + shift
            # End the slice at slice_length, then shift it
            end = slice_length + shift
            # Create the slices
            slice_a = value_a[start:end]
            slice_b = value_b[start:end]
            print("  - Compare {} to {}".format(slice_a, slice_b), end="")
            if slice_a == slice_b:
                print(" -> Match!", end="")
                matches += [(key_a, key_b, shift)]
            print("")

print("Matches:")
for key_a, key_b, shift in matches:
    print("- At positions {} to {} ({} elements), {} matches with {}".format(
        shift + 1, shift + slice_length, slice_length, key_a, key_b))

哪些打印:

Comparisons:
- Compare s1 to s2
  - Compare ['a', 'b', 'c'] to ['c', 'd', 'e']
  - Compare ['b', 'c', 'd'] to ['d', 'e', 'a']
  - Compare ['c', 'd', 'e'] to ['e', 'a', 'b']
- Compare s1 to s3
  - Compare ['a', 'b', 'c'] to ['a', 'b', 'c'] -> Match!
  - Compare ['b', 'c', 'd'] to ['b', 'c', 'd'] -> Match!
  - Compare ['c', 'd', 'e'] to ['c', 'd', 'e'] -> Match!
- Compare s1 to s4
  - Compare ['a', 'b', 'c'] to ['c', 'd', 'e']
  - Compare ['b', 'c', 'd'] to ['d', 'e', 'b']
  - Compare ['c', 'd', 'e'] to ['e', 'b', 'e']
- Compare s1 to s5
  - Compare ['a', 'b', 'c'] to ['c', 'd', 'e']
  - Compare ['b', 'c', 'd'] to ['d', 'e', 'a']
  - Compare ['c', 'd', 'e'] to ['e', 'a', 'b']
- Compare s2 to s3
  - Compare ['c', 'd', 'e'] to ['a', 'b', 'c']
  - Compare ['d', 'e', 'a'] to ['b', 'c', 'd']
  - Compare ['e', 'a', 'b'] to ['c', 'd', 'e']
- Compare s2 to s4
  - Compare ['c', 'd', 'e'] to ['c', 'd', 'e'] -> Match!
  - Compare ['d', 'e', 'a'] to ['d', 'e', 'b']
  - Compare ['e', 'a', 'b'] to ['e', 'b', 'e']
- Compare s2 to s5
  - Compare ['c', 'd', 'e'] to ['c', 'd', 'e'] -> Match!
  - Compare ['d', 'e', 'a'] to ['d', 'e', 'a'] -> Match!
  - Compare ['e', 'a', 'b'] to ['e', 'a', 'b'] -> Match!
- Compare s3 to s4
  - Compare ['a', 'b', 'c'] to ['c', 'd', 'e']
  - Compare ['b', 'c', 'd'] to ['d', 'e', 'b']
  - Compare ['c', 'd', 'e'] to ['e', 'b', 'e']
- Compare s3 to s5
  - Compare ['a', 'b', 'c'] to ['c', 'd', 'e']
  - Compare ['b', 'c', 'd'] to ['d', 'e', 'a']
  - Compare ['c', 'd', 'e'] to ['e', 'a', 'b']
- Compare s4 to s5
  - Compare ['c', 'd', 'e'] to ['c', 'd', 'e'] -> Match!
  - Compare ['d', 'e', 'b'] to ['d', 'e', 'a']
  - Compare ['e', 'b', 'e'] to ['e', 'a', 'b']
Matches:
- At positions 1 to 3 (3 elements), s1 matches with s3
- At positions 2 to 4 (3 elements), s1 matches with s3
- At positions 3 to 5 (3 elements), s1 matches with s3
- At positions 1 to 3 (3 elements), s2 matches with s4
- At positions 1 to 3 (3 elements), s2 matches with s5
- At positions 2 to 4 (3 elements), s2 matches with s5
- At positions 3 to 5 (3 elements), s2 matches with s5
- At positions 1 to 3 (3 elements), s4 matches with s5

尚不清楚,您的輸出真正應該是什么。 但是,我認為將上述代碼轉換為您的需求不會有任何問題。

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM