Find a group of letters repeated across the words in a list

Question

I have a list of words:

list1 = ['technology','technician','technical','technicality']

I want to find which group of letters is repeated in every word. In this case, it is 'tech'. I have tried converting all the characters to ASCII values, but I am stuck there because I cannot think of any logic. Can somebody please help me with this?

Answer 1

This is generally called the longest common substring problem (or, if the shared letters need not be contiguous, the longest common subsequence problem).
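
To make the distinction concrete, here is a minimal sketch of the two notions. The helper names `is_substring` and `is_subsequence` are illustrative and not part of the original answer: a substring must be contiguous, while a subsequence only has to keep the letters in order.

```python
def is_substring(part, word):
    # A substring is a contiguous run of characters inside the word.
    return part in word


def is_subsequence(part, word):
    # A subsequence keeps the order of characters but may skip some.
    # Each `ch in it` consumes the iterator up to the first match,
    # so later characters can only be found further to the right.
    it = iter(word)
    return all(ch in it for ch in part)


print(is_substring("tech", "technology"))    # contiguous, so True
print(is_substring("tcy", "technology"))     # not contiguous, so False
print(is_subsequence("tcy", "technology"))   # in order with gaps, so True
```

The question asks for a contiguous group of letters, so the substring variant is the relevant one here.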

A very basic (but slow) strategy:

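list1 = ['technology', 'technician', 'technical', 'technicality']

# Take candidate substrings from the shortest word.
shortest_word = min(list1, key=len)
longest_substring = ""

# Loop over every start position in the shortest word.
for start_idx in range(len(shortest_word)):

    # Grow a substring from that start position.
    for length in range(1, len(shortest_word) - start_idx + 1):
        curr_substring = shortest_word[start_idx : start_idx + length]

        # If the substring is missing from any word, a longer one
        # from the same start cannot match either, so stop growing.
        if not all(curr_substring in word for word in list1):
            break

        # Keep the longest substring seen so far.
        if len(curr_substring) > len(longest_substring):
            longest_substring = curr_substring

print(longest_substring)  # techn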
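
If the basic strategy is too slow, one possible speed-up (a sketch, not something proposed in the original answer) is to binary-search the substring length. This relies on the observation that if the words share a substring of length k, they also share one of every shorter length. The names `substrings_of_length` and `longest_common_substring` are hypothetical helpers introduced for illustration.

```python
def substrings_of_length(word, k):
    # All contiguous chunks of exactly k characters from the word.
    return {word[i:i + k] for i in range(len(word) - k + 1)}


def longest_common_substring(words):
    if not words:
        return ""
    lo, hi = 0, min(map(len, words))
    best = ""
    # Invariant: a common substring of length lo exists (length 0 is trivial).
    while lo < hi:
        mid = (lo + hi + 1) // 2
        common = set.intersection(
            *(substrings_of_length(w, mid) for w in words)
        )
        if common:
            best = min(common)  # deterministic choice if several tie
            lo = mid
        else:
            hi = mid - 1
    return best


list1 = ['technology', 'technician', 'technical', 'technicality']
print(longest_common_substring(list1))  # techn
```

When several substrings tie for the longest length, this sketch returns the alphabetically smallest one; that tie-break is an arbitrary choice.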

Answer 2

Iterate over the first word, growing the prefix length for as long as the set of same-length prefixes taken from all words contains only one element. When the prefixes differ, return the last result.

list1 = ['technology', 'technician', 'technical', 'technicality']


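def common_prefix(li):
    prefix = ""
    # Try prefix lengths 1, 2, ... up to the length of the shortest word.
    for i in range(1, min(map(len, li)) + 1):
        # Collect the length-i prefix of every word.
        s = {word[:i] for word in li}
        # More than one distinct prefix means the words diverge here.
        if len(s) > 1:
            break
        prefix = s.pop()
    return prefix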


print(common_prefix(list1))

output: techn
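
For the prefix case specifically, the standard library already offers `os.path.commonprefix`, which compares strings character by character and works on any list of strings, not only file paths. A short sketch, with the caveat that it only helps when the shared letters sit at the start of every word:

```python
import os.path

list1 = ['technology', 'technician', 'technical', 'technicality']

# Longest prefix shared by every string in the list.
prefix = os.path.commonprefix(list1)
print(prefix)  # techn

# A shared group of letters that is not at the start is not found.
shifted = os.path.commonprefix(['biotechnology', 'technician'])
print(repr(shifted))  # ''
```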

Answer 3

Find the length of the shortest word. Iterate over increasingly small chunks of the first word, starting with a chunk equal in length to the shortest word, and check whether each chunk is contained in all of the other strings. If it is, return that substring.

list1 = ['technology', 'technician', 'technical', 'technicality']

def shortest_common_substring(lst):
    # Despite the name, this returns the longest substring shared by all words.
    shortest_len = min(map(len, lst))
    first = lst[0]

    # Try chunk lengths from longest to shortest.
    for i in range(shortest_len, 0, -1):
        # Slide a window of length i across the first word.
        for j in range(0, len(first) - i + 1):
            substr = first[j:j + i]

            if all(substr in w for w in lst[1:]):
                return substr
    return None

And just for fun, let's replace that loop with a generator expression, and just take the first thing it gives us (or None).

def shortest_common_substring(lst):
    shortest_len = min(map(len, lst))
    first = lst[0]

    # Lazily produce candidate chunks, longest first, and take the first match.
    return next((first[j:j + i] for i in range(shortest_len, 0, -1)
                                for j in range(0, len(first) - i + 1)
                                if all(first[j:j + i] in w for w in lst[1:])),
                None)
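
The `next(generator, default)` idiom used above may be easier to see in isolation. In this small sketch (the variable names are illustrative), the generator is evaluated lazily, so `next` stops at the first item that satisfies the condition and falls back to the default when nothing does:

```python
words = ['technology', 'technician', 'technical', 'technicality']

# First word longer than 10 characters, or None if there is none.
first_long = next((w for w in words if len(w) > 10), None)
print(first_long)  # technicality

# No word is longer than 20 characters, so the default is returned.
none_found = next((w for w in words if len(w) > 20), None)
print(none_found)  # None
```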

Answer 1
0 2022-01-22 08:11:09

Answer 2
0 2022-01-22 08:25:02

Answer 3
0 2022-01-22 08:32:50
