Detecting Unique Terms in a List of Multiple Strings

example = ["duran duran sang wild boys in 1984", "wild boys don't remain forever wild", "who brought wild flowers","it was john krakauer who wrote in to the wild"]

How do I detect the unique terms across all of these strings and put them in a single list, like this:

['duran', 'sang', 'wild', 'boys', 'in', '1984', "don't", 'remain', 'forever', 'who', 'brought', 'flowers', 'it', 'was', 'john', 'krakauer', 'wrote', 'to', 'the']

My code:

def uniqueterms(a, d, e, f):
    b = a.split()
    c = []

    for x in b:
        if a.count(x) >= 1 and (x not in c):
            c.append(x)
    print((' '.join(c)).split(), end=' ')
    g = d.split()
    h = []

    for y in g:
        if d.count(y) >= 1 and (y not in h):
            h.append(y)
    print((' '.join(h)).split(), end=' ')
    i = e.split()
    j = []

    for z in i:
        if e.count(z) >= 1 and (z not in j):
            j.append(z)
    print((' '.join(j)).split(), end=' ')
    k = f.split()
    m = []

    for t in k:
        if f.count(t) >= 1 and (t not in m):
            m.append(t)
    print((' '.join(m)).split())

>>> uniqueterms(example[0], example[1], example[2], example[3])
['duran', 'sang', 'wild', 'boys', 'in', '1984'] ['wild', 'boys', "don't", 'remain', 'forever'] ['who', 'brought', 'wild', 'flowers'] ['it', 'was', 'john', 'krakauer', 'who', 'wrote', 'in', 'to', 'the', 'wild']

Updated to return unique words in order of their appearance (the previous version, which relied on Python's set() alone, did not preserve input order):

def get_unique_words(text):
    visited = set()
    uniq = []
    for word in text.split():
        if word not in visited:
            uniq.append(word)
            visited.add(word)
    return uniq
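
As a quick check of the order-preserving behaviour, the function above can be run on the first example string alone. This is only an illustrative sketch: the function is repeated so the snippet runs on its own, and the names `sentence` and `result` are made up for this example.

```python
def get_unique_words(text):
    visited = set()   # words seen so far, for fast membership tests
    uniq = []         # unique words, in order of first appearance
    for word in text.split():
        if word not in visited:
            uniq.append(word)
            visited.add(word)
    return uniq

# hypothetical single-string input taken from the question's example list
sentence = "duran duran sang wild boys in 1984"
result = get_unique_words(sentence)
print(result)  # ['duran', 'sang', 'wild', 'boys', 'in', '1984']
```

The set is used only for lookups; the list is what carries the ordering.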

To handle a list of strings:

def get_unique_words_from_list_of_strings(str_list):
    return get_unique_words(' '.join(str_list))

To run your example:

words_in_order = get_unique_words_from_list_of_strings(example)

which returns:

['duran', 'sang', 'wild', 'boys', 'in', '1984', "don't", 'remain', 'forever', 'who', 'brought', 'flowers', 'it', 'was', 'john', 'krakauer', 'wrote', 'to', 'the']
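
A shorter alternative, not part of the answer above, is to lean on the fact that dict keys are unique and keep insertion order (guaranteed from Python 3.7 on). The sketch below assumes that Python version; the function name `unique_words_fromkeys` is hypothetical.

```python
def unique_words_fromkeys(str_list):
    # Split every string into words, then let dict.fromkeys drop the
    # duplicates while keeping first-appearance order (Python 3.7+).
    return list(dict.fromkeys(word for s in str_list for word in s.split()))

example = [
    "duran duran sang wild boys in 1984",
    "wild boys don't remain forever wild",
    "who brought wild flowers",
    "it was john krakauer who wrote in to the wild",
]

words = unique_words_fromkeys(example)
print(words)
```

For the example list this should give the same result as get_unique_words_from_list_of_strings. It also avoids building one large joined string first.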

