Python returning unique words from a list (case insensitive)

I need help returning the unique words (case insensitive) from a list, in their original order.

For example:

case_insensitive_unique_list(["We", "are", "one", "we", "are", "the", "world", "we", "are", "THE", "UNIVERSE"])

Will return: ["We", "are", "one", "the", "world", "UNIVERSE"]

So far this is what I've got:

def case_insensitive_unique_list(list_string):

    uppercase = ["A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"]
    lowercase = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"]

    temp_unique_list = []

    for i in list_string:
        if i not in list_string:
            temp_unique_list.append(i)

I am having trouble checking, for each word, whether it already appears in temp_unique_list with different capitalization, for example "to" and "To" (I am assuming the range function will be useful).

The function should also keep the spelling that comes first in the original list it takes in.

How would I do this using a for loop?

You can do this with a for loop and a set, like this:

def case_insensitive_unique_list(data):
    seen, result = set(), []
    for item in data:
        if item.lower() not in seen:
            seen.add(item.lower())
            result.append(item)
    return result

Output

['We', 'are', 'one', 'the', 'world', 'UNIVERSE']
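
If the words may contain non-ASCII letters, str.casefold() is a more aggressive alternative to lower() for caseless comparison. Below is a minimal sketch of the same loop under that assumption; the name case_insensitive_unique_list_casefold is hypothetical and not part of the answer above.

```python
def case_insensitive_unique_list_casefold(data):
    """Return the first spelling of each word, comparing with casefold()."""
    seen, result = set(), []
    for item in data:
        key = item.casefold()  # e.g. "Straße".casefold() == "strasse"
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


words = ["We", "are", "one", "we", "are", "the", "world", "we", "are", "THE", "UNIVERSE"]
print(case_insensitive_unique_list_casefold(words))
```

For plain English words like the ones in the question, lower() and casefold() give the same keys, so either works.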

You can use set() and a list comprehension:

>>> seen = set()
>>> lst = ["We", "are", "one", "we", "are", "the", "world", "we", "are", "THE", "UNIVERSE"]
>>> [x for x in lst if x.lower() not in seen and not seen.add(x.lower())]
['We', 'are', 'one', 'the', 'world', 'UNIVERSE']
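
The comprehension above works because set.add() returns None, so `not seen.add(...)` is always true and is evaluated only when the word has not been seen yet. A small sketch of that behaviour, using sample strings chosen just for illustration:

```python
seen = set()

# add() mutates the set in place and returns None
result = seen.add("we")
print(result)

# `not None` is True, so this expression records "are" and evaluates to True
flag = not seen.add("are")
print(flag)

print("we" in seen, "are" in seen)
```

Relying on a side effect inside a comprehension is compact but can be surprising to readers, so the explicit for loop may be the clearer choice.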

You can do that as:

l = ["We", "are", "one", "we", "are", "the", "world", "we", "are", "THE", "UNIVERSE"]

a = []

for i in l:
    if i.lower() not in [j.lower() for j in a]:
        a.append(i)

>>> print(a)
['We', 'are', 'one', 'the', 'world', 'UNIVERSE']

l = ["We", "are", "one", "we", "are", "the", "world", "we", "are", "THE", "UNIVERSE"]
so=[]
for w in l:
    if w.lower() not in so:
        so.append(w.lower())

In [14]: so
Out[14]: ['we', 'are', 'one', 'the', 'world', 'universe']

You can use a set to ensure uniqueness. When you try to add an item that is already in a set, the set simply discards the duplicate.

You should also use the built-in string method lower() to handle the case insensitivity.

uniques = set()
for word in words:
    uniques.add(word.lower())  # lowercase it first, then add it
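
To illustrate the discard behaviour described above, here is a small self-contained sketch; the sample words list is an assumption made for the example. Note that a set keeps neither insertion order nor the original capitalization, so on its own it does not produce the ordered output the question asks for.

```python
words = ["We", "are", "one", "we", "are", "THE", "the"]  # sample data (assumed)

uniques = set()
for word in words:
    uniques.add(word.lower())  # adding an element that is already present is a no-op

print(len(uniques))      # number of distinct lowercased words
print(sorted(uniques))   # sorted only to get a stable display order
```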

If this is for a homework task and using set is off limits, then you can easily adapt it to use lists only: just loop through and add the condition:

uniques = list()
for word in words:
    if word.lower() not in uniques:
        uniques.append(word.lower())

You can use collections.OrderedDict like this:

from collections import OrderedDict
def case_insensitive_unique_list(data):
    d = OrderedDict()
    for word in data:
        d.setdefault(word.lower(), word)
    return list(d.values())  # list() so Python 3 returns a list, not a view

Output:

['We', 'are', 'one', 'the', 'world', 'UNIVERSE']
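
On Python 3.7 and later, the built-in dict also preserves insertion order, so the same idea could be sketched without the import. The function name below is hypothetical, chosen to avoid clashing with the version above.

```python
def case_insensitive_unique_list_dict(data):
    """Keep the first spelling seen for each lowercased word."""
    first_seen = {}
    for word in data:
        # setdefault() stores word only if the lowercased key is new
        first_seen.setdefault(word.lower(), word)
    return list(first_seen.values())


words = ["We", "are", "one", "we", "are", "the", "world", "we", "are", "THE", "UNIVERSE"]
print(case_insensitive_unique_list_dict(words))
```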

OK, I removed my previous answer, as I misread the OP's post. My apologies.

As an excuse, for the fun of it and for the sake of doing it a different way, here's another solution, though it's neither the most efficient nor the best:

>>> from functools import reduce
>>> # compare it.lower() so that e.g. "THE" is recognised as a repeat of "the"
>>> for it in reduce(lambda l, it: l if it.lower() in {i.lower() for i in l} else l + [it], lst, []):
...     print(it, end=", ")
