Python returning unique words from a list (case insensitive)
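I need help returning the unique words from a list, in order of first appearance (case insensitive).
For example:
case_insensitive_unique_list(["We", "are", "one", "we", "are", "the", "world", "we", "are", "THE", "UNIVERSE"])
would return: ["We", "are", "one", "the", "world", "UNIVERSE"]
This is what I have so far: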
def case_insensitive_unique_list(list_string):
    uppercase = ["A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"]
    lowercase = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"]
    temp_unique_list = []
    for i in list_string:
        if i not in list_string:
            temp_unique_list.append(i)
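I'm having trouble comparing each word against temp_unique_list to check whether it is a repeat, ignoring case. For example: "to" and "To" (I'm assuming the range function will be useful).
I also need it to return each word as it first appears in the original list that the function takes in.
How would I do this using a for loop?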
You can do this with a for loop and the set data structure, like this:
def case_insensitive_unique_list(data):
    seen, result = set(), []
    for item in data:
        if item.lower() not in seen:
            seen.add(item.lower())
            result.append(item)
    return result
Output
['We', 'are', 'one', 'the', 'world', 'UNIVERSE']
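One possible refinement, not part of the answer above: `str.lower()` works well for ASCII words, while `str.casefold()` is the more aggressive normalization Python offers for caseless matching (for example, it maps the German "ß" to "ss"). Below is a minimal sketch of the same loop, assuming casefolded equality is the notion of "same word" you want. The function name `unique_casefold` is made up for illustration.

```python
def unique_casefold(data):
    """Return the first occurrence of each word, compared via str.casefold()."""
    seen, result = set(), []
    for item in data:
        key = item.casefold()  # hypothetical choice: casefold() instead of lower()
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


words = ["We", "are", "one", "we", "are", "the", "world", "we", "are", "THE", "UNIVERSE"]
print(unique_casefold(words))
# ['We', 'are', 'one', 'the', 'world', 'UNIVERSE']
print(unique_casefold(["Straße", "STRASSE"]))
# ['Straße']
```

For plain English input the result matches the `lower()` version; the difference only shows up with characters whose case mapping is not one-to-one.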
You can use set() and a list comprehension:
>>> seen = set()
>>> lst = ["We", "are", "one", "we", "are", "the", "world", "we", "are", "THE", "UNIVERSE"]
>>> [x for x in lst if x.lower() not in seen and not seen.add(x.lower())]
['We', 'are', 'one', 'the', 'world', 'UNIVERSE']
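The comprehension above relies on a side effect that is easy to miss: `set.add()` always returns `None`, so `not seen.add(...)` is always true and only serves to record the word. The sketch below spells the same condition out in an ordinary loop; the name `first_occurrences` is an illustrative assumption, not something from the answer.

```python
def first_occurrences(words):
    """Loop form of the comprehension trick, with the condition unpacked."""
    seen = set()
    result = []
    for word in words:
        key = word.lower()
        # Left side of `and`: reject words whose lowercase form was already seen.
        # Right side: seen.add() returns None, so `not None` is True; it runs
        # only when the left side passed, recording the key as a side effect.
        if key not in seen and not seen.add(key):
            result.append(word)
    return result


print(set().add("we"))  # None
lst = ["We", "are", "one", "we", "are", "the", "world", "we", "are", "THE", "UNIVERSE"]
print(first_occurrences(lst))
# ['We', 'are', 'one', 'the', 'world', 'UNIVERSE']
```

Because `and` short-circuits, the `add` call is skipped for duplicates, which is why the trick is safe even though it mixes a test with a mutation.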
You can do it like this:
l = ["We", "are", "one", "we", "are", "the", "world", "we", "are", "THE", "UNIVERSE"]
a = []
for i in l:
    # compare against the lowercase form of every word kept so far
    if i.lower() not in [j.lower() for j in a]:
        a.append(i)
>>> print(a)
['We', 'are', 'one', 'the', 'world', 'UNIVERSE']
l = ["We", "are", "one", "we", "are", "the", "world", "we", "are", "THE", "UNIVERSE"]
so = []
for w in l:
    # note: this stores the lowercase form, so the original casing is lost
    if w.lower() not in so:
        so.append(w.lower())
In [14]: so
Out[14]: ['we', 'are', 'one', 'the', 'world', 'universe']
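You can use a set to ensure uniqueness. When you try to add a duplicate to a set, it is simply discarded if it is already there.
You should also use the built-in lower() string method to handle case insensitivity.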
uniques = set()
for word in words:
    uniques.add(word.lower())  # lower it first and then add it
If this is for a homework assignment and using set is off limits, you can easily adapt it to use only lists: just loop and add a condition:
uniques = list()
for word in words:
    if word.lower() not in uniques:
        uniques.append(word.lower())  # etc
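Note that a set on its own has no defined order, so the set-only snippet answers "which words are unique" but not "in what order did they first appear". The following is a small sketch of one way to recover that order afterwards, assuming the input list is still available. The variable names `lowered` and `in_order` are made up for illustration, and the approach is quadratic, so it suits short lists only.

```python
words = ["We", "are", "one", "we", "are", "the", "world", "we", "are", "THE", "UNIVERSE"]

# Collect the unique lowercase words; iteration order of a set is not guaranteed.
lowered = [word.lower() for word in words]
uniques = set(lowered)
print(len(uniques))  # 6

# Recover first-appearance order by sorting on each word's first index.
# list.index() scans from the start, so this costs O(n) per lookup.
in_order = sorted(uniques, key=lowered.index)
print(in_order)
# ['we', 'are', 'one', 'the', 'world', 'universe']
```

Like the list-only version, this returns lowercase words rather than the original casing.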
You can use collections.OrderedDict like this:
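from collections import OrderedDict
def case_insensitive_unique_list(data):
    d = OrderedDict()
    for word in data:
        # setdefault keeps the first spelling seen for each lowercase key
        d.setdefault(word.lower(), word)
    return list(d.values())  # list() so Python 3 returns a list, not a view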
Output:
['We', 'are', 'one', 'the', 'world', 'UNIVERSE']
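As a side note, on Python 3.7 and later the built-in dict is guaranteed to preserve insertion order, so the same idea should work without importing anything. This is a sketch under that version assumption; the function name `case_insensitive_unique_dict` is hypothetical.

```python
def case_insensitive_unique_dict(data):
    """Same setdefault idea, using a plain dict (assumes Python 3.7+)."""
    first_seen = {}
    for word in data:
        # Keys are lowercase; the value is the first spelling encountered.
        first_seen.setdefault(word.lower(), word)
    return list(first_seen.values())


words = ["We", "are", "one", "we", "are", "the", "world", "we", "are", "THE", "UNIVERSE"]
print(case_insensitive_unique_dict(words))
# ['We', 'are', 'one', 'the', 'world', 'UNIVERSE']
```

OrderedDict remains the safer choice if the code may also need to run on older interpreters.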
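OK, I deleted my previous answer because I misread the OP's post. My apologies.
By way of excuse, for the fun of it and to do it a different way, here is another solution, although it is neither the most efficient nor the best: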
>>> from functools import reduce
>>> # keep `it` only if its lowercase form is not already among the kept words
>>> for it in reduce(lambda l, it: l if it.lower() in {i.lower() for i in l} else l + [it], lst, []):
...     print(it, end=", ")