
Flatten a list of strings to characters and then de-dupe the new list

I have the following word list and have flattened it into a list of letters with a list comprehension, as described in the documentation:

>>> wordlist = ['cat','dog','rabbit']
>>> letterlist = [lt for wd in wordlist for lt in wd]
>>> print(letterlist)
['c', 'a', 't', 'd', 'o', 'g', 'r', 'a', 'b', 'b', 'i', 't']

Can the list comprehension be extended to remove duplicate characters? The desired result is the following (in any order):

['a', 'c', 'b', 'd', 'g', 'i', 'o', 'r', 't']

I can convert to a set and then back to a list, but I'd prefer to keep it as a list.

The easiest approach is to use a set comprehension instead of a list comprehension:

letterlist = {lt for wd in wordlist for lt in wd}

All I did was replace the square brackets with curly braces, so the result is a set rather than a list. This works in Python 2.7 and up.
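Since the question asks for a list in the end, here is a minimal sketch of one way to get there from the set comprehension. The name unique_letters is only illustrative, and sorted() is used on the assumption that a deterministic order is acceptable (the question allows any order):

```python
wordlist = ['cat', 'dog', 'rabbit']

# Build the set of unique letters, exactly as in the set comprehension above.
unique_letters = {lt for wd in wordlist for lt in wd}

# sorted() accepts any iterable and always returns a new list,
# so this both converts the set to a list and fixes the order.
letterlist = sorted(unique_letters)
print(letterlist)  # ['a', 'b', 'c', 'd', 'g', 'i', 'o', 'r', 't']
```

If the order really does not matter, list(unique_letters) would work as well; the order of the resulting list is then arbitrary.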

For Python 2.6 and earlier, you'd use the set() callable with a generator expression instead:

letterlist = set(lt for wd in wordlist for lt in wd)

Last but not least, you can replace the comprehension syntax altogether with itertools.chain.from_iterable(), which chains the strings together and treats them as one long sequence of letters. You give it a sequence of sequences and it gives you back a single iterable over all their items:

from itertools import chain
letterlist = set(chain.from_iterable(wordlist))
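If the first-seen order of the letters matters, a set will not keep it. As a hedged alternative that is not part of the original answer, the same chained iterable can be passed to dict.fromkeys(), assuming Python 3.7 or later, where dicts are guaranteed to preserve insertion order. The name ordered_unique is illustrative:

```python
from itertools import chain

wordlist = ['cat', 'dog', 'rabbit']

# chain.from_iterable() yields the letters of every word in turn.
# dict.fromkeys() keeps only the first occurrence of each key, and
# (assumption: Python 3.7+) remembers the order in which keys were added.
ordered_unique = list(dict.fromkeys(chain.from_iterable(wordlist)))
print(ordered_unique)  # ['c', 'a', 't', 'd', 'o', 'g', 'r', 'b', 'i']
```

This returns a list directly, which matches the preference stated in the question.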

Sets are an easy way to get unique elements from an iterable. To flatten a list of lists, itertools.chain provides a handy way to do that.

>>> from itertools import chain
>>> set(chain.from_iterable(['cat','dog','rabbit']))
{'a', 'b', 'c', 'd', 'g', 'i', 'o', 'r', 't'}

I think a set comprehension should be used:

wordlist = ['cat','dog','rabbit']
letterlist = {lt for wd in wordlist for lt in wd}
print(letterlist)

This works only in Python 2.7 and higher; for earlier versions, use set() instead of {}:

wordlist = ['cat','dog','rabbit']
letterlist = set(lt for wd in wordlist for lt in wd)
print(letterlist)
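As a quick sanity check, the different spellings shown in these answers can be compared directly. This is only a sketch, and the variable names are illustrative:

```python
from itertools import chain

wordlist = ['cat', 'dog', 'rabbit']

# Set comprehension (Python 2.7 and up).
by_comprehension = {lt for wd in wordlist for lt in wd}

# set() with a generator expression (also works on older versions).
by_generator = set(lt for wd in wordlist for lt in wd)

# set() over the chained letters of all words.
by_chain = set(chain.from_iterable(wordlist))

# Sets compare equal when they hold the same elements, regardless of order.
print(by_comprehension == by_generator == by_chain)  # True
```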
