Remove words from list containing certain characters
I have a long list of words, and I want to remove every word that contains a specific character. However, the solution I thought would work does not remove any words.
l3 = ['b', 'd', 'e', 'f', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y']
firstcheck = ['poach', 'omnificent', 'aminoxylol', 'teetotaller', 'kyathos', 'toxaemic', 'herohead', 'desole', 'nincompoophood', 'dinamode']
validwords = []
for i in l3:
    for x in firstchect:
        if i not in x:
            validwords.append(x)
            continue
        else:
            break
If a word from firstcheck contains a character from l3, I want it removed, or not added to the other list. I tried it both ways. Can anyone offer insight into what could be going wrong? I'm pretty sure I could use a list comprehension, but I'm not very good at those.
The accepted answer uses np.sum, which means importing a large numerical library for a simple task that core Python handles easily:
validwords = [w for w in firstcheck if all(c not in w for c in l3)]
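A variant of the same idea, offered only as a sketch: if l3 is converted to a set, set.isdisjoint can express "shares no character with l3" directly. The name forbidden and the shortened firstcheck list are illustrative assumptions, not part of the original answer.

```python
l3 = ['b', 'd', 'e', 'f', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y']
firstcheck = ['a', 'z', 'poach', 'omnificent']  # shortened sample list

# 'forbidden' is a hypothetical name for the set of characters to reject.
forbidden = set(l3)

# Keep a word only if none of its characters appear in the forbidden set.
validwords = [w for w in firstcheck if forbidden.isdisjoint(w)]
print(validwords)  # ['a', 'z']
```

Both forms keep the same words; the set version mainly helps readability when the list of characters is long.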
You can use a list comprehension:
import numpy as np
[w for w in firstcheck if np.sum([c in w for c in l3])==0]
It seems every word contains at least one character from l3, so the output of the above is an empty list.
If firstcheck is defined as below:
firstcheck = ['a', 'z', 'poach', 'omnificent']
The code should output:
['a', 'z']
Ah, there were a few mistakes in the code; the rest was fine:
l3 = ['b', 'd', 'e', 'f', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y']
firstcheck = ['aza', 'ca', 'poach', 'omnificent', 'aminoxylol', 'teetotaller', 'kyathos', 'toxaemic', 'herohead', 'desole', 'nincompoophood', 'dinamode']
validwords = []
flag=1
for x in firstcheck:
    for i in l3:
        if i not in x:
            flag=1
        else:
            flag=0
            break
    if(flag==1):
        validwords.append(x)
print(validwords)
So, the first mistake was the order of the for loops: we need to iterate through the words first and then through l3, to avoid adding the same element more than once.
Next, firstcheck was misspelled as firstchect in the inner loop (for x in firstchect), which raises a NameError.
Also, I added a flag: a word is appended to validwords only if the flag is still 1 after checking every character in l3. To test this, I added the new elements 'aza' and 'ca', and the code now prints the correct output, ['aza', 'ca'].
Hope this helps you.
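As a side note, the flag can be dropped by using Python's for/else construct, where the else branch runs only when the inner loop finishes without hitting break. This is a sketch of an alternative, not the code from the answer above; the variable names word and ch and the shortened firstcheck list are illustrative.

```python
l3 = ['b', 'd', 'e', 'f', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y']
firstcheck = ['aza', 'ca', 'poach', 'omnificent', 'kyathos']  # shortened sample list

validwords = []
for word in firstcheck:
    for ch in l3:
        if ch in word:
            break  # forbidden character found; skip this word
    else:
        # The inner loop completed without break: no forbidden character.
        validwords.append(word)

print(validwords)  # ['aza', 'ca']
```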
If you want to avoid the nested loops, you can use re directly.
import re
l3 = ['b', 'd', 'e', 'f', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y']
firstcheck = ['azz', 'poach', 'omnificent', 'aminoxylol', 'teetotaller', 'kyathos', 'toxaemic', 'herohead', 'desole', 'nincompoophood', 'dinamode']
# Create a regex string to remove.
strings_to_remove = "[{}]".format("".join(l3))
validwords = [x for x in firstcheck if re.sub(strings_to_remove, '', x) == x]
print(validwords)
Output:
['azz']
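A possible refinement, as a sketch: re.search stops at the first match instead of building a substituted copy of each word, and re.escape protects the character class if l3 ever contains regex metacharacters such as ']' or '^'. That last point is an assumption about future input; the list here holds only letters. The name pattern and the shortened firstcheck list are illustrative.

```python
import re

l3 = ['b', 'd', 'e', 'f', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y']
firstcheck = ['azz', 'poach', 'omnificent']  # shortened sample list

# Build a character class from l3, escaping each character defensively.
pattern = re.compile("[{}]".format("".join(re.escape(c) for c in l3)))

# Keep a word only if the pattern matches nowhere in it.
validwords = [w for w in firstcheck if not pattern.search(w)]
print(validwords)  # ['azz']
```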