
python - remove string from words in an array

#!/usr/bin/python
#this looks for words in dictionary that begin with 'in' and the suffix is a real word
wordlist = [line.strip() for line in open('/usr/share/dict/words')]
newlist = []
for word in wordlist:
    if word.startswith("in"):
        newlist.append(word)
for word in newlist:
    word = word.split('in')
print newlist

How would I get the program to remove the string "in" from all the words that start with it? Right now it does not work.

#!/usr/bin/env python

# Look for all words beginning with 'in'
# such that the rest of the word is also
# a valid word.

# load the dictionary:
with open('/usr/share/dict/words') as inf:
    allWords = set(word.strip() for word in inf)  # one word per line
  1. using 'with' ensures the file is always properly closed;
  2. I make allWords a set; this makes searching it an O(1) operation (on average).

Then we can do:

# get the remainder of all words beginning with 'in'
inWords = [word[2:] for word in allWords if word.startswith("in")]
# filter to get just those which are valid words
inWords = [word for word in inWords if word in allWords]

or roll it into a single statement, like

inWords = [word for word in (word[2:] for word in allWords if word.startswith("in")) if word in allWords]

Doing it the second way also lets us use a generator for the inner loop, reducing memory requirements.
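As a quick check of the generator approach, here is a minimal sketch that runs without the dictionary file. The sample set and the helper name strip_in_prefix are made up for illustration; they are not part of the original answer.

```python
# Hypothetical sample standing in for the contents of /usr/share/dict/words.
all_words = {"in", "inside", "side", "input", "put",
             "income", "come", "inkling", "apple"}


def strip_in_prefix(words, prefix="in"):
    """Return the remainders of prefixed words that are themselves words."""
    # Generator: remainders are produced lazily, no intermediate list is built.
    remainders = (w[len(prefix):] for w in words if w.startswith(prefix))
    # Keep only remainders that appear in the word set (sorted for stable output).
    return sorted(r for r in remainders if r in words)


print(strip_in_prefix(all_words))  # ['come', 'put', 'side']
```

Note that "in" itself leaves an empty remainder and "inkling" leaves "kling"; neither is in the set, so both are dropped.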

split() returns a list of the segments obtained by splitting, not a string. Furthermore,

word = word.split('in')

doesn't modify your list; it only rebinds the loop variable.
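Both points can be seen in a small sketch (Python 3 syntax; the two sample words are hypothetical):

```python
words = ["inside", "inkling"]  # hypothetical sample list

for word in words:
    # Rebinds the local name 'word' only; the list element is untouched.
    word = word.split("in")

print(words)                  # ['inside', 'inkling']

# split() also cuts at every occurrence of 'in', not just the leading one.
print("inkling".split("in"))  # ['', 'kl', 'g']
```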

Try replacing your second loop with this:

for i in range(len(newlist)):
    word = newlist[i].split('in', 1)
    newlist[i] = word[1]

It's difficult to tell from your question what you want in newlist. If you just want the words that start with "in", but with "in" removed, then you can use a slice:

newlist = [word[2:] for word in wordlist if word.startswith('in')]

If you want the words that start with "in" and are still in wordlist once they've had "in" removed (is that what you meant by "real" in your comment?), then you need something a little different:

newlist = [word for word in wordlist if word.startswith('in') and word[2:] in wordlist]
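Because `in` on a list scans the whole list, that comprehension can be slow on a full dictionary. A possible variation, sketched here on a hypothetical sample list, builds a set once and tests membership against it (the name known is made up for illustration):

```python
# Hypothetical sample; in the question, wordlist is read from the dictionary file.
wordlist = ["inside", "side", "input", "put", "inkling", "apple"]

# Build the set once so each membership test is O(1) on average.
known = set(wordlist)

# Keep the full words whose remainder after 'in' is also a known word.
newlist = [word for word in wordlist
           if word.startswith("in") and word[2:] in known]

print(newlist)  # ['inside', 'input']
```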

Note that in Python we use a list, not an "array".

Suppose that wordlist is the list of words. The following code should do the trick:

for i in range(len(wordlist)):
    if wordlist[i].startswith("in"):
        wordlist[i] = wordlist[i][2:]

It is better to use a while loop if the number of words in the list is quite big.
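For comparison, the same in-place update can be sketched with enumerate, which avoids indexing through range(len(...)). The sample list is hypothetical, and this is only one alternative, not the answer's own code:

```python
wordlist = ["inside", "apple", "input"]  # hypothetical sample list

for i, word in enumerate(wordlist):
    if word.startswith("in"):
        # Assigning by index changes the list itself; its length stays the same,
        # so updating elements while iterating is safe here.
        wordlist[i] = word[2:]

print(wordlist)  # ['side', 'apple', 'put']
```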
