
Python 3.3: How to grab every 5th word in a text file?

I'm trying to have my program grab every fifth word from a text file and place the results in a single string. For instance, if I typed "Everyone likes to eat pie because it tastes so good plus it comes in many varieties such as blueberry strawberry and lime", then the program should print out "Everyone because plus varieties and". I must start with the very first word and grab every fifth word after it. I'm confused about how to do this. Below is my code; everything runs fine except the last 5 lines.

#Prompt the user to enter a block of text.
done = False
textInput = ""
while(done == False):
    nextInput= input()
    if nextInput== "EOF":
        break
    else:
        textInput += nextInput

#Prompt the user to select an option from the Text Analyzer Menu.
print("Welcome to the Text Analyzer Menu! Select an option by typing a number"
    "\n1. shortest word"
    "\n2. longest word"
    "\n3. most common word"
    "\n4. left-column secret message!"
    "\n5. fifth-words secret message!"
    "\n6. word count"
    "\n7. quit")

#Set option to 0.
option = 0

#Use the 'while' to keep looping until the user types in Option 7.
while option !=7:
    option = int(input())

#I'm confused here. This is where I'm stuck. Is the 'for' loop correct for this situation?
#If the user selects Option 5,
    elif option == 5:
        for i in textInput.split():
            if i <= 4 and i >= 6:
                print(textInput)

Using your method of defining words with str.split(), either of the following will do what you want:

textInput = """\
I'm trying to have my program grab every fifth word from a text file and
place it in a single string. For instance, if I typed "Everyone likes to
eat pie because it tastes so good plus it comes in many varieties such
as blueberry strawberry and lime" then the program should print out
"Everyone because plus varieties and." I must start with the very first
word and grab every fifth word after. I'm confused on how to do this.
Below is my code, everything runs fine except the last 5 lines."""

everyfive = ' '.join(word for i,word in enumerate(textInput.split()) if not i%5)

# or more succinctly
everyfive = ' '.join(textInput.split()[::5])

print(repr(everyfive))

Either way, the output will be:

"I'm program from place string. typed pie good many strawberry program because 
 must first fifth on Below runs 5"

The shorter (and consequently much faster and simpler) version using the [::5] notation is based on something called "slicing", which all sequences in Python support. The general concept is described in the documentation near the beginning of the Sequences section.
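To make the slicing notation concrete, here is a minimal sketch of the sequence[start:stop:step] form. The sample sentence and the variable names are illustrative assumptions, not part of the original answer.

```python
# Slicing takes the form sequence[start:stop:step]; omitted parts use defaults.
words = "The quick brown fox jumped over the lazy dog and then back again".split()

every_fifth = words[::5]   # start at index 0, then take every 5th item
from_fifth = words[4::5]   # start at index 4 (the 5th word) instead

print(every_fifth)  # ['The', 'over', 'then']
print(from_fifth)   # ['jumped', 'and']

# Slicing works on any sequence, for example strings and tuples.
print("abcdefghijk"[::5])          # afk
print((0, 1, 2, 3, 4, 5, 6)[::5])  # (0, 5)
```

Leaving out start and stop means "the whole sequence", so [::5] is the form that begins with the very first word.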

for i in textInput.split() loops over the words in textInput, not the indices. If you want both indices and words, you want

for i, word in enumerate(textInput.split()):

I don't know what the idea was behind i <= 4 and i >= 6, since those conditions can't both be true. If you want to pick every fifth word, you want

if i % 5 == 0:

which checks whether the remainder upon dividing i by 5 is 0.
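Put together, the two fragments might look like the following sketch. The names text_input, selected and message are assumptions chosen for illustration, and the sample sentence is the one from the question.

```python
text_input = ("Everyone likes to eat pie because it tastes so good plus it "
              "comes in many varieties such as blueberry strawberry and lime")

selected = []
for i, word in enumerate(text_input.split()):
    # i is the zero-based position, so i % 5 == 0 keeps words 1, 6, 11, ...
    if i % 5 == 0:
        selected.append(word)

# Join the kept words into the single string the question asks for.
message = " ".join(selected)
print(message)  # Everyone because plus varieties and
```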

However, you don't need the if statement at all. You can just slice the list given by split to get every 5th element:

# Iterate over every 5th word in textInput.
for word in textInput.split()[::5]:
    print(word)

The output from split() is a list of the words in the string, e.g.:

>>> "The quick brown fox jumped over the lazy dog and then back again".split()
['The', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog', 'and',
'then', 'back', 'again']
>>>

Thus, to get every fifth word, starting with the first:

>>> words = "The quick brown fox jumped over the lazy dog and then back again".split()
>>> for i, s in enumerate(words):
...     if i % 5 == 0: print(s)
...
The
over
then
>>>
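Which words get picked depends on whether positions are counted from 0 or from 1. The sketch below contrasts the two, using enumerate's start argument; the variable names are illustrative assumptions.

```python
words = "The quick brown fox jumped over the lazy dog and then back again".split()

# Zero-based positions: remainder 0 selects the 1st, 6th, 11th, ... words.
first_then_every_fifth = [w for i, w in enumerate(words) if i % 5 == 0]

# One-based positions (start=1): remainder 0 selects the 5th, 10th, ... words.
fifth_tenth = [w for i, w in enumerate(words, start=1) if i % 5 == 0]

print(first_then_every_fifth)  # ['The', 'over', 'then']
print(fifth_tenth)             # ['jumped', 'and']
```

The question asks to start with the very first word, which corresponds to the zero-based variant.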

You can split the sentence on whitespace and then step through the list's indices in increments of 5 to get the desired outcome.

textInput = "Everyone likes to eat pie because it tastes so good plus it comes in many varieties such as blueberry strawberry and lime"
steps = 5
words = textInput.split()
for x in range(0, len(words), steps):  # start at index 0 so the first word is included
    print(words[x])

# OUTPUT
Everyone
because
plus
varieties
and

Here's my basic solution. I'm sure some will say it's not 'pythonic', but it gets the job done.

someString = "Everyone likes to eat pie because it tastes so good plus it comes in many varieties such as blueberry strawberry and lime"
someList = someString.split()
loadString = ''
i = 0
for s in range(len(someList)):
    if i < len(someList):  # stop once i runs past the last word
        loadString += someList[i] + ' '
        i += 5
print(loadString.rstrip(' '))
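The question mentions a text file, while the answers above work on a string that is already in memory. As a rough sketch, assuming the whole file fits in memory, the slicing approach could be wrapped in a small helper. The name every_nth_word is hypothetical, and the temporary file only stands in for a real input file.

```python
import os
import tempfile


def every_nth_word(text, n=5):
    """Join every n-th whitespace-separated word of text, starting with the first."""
    return " ".join(text.split()[::n])


# Write a small sample file so the sketch is self-contained.
sample = ("Everyone likes to eat pie because it tastes\n"
          "so good plus it comes in many varieties\n"
          "such as blueberry strawberry and lime\n")
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write(sample)
    path = handle.name

try:
    # read() returns the whole file; split() treats newlines as whitespace too.
    with open(path) as source:
        result = every_nth_word(source.read())
finally:
    os.remove(path)

print(result)  # Everyone because plus varieties and
```

Because split() with no arguments splits on any run of whitespace, words at the end of one line and the start of the next are still counted separately.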
