Finding the longest word in a .txt file without punctuation

Question

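I am working through Python file I/O exercises. I have made good progress on one that asks for the longest word on each line of a .txt file, but I cannot get rid of the punctuation.

Here is my code: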

with open("original-3.txt", 'r') as file1:
    lines = file1.readlines()
for line in lines:
    if not line == "\n":
        print(max(line.split(), key=len))

This is the output I get

This is the original-3.txt file I am reading the data from

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"

He took his vorpal sword in hand:
Long time the manxome foe he sought,
So rested he by the Tumtum tree,
And stood a while in thought.

And, as in uffish thought he stood,
The Jabberwock, with eyes of flame,
Came whiffling through the tulgey wood,
And burbled as it came!

One two! One two! And through and through
The vorpal blade went snicker-snack!
He left it dead, and with its head
He went galumphing back.

"And hast thou slain the Jabberwock?
Come to my arms, my beamish boy!"
"Oh frabjous day! Callooh! Callay!"
He chortled in his joy.

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.

As you can see, the words I get still carry punctuation marks such as ["," ";" "?" "!"].

How can I get only the words themselves?

Thanks

Answer 1

With a regular expression it is easy to get the length of the longest word on each line:

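import re

with open("original-3.txt", 'r') as file1:
    lines = file1.readlines()
for line in lines:
    found_strings = re.findall(r'\w+', line)
    if found_strings:  # blank lines have no matches, and max() fails on an empty sequence
        print(max(len(txt) for txt in found_strings))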
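
To see what the pattern does to a single line, here is a minimal sketch using the first line of the poem. The variable names `words` and `longest_length` are only illustrative. Note that `\w+` matches runs of letters, digits, and underscores, so the leading apostrophe and the comma are dropped.

```python
import re

line = "'Twas brillig, and the slithy toves"

# \w+ keeps runs of word characters and discards the punctuation between them
words = re.findall(r"\w+", line)
longest_length = max(len(word) for word in words)

print(words)
print(longest_length)
```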

Answer 2

You have to strip these characters from the words:

with open("original-3.txt", 'r') as file1:
    lines = file1.readlines()
for line in lines:
    if not line == "\n":
        # Strip the listed punctuation from both ends of each word, then pick the longest.
        print(max((word.strip(",?;!\"") for word in line.split()), key=len))

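Alternatively, use a regular expression to extract everything that looks like a word (\w+ matches runs of letters, digits, and underscores):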

import re

for line in lines:
    words = re.findall(r"\w+", line)
    if words:  # blank lines yield no matches
        print(max(words, key=len))
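
The strip call above removes only the characters it lists, so a trailing "." or ":" or a leading "'" would survive. One possible variation, sketched here as an assumption rather than as part of the answer, swaps in `string.punctuation` from the standard library. The helper name `longest_word` is hypothetical.

```python
import string

def longest_word(line):
    """Return the longest word on the line with punctuation stripped from
    both ends of each word, or an empty string for a blank line."""
    # string.punctuation holds the ASCII punctuation characters;
    # strip() only touches the ends, so inner hyphens are kept.
    words = [word.strip(string.punctuation) for word in line.split()]
    return max(words, key=len, default="")

print(longest_word('"Beware the Jabberwock, my son!'))
print(longest_word("The vorpal blade went snicker-snack!"))
```

Passing `default=""` to `max()` avoids the `ValueError` an empty line would otherwise raise, so no separate blank-line check is needed.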

Answer 3

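This solution does not use regular expressions. It splits each line into words, then cleans every word so that it contains only alphabetic characters.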

with open("original-3.txt", 'r') as file1:
    lines = file1.readlines()
    for line in lines:
        if not line == "\n":
            words = line.split()
            for i, word in enumerate(words):
                words[i] = "".join([letter for letter in word if letter.isalpha()])
            print(max(words, key=len))
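
The approaches in this thread do not always agree, most visibly on hyphenated words. The sketch below runs three of them on one line of the poem for comparison. The variable names are illustrative, and `string.punctuation` is an assumed stand-in for a hand-written list of characters to strip.

```python
import re
import string

line = "The vorpal blade went snicker-snack!"

# Regex: \w+ stops at the hyphen, so the word is split in two
by_regex = max(re.findall(r"\w+", line), key=len)

# Strip: only the ends of each word are cleaned, so the hyphen stays
by_strip = max((w.strip(string.punctuation) for w in line.split()), key=len)

# Letter filter: every non-letter is removed, so the two halves are fused
by_filter = max(
    ("".join(c for c in w if c.isalpha()) for w in line.split()),
    key=len,
)

print(by_regex, by_strip, by_filter)
```

Which result counts as correct depends on how the exercise defines a word, so it is worth deciding that before picking an approach.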

3 solutions

Solution 1
2 2020-05-30 13:48:53

Solution 2
1 Accepted 2020-05-30 13:23:30

Solution 3
1 2020-05-30 14:48:28
