Check whether a specific string matches a string in a text file

Question

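I have a text file containing many words (one word per line). I have to read each word, modify it, and then check whether the modified word matches any word in the file. I am having trouble with the last part (the hasMatch method in my code). It sounds simple and I know what needs to be done, but nothing I try works.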

#read in textfile 
myFile = open('good_words.txt')


#function to remove first and last character in string, and reverse string
def modifyString(str):
    rmFirstLast = str[1:len(str)-2] #slicing first and last char
    reverseStr = rmFirstLast[::-1] #reverse string 
    return reverseStr

#go through list of words to determine if any string match modified string
def hasMatch(modifiedStr):
    for line in myFile:
        if line == modifiedStr:
            print(modifiedStr + " found")
        else:
            print(modifiedStr + "not found")

for line in myFile:
    word = str(line) #save string in line to a variable

    #only modify strings that are greater than length 3
    if len(word) >= 4:
        #global modifiedStr #make variable global
        modifiedStr = modifyString(word) #do string modification
        hasMatch(modifiedStr)

myFile.close()

Answer 1

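There are several problems here:

You must strip the lines, otherwise the trailing newline/CR characters make the comparison fail.
You must read the file once and for all, otherwise the file iterator is exhausted after the first pass.
Speed is poor: use a set instead of a list to speed up the lookups.
The slicing is overcomplicated and wrong: str[1:-1] does the job (thanks to those who commented on my answer).
The whole code is long and complicated. I condensed it into a few lines.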
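
The first two points can be reproduced with a small sketch. It uses io.StringIO as a stand-in for the real file, and the three words are made up for illustration; neither is part of the original code.

```python
import io

# io.StringIO stands in for open('good_words.txt'); the words are made up.
fake_file = io.StringIO("most\nso\nfrom\n")

first_pass = [line for line in fake_file]
second_pass = [line for line in fake_file]  # the iterator is already exhausted

print(first_pass)   # each line still carries its trailing newline
print(second_pass)  # empty list: nothing is left to read

# Comparing an unstripped line with a bare word fails
print(first_pass[1] == "so")          # False
print(first_pass[1].strip() == "so")  # True
```

This is why the nested loop over myFile inside hasMatch never sees most of the file: the outer loop and the inner loop share the same iterator.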

Code:

#read in textfile
myFile = open('good_words.txt')
# make a set (faster search), remove linefeeds
lines = set(x.strip() for x in myFile)
myFile.close()

# iterate on the lines
for word in lines:
    #only consider strings that are greater than length 3
    if len(word) >= 4:
        modifiedStr = word[1:-1][::-1] #do string modification
        if modifiedStr in lines:
            print(modifiedStr + " found (was "+word+")")
        else:
            print(modifiedStr + " not found")

I tested the program on a list of common English words and got these matches:

so found (was most)
or found (was from)
no found (was long)
on found (was know)
to found (was both)
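
For reuse or testing, the same set-based idea could be wrapped in a function. This is only a sketch: the name find_matches and the sample list are hypothetical, and the words are sorted purely to make the output order stable (a set has no defined order).

```python
def find_matches(words):
    """Return (modified, original) pairs whose modified form is itself a word."""
    word_set = set(w.strip() for w in words)  # strip newlines, build a set
    matches = []
    for word in sorted(word_set):  # sorted only for a stable output order
        if len(word) >= 4:
            modified = word[1:-1][::-1]  # drop first and last char, reverse
            if modified in word_set:
                matches.append((modified, word))
    return matches

# Made-up sample; an open file object can be passed the same way.
sample = ["most", "so", "from", "or", "long", "no", "tree"]
for modified, word in find_matches(sample):
    print(modified + " found (was " + word + ")")
```

Because the function accepts any iterable of strings, it can be called as find_matches(f) inside a with open('good_words.txt') as f: block, which also closes the file automatically.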

Edit: another version that drops the set and uses bisect on a sorted list instead, to avoid hashing and hash collisions.

import bisect

#read in textfile
myFile = open("good_words.txt")
lines = sorted(x.strip() for x in myFile) # make a sorted list, remove linefeeds
myFile.close()

for word in lines:

    #only modify strings that are greater than length 3
    if len(word) >= 4:
        modifiedStr = word[1:-1][::-1] #do string modification
        # search where to insert the modified word
        i=bisect.bisect_left(lines,modifiedStr)
        # if can be inserted and word is actually at this position: found
        if i<len(lines) and lines[i]==modifiedStr:
            print(modifiedStr + " found (was "+word+")")
        else:
            print(modifiedStr + " not found")
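
The membership test in the bisect version can be isolated in a small helper. The name contains_sorted and the sample words are assumptions made for this sketch; bisect.bisect_left is the standard-library call used above.

```python
import bisect

def contains_sorted(sorted_words, target):
    """Binary search: True if target occurs in the already sorted list."""
    i = bisect.bisect_left(sorted_words, target)
    # i is the insertion point; target is present only if it already sits there
    return i < len(sorted_words) and sorted_words[i] == target

words = sorted(["most", "so", "from", "or"])
print(contains_sorted(words, "so"))  # True
print(contains_sorted(words, "os"))  # False
```

The i < len(sorted_words) check matters when the target sorts after every word in the list, in which case the insertion point is one past the last index.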

Answer 2

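In your code you are not slicing off just the first and last characters; you are slicing off the first character and the last two characters.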

rmFirstLast = str[1:len(str)-2]

Change it to:

rmFirstLast = str[1:len(str)-1]
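
A quick check of the three slice forms on a made-up word (the variable names here are illustrative only). Note that the result depends on whether the line still carries its trailing newline, which is an observation about Python slicing rather than something stated in the answers.

```python
word = "hello"
print(word[1:len(word)-2])  # el  (first char and last two chars removed)
print(word[1:len(word)-1])  # ell
print(word[1:-1])           # ell (shorter spelling of the same slice)

# An unstripped line ends in a newline, which counts as one character
line = "hello\n"
print(line[1:len(line)-2])  # ell (the newline was one of the two removed)
```

So once the lines are stripped, str[1:-1] is the slice to use.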

2 solutions

Solution 1: 2, accepted, 2016-09-03 18:19:46

Solution 2: 0, 2016-09-03 18:23:57
