Search a main text file for each word from another text file and append the word if not found, using Python
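I need help with Python code for the following situation.
I have two text files: a main file and a list file. The main file contains many words and needs to be updated whenever the list file contains a new word.
I need to search the main file for each word in the list file. If a word is not found in the main file, I need to append that new word to the main file.
I have code that updates the file when a single string is not found. However, I need to search for every word from the other text file.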
Main_File = "file path"
list_file = "file path"
needle = "word"  # placeholder: the single string to look for

with open(Main_File, "r+") as file:
    for line in file:
        if needle in line:
            break
    else:  # not found, we are at the EOF
        file.write(needle)  # append missing data
# This code appends one specific word if it is not found in the file, but I need to search for each word from another file.
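If the words in the main file fit in memory, you can load them into a set and check whether each word from the list file is already present, as in the sketch below: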
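with open("mainFile.txt") as main_file:  # file names are placeholders
    main_file_words = set(main_file.read().split())
with open("listFile.txt") as list_file:
    list_words = list_file.read().split()
with open("mainFile.txt", "a") as main_file:
    for word in list_words:
        if word not in main_file_words:
            main_file_words.add(word)  # also prevents appending a duplicate twice
            main_file.write("\n" + word)  # append to the main file, not the list file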
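The set-based idea can also be wrapped in a function, which makes it easier to try on small test files. The following is only a sketch under a few assumptions: both files are UTF-8 text, words are separated by whitespace, and the function name `append_missing_words` is made up for illustration. It also adds a newline first if the main file does not already end with one, so the first appended word is not glued to the last existing word.

```python
from pathlib import Path


def append_missing_words(main_path, list_path):
    """Append words from list_path that are absent from main_path.

    Returns the appended words in the order they were first seen.
    (Hypothetical helper; assumes UTF-8, whitespace-separated words.)
    """
    main_text = Path(main_path).read_text(encoding="utf-8")
    known = set(main_text.split())

    added = []
    for word in Path(list_path).read_text(encoding="utf-8").split():
        if word not in known:
            known.add(word)  # remember it so duplicates are added only once
            added.append(word)

    if added:
        with open(main_path, "a", encoding="utf-8") as main_file:
            # Start on a fresh line if the main file lacks a trailing newline.
            if main_text and not main_text.endswith("\n"):
                main_file.write("\n")
            main_file.write("\n".join(added) + "\n")
    return added
```

Calling `append_missing_words("mainFile.txt", "listFile.txt")` a second time should return an empty list, because every word from the list file is already in the main file by then.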
You can use mmap to map the main file and search it for each word from the list file, as follows:
import mmap

mainFilePath = "mainFile.txt"
listFilePath = "listFile.txt"
newWords = []

# open main file with mmap (note: mmap raises ValueError for an empty file)
with open(mainFilePath, 'r') as mainFile:
    mainFileMmap = mmap.mmap(mainFile.fileno(), 0, access=mmap.ACCESS_READ)

    # open list file and search for words in main file with mmap.find()
    # caution: find() matches substrings, so "cat" counts as found inside "category"
    with open(listFilePath, 'r') as listFile:
        for line in listFile:
            line = line.rstrip("\r\n")  # remove line endings
            if mainFileMmap.find(line.encode()) == -1:
                newWords.append(line)
    mainFileMmap.close()  # release the mapping before appending

# append new words to main file
with open(mainFilePath, 'a') as mainFile:
    for newWord in set(newWords):
        mainFile.write("\n{}".format(newWord))
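One thing to keep in mind with `mmap.find()` is that it looks for a byte substring, so a list word such as "cat" is treated as present when the main file only contains "category". If whole-word matching is needed, one possible variation is to run a bytes regular expression over the mapped file instead. This is a sketch, not the answer's original method: the function name `find_new_words` is hypothetical, and it assumes UTF-8 files whose words are separated by whitespace.

```python
import mmap
import os
import re


def find_new_words(main_path, list_path):
    """Return list-file words that do not occur as whole words in the main file.

    (Hypothetical helper; assumes UTF-8, whitespace-separated words.)
    """
    with open(list_path, "r", encoding="utf-8") as list_file:
        # dict.fromkeys removes duplicates while keeping first-seen order
        candidates = list(dict.fromkeys(list_file.read().split()))

    # An empty file cannot be memory-mapped, so every candidate is new.
    if os.path.getsize(main_path) == 0:
        return candidates

    new_words = []
    with open(main_path, "rb") as main_file:
        with mmap.mmap(main_file.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            for word in candidates:
                # Match only when the word is not adjacent to other
                # non-whitespace bytes on either side.
                pattern = rb"(?<!\S)" + re.escape(word.encode("utf-8")) + rb"(?!\S)"
                if re.search(pattern, mapped) is None:
                    new_words.append(word)
    return new_words
```

The returned list can then be appended to the main file in the same way as `newWords` above. Each lookup still scans the mapped file, so for a long list file the set-based approach is likely to be faster when the main file fits in memory.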