
Search for each word of a text file in another main text file and append it if not found in the main file, using Python

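I need help with Python code for the following situation.

I have two text files: a main file and a list file. The main file contains many words and needs to be updated whenever I find new words in the list file.

I need to search the main file for each word in the list file. If a word is not found in the main file, I need to append that new word to the main file.

I have code that updates the file if a single string is not found. However, I need to search for every word from a text file.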

Main_File = "file path"
list_file = "file path"
needle = "word"  # placeholder: the single word to look for

with open(Main_File, "r+") as file:
    for line in file:
        if needle in line:
            break
    else:  # not found, we are at the EOF
        file.write(needle)  # append the missing data
# This code appends one specific word if it is not found in the file,
# but I need to search for each word from another file.

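If the words in the main file fit in memory, you can load them into a set and check whether each word from the list file is already present, as in the code below: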

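main_path = "mainFile.txt"  # placeholder path to the main file
list_path = "listFile.txt"  # placeholder path to the list file

# load every whitespace-separated word of the main file into a set
with open(main_path) as main_file:
    main_file_words = set(main_file.read().split())

# append each list-file word that the main file does not contain yet
with open(list_path) as list_file, open(main_path, "a") as main_file:
    for word in list_file.read().split():
        if word not in main_file_words:
            main_file_words.add(word)
            main_file.write("\n" + word)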
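
The set-based idea can also be wrapped in a small reusable function. The following is only a sketch under a few assumptions that are not in the original post: the function name `append_missing_words` is made up for illustration, both files are assumed to be UTF-8 text, and a "word" is taken to mean any whitespace-separated token.

```python
def append_missing_words(main_path, list_path):
    """Append words from list_path that are absent from main_path.

    Hypothetical helper: returns the appended words in the order
    they first appear in the list file.
    """
    with open(main_path, "r", encoding="utf-8") as f:
        content = f.read()
    known = set(content.split())  # O(1) membership checks

    with open(list_path, "r", encoding="utf-8") as f:
        candidates = f.read().split()

    added = []
    for word in candidates:
        if word not in known:
            known.add(word)  # also skips duplicates inside the list file
            added.append(word)

    if added:
        with open(main_path, "a", encoding="utf-8") as f:
            # Start on a fresh line if the main file has no trailing newline.
            if content and not content.endswith("\n"):
                f.write("\n")
            f.write("\n".join(added) + "\n")
    return added
```

Calling `append_missing_words("mainFile.txt", "listFile.txt")` a second time should append nothing, because every word is already in the main file by then.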

You can use mmap to load the main file and search it for the words from the list file, like this:

import mmap

mainFilePath = "mainFile.txt"
listFilePath = "listFile.txt"
newWords = []

# open the main file and map it into memory (read-only)
with open(mainFilePath, 'r') as mainFile:
    with mmap.mmap(mainFile.fileno(), 0, access=mmap.ACCESS_READ) as mainFileMmap:

        # open the list file and search for each word in the main file with mmap.find()
        # note: find() matches substrings, not whole words
        with open(listFilePath, 'r') as listFile:
            for line in listFile:
                line = line.rstrip("\r\n")  # remove the trailing line break
                if mainFileMmap.find(line.encode()) == -1:
                    newWords.append(line)

# append the new words to the main file
with open(mainFilePath, 'a') as mainFile:
    for newWord in set(newWords):
        mainFile.write("\n{}".format(newWord))
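
One thing to keep in mind with `mmap.find()` is that it looks for a byte sequence anywhere in the file, so a list word such as `cat` counts as "found" if the main file only contains `concatenate`. If whole-word matching is wanted, one possible variation is to run a regular expression over the memory-mapped file. This is a sketch, not the answerer's method: the function name `find_new_words` is hypothetical, and it assumes UTF-8 files in which words are separated by whitespace.

```python
import mmap
import os
import re


def find_new_words(main_path, list_path):
    """Return list-file words that do not occur as whole words in the main file."""
    with open(list_path, "r", encoding="utf-8") as f:
        # dict.fromkeys() removes duplicates while keeping the original order
        words = list(dict.fromkeys(f.read().split()))

    # mmap cannot map an empty file, so in that case every word is new.
    if os.path.getsize(main_path) == 0:
        return words

    new_words = []
    with open(main_path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for word in words:
                # Require whitespace (or the start/end of the file) on both sides.
                pattern = rb"(?<!\S)" + re.escape(word.encode("utf-8")) + rb"(?!\S)"
                if re.search(pattern, mm) is None:
                    new_words.append(word)
    return new_words
```

The returned list can then be appended to the main file in the same way as `newWords` above. Each lookup still scans the mapped file, so for large list files the set-based approach is likely to be faster.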

