Deleting Multiple Lines From a Text File

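I've done some research on this. So far I've found that, because this file will eventually get quite large, I need to read it into memory line by line, check each line for the string I don't want, and continue reading and writing from there.

My program searches the text file by date, reads the lines below the date, and stops when it reaches "end". I need to be able to delete a table, from its date line through "end", and replace it with another table of the same format that is stored in a dictionary.

This is what I have so far.

This is the text file: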

05/11/18
test1 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
test2 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A 
test3 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A 
test4 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A 
end

06/11/18
test1 N/A N/A 09:30 18:00 09:30 18:00 09:30 18:00 09:30 18:00
test2 08:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00
test3 09:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00
test4 10:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00
end

This is the dictionary with the new table:

  {'test1': ['N/A', 'N/A', '09:30', '18:00', '09:30', '18:00', '09:30', '18:00', '09:30', '18:00'], 
'test2': ['08:30', '18:00', '10:30', '18:00', '10:30', '18:00', '10:30', '18:00', '10:30', '18:00'], 
'test3': ['09:30', '18:00', '07:30', '18:00', '07:30', '18:00', '07:30', '18:00', '07:30', '18:00'], 
'test4': ['10:30', '18:00', '08:30', '18:00', '08:30', '18:00', '08:30', '18:00', '08:30', '18:00']}

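By the way, the table I am trying to replace is the one dated 05/11/18.

This is the code that reads each line in the file and looks for the line that starts with the date.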

received="05/11/18"
with open("StaffTimes.txt","r+") as file:
    new_f=file.readlines()
    file.seek(0) #Puts pointer to start of file
    for line in new_f: #For every line in the file

        if received not in line: #If the date is not in the line
            file.write(line) #Re-write the line into the file

        if received in line:
            while True:
                nextLine=next(file, "").strip() #Stores the next line in nextLine
                if nextLine=="end": #Loops until end is found
                    next(file, "") #Now pointer is at line after end
                    break

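This is the code that writes the dictionary back to the text file. (This part is not the problem; it is only provided for context.)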

    file.write(received)
    file.write("\n")
    usernameList=["test1", "test2", "test3", "test4"] #This will be received from client
    for username in usernameList:
        file.write(username)
        file.write(" ")
        workTimes=times.get(username)
        for time in workTimes:
            file.write(time)
            file.write(" ")
        file.write("\n")
    file.write("end")
    file.write("\n")
    file.write("\n")

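Overall, my problem is that I only seem to be able to delete the date line and not the table below it. The code also just rewrites the whole file anyway, including the new table both with and without the date.

After the rewrite, I need the text file to look like this: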

05/11/18
test1 N/A N/A 09:30 18:00 09:30 18:00 09:30 18:00 09:30 18:00 
test2 08:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 
test3 09:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 
test4 10:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 
end

06/11/18
test1 N/A N/A 09:30 18:00 09:30 18:00 09:30 18:00 09:30 18:00
test2 08:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00
test3 09:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00
test4 10:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00
end

Perhaps a better approach would be:

received = "05/11/18"

# "with" closes both files even if an error occurs part-way through
with open("/path/to/old_file.txt", "r") as old_file, \
        open("/path/to/new_file.txt", "w") as new_file:
    for line in old_file:
        if received in line:
            # placeholder: write the date line, the new table and "end"
            write_replacement_lines_to_new_file()

            # now skip lines in old_file until we get to the "end" marker.
            # the outer "for" loop will then continue reading from the
            # current position in old_file
            for line in old_file:
                # compare the whole line so a name containing "end" is safe
                if line.strip() == "end":
                    break
        else:
            new_file.write(line)

Then, finally, just move new_file over old_file (perhaps using os.rename()).
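
A fuller sketch of this approach, for illustration only. The function name replace_table, the ".tmp" suffix and the exact-match test on the date line are assumptions, not part of the answer above; times is the dictionary from the question. It uses os.replace rather than os.rename because os.replace also overwrites an existing destination on Windows.

```python
import os


def replace_table(path, date, times):
    """Replace the block that starts at `date` and runs through "end"."""
    tmp_path = path + ".tmp"  # hypothetical temporary file name
    with open(path, "r") as old_file, open(tmp_path, "w") as new_file:
        for line in old_file:
            if line.strip() == date:
                # write the replacement block: date, one row per user, "end"
                new_file.write(date + "\n")
                for username, work_times in times.items():
                    new_file.write(" ".join([username] + work_times) + "\n")
                new_file.write("end\n")
                # skip the old block; the outer loop resumes after "end"
                for old_line in old_file:
                    if old_line.strip() == "end":
                        break
            else:
                new_file.write(line)
    # move the finished copy over the original file
    os.replace(tmp_path, path)
```

Calling replace_table("StaffTimes.txt", "05/11/18", times) would rewrite only the 05/11/18 block and copy every other line through unchanged. Only one line is held in memory at a time, which matters once the file grows.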

First, create a simple iterator that gives you each block:

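import re

def iter_dates_in_file(filehandle_in):
    for line in filehandle_in:
        # a block starts at a line that begins with a date such as 05/11/18
        if re.match(r"\d{1,2}/\d{1,2}/\d{2,4}", line.strip()):
            matched = [line]
            # collect lines up to and including "end" (or end of file)
            for next_line in filehandle_in:
                matched.append(next_line)
                if next_line.strip() == "end":
                    break
            yield ''.join(matched)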

Then you can test each block:

# infile_name and test_if_i_should_save are placeholders that you supply
with open(infile_name, "r") as in_file, open("output.txt", "w") as f_out:
    for chunk in iter_dates_in_file(in_file):
        if test_if_i_should_save(chunk):
            f_out.write(chunk)
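
As an illustration of how the chunk iterator could be used to replace a block rather than only filter it, here is a minimal sketch. The helpers format_block and replace_block are hypothetical names introduced for this example, and io.StringIO stands in for the real files so the snippet can run on its own. Note that the iterator yields only the date-to-"end" blocks, so the blank separator lines have to be written back explicitly.

```python
import io
import re


def iter_dates_in_file(filehandle_in):
    """Yield each block from its date line through its "end" line."""
    for line in filehandle_in:
        if re.match(r"\d{1,2}/\d{1,2}/\d{2,4}", line.strip()):
            matched = [line]
            for next_line in filehandle_in:
                matched.append(next_line)
                if next_line.strip() == "end":
                    break
            yield "".join(matched)


def format_block(date, times):
    """Hypothetical helper: render a dictionary in the file's table format."""
    rows = [" ".join([name] + hours) for name, hours in times.items()]
    return "\n".join([date] + rows + ["end"]) + "\n"


def replace_block(in_file, out_file, date, times):
    """Copy every block, swapping in the new table for the matching date."""
    for chunk in iter_dates_in_file(in_file):
        # the first line of a chunk is its date
        if chunk.splitlines()[0].strip() == date:
            chunk = format_block(date, times)
        # the iterator drops the blank separator lines, so add one back
        out_file.write(chunk + "\n")


source = io.StringIO(
    "05/11/18\ntest1 N/A N/A\nend\n\n06/11/18\ntest1 09:30 18:00\nend\n"
)
result = io.StringIO()
replace_block(source, result, "05/11/18", {"test1": ["08:30", "18:00"]})
print(result.getvalue())
```

With real files, the two StringIO objects would be replaced by the handles from open(), and the output file renamed over the input afterwards.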
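
  • Read the file until you find the date you are looking for.
    • Up to that point, copy every line (including the line with the date itself) to a second file.
  • Once you have hit the date, skip all lines until you find the next "end" (track this with a boolean).
  • Write the new data and "end" to the new file, reset the boolean and continue.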

Create the test file:

t = """ Some other data

05/11/18 
test1 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A 
test2 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A  
test3 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A  
test4 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A  
end

06/11/18 
test1 N/A N/A 09:30 18:00 09:30 18:00 09:30 18:00 09:30 18:00 
test2 08:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 
test3 09:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 
test4 10:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 
end"""

with open ("file.txt","w") as f: 
    f.write(t)

Code that reads the file and writes the new one:

look_for ="05/11/18"     
data = { 'test1': ['N/A', 'N/A', '09:30', '18:00', '09:30', '18:00', '09:30', '18:00', '09:30', '18:00'],  
'test2': ['08:30', '18:00', '10:30', '18:00', '10:30', '18:00', '10:30', '18:00', '10:30', '18:00'],  
'test3': ['09:30', '18:00', '07:30', '18:00', '07:30', '18:00', '07:30', '18:00', '07:30', '18:00'],  
'test4': ['10:30', '18:00', '08:30', '18:00', '08:30', '18:00', '08:30', '18:00', '08:30', '18:00']}

with open("file.txt", "r") as f, open("file_2.txt", "w") as f_new:
    # remember whether we are inside the table that must be replaced
    found_it = False
    for line in f:  # iterate lazily instead of loading the whole file

        # we are in the region whose lines must be skipped until "end"
        if found_it:
            if line.strip() == "end":
                found_it = False

                # write replacement data, then the original "end" line
                for k in data:
                    f_new.write(' '.join([k] + data[k] + ["\n"]))
                f_new.write(line)

            # either way this line is finished: old table rows are dropped
            # and "end" was just written, so do not fall through and
            # write it a second time
            continue

        # not in the critical region: copy the line (this keeps the date)
        f_new.write(line)
        if line.startswith(look_for):
            found_it = True

Test code:

with open("file_2.txt", "r") as f:
    print(f.read())

Output:

Some other data

05/11/18
test1 N/A N/A 09:30 18:00 09:30 18:00 09:30 18:00 09:30 18:00 
test2 08:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 
test3 09:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 
test4 10:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 
end

06/11/18
test1 N/A N/A 09:30 18:00 09:30 18:00 09:30 18:00 09:30 18:00
test2 08:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00 10:30 18:00
test3 09:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00 07:30 18:00
test4 10:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00 08:30 18:00
end

Rename the new file to the old file's name and have fun.
