
How to modify specific lines in a text file in a loop?

I am using Python 2.7 (OS: CentOS 6).

I have a text file. For example, it consists of these lines:

0     4.064  16.786   7.016    0
1     5.520  14.733   5.719    0
2     5.904  17.898   5.222    0
3     3.113  18.613  18.453    0
4     3.629  16.760   5.118    0
            :
            :
            :
398   6.369  14.623    6.624    0
399   5.761  18.084    7.212    0
400   2.436  17.021   10.641    0

The last column initially contains all 0s. It is basically a flag. I want to modify this text file, i.e. set the last-column entry to 1 (change the flag value to 1) whenever some criterion is matched for a specific line. For example, suppose line numbers 3, 20, 250 and 400 satisfy this criterion. Then I want to set the flag value (last-column entry) of these specific lines to 1 without changing the other values on these lines. Also, I want to do this in a loop since I have many criteria. Therefore I have to go to the top of the file every time (i.e. for every criterion) and scan it from top to bottom; whenever the criterion is satisfied, mark the specific line's flag as 1.

Important: I then use the same modified file to select only those lines (for further processing) whose flag value is NOT 1. For each iteration of the loop mentioned above, I want to read this modified file. In short, I want to modify the file (i.e. set the flag to 1) for one criterion --> then read the modified file --> do processing --> then take the next criterion --> set the flag to 1 for this criterion --> read the modified file --> and so on.

I would like to add this: the criterion to be satisfied takes two different lines into account every time. E.g. if the difference between the 2nd column entries for the 3rd and 398th lines is less than 2.0, then set the flag of the 398th line to 1, i.e. the difference 17.898 - 18.084 is less than 2.0, so the flag of the 398th line will be set to 1.

Any help will be highly appreciated.

Okay. First you'll want to open the file and read each line.

I'd recommend reading line by line from one file and writing each line to a second file.

with open("original.dat", "r") as source, open("new.dat", "w") as destination:
    for line in source:
        # split() with no arguments splits on any run of whitespace
        line_no, v1, v2, v3, flag = line.split()
        # just an example, do whatever checks you need to;
        # some_computation is a placeholder for your own criterion
        should_set_flag = some_computation(v1, v2, v3)
        if should_set_flag:
            flag = 1
        destination.write("{} {} {} {} {}\n".format(line_no, v1, v2, v3, flag))

Perhaps I'm not understanding your requirement of reading the whole file each time you make one change. Given that the lines seem to be independent of one another, I'm not sure why that's necessary at all.
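If the flagged file really does have to be re-read after each criterion, the two-file approach above can be wrapped in a helper that swaps the new file in for the original once it is complete. This is only a minimal sketch: `set_flags` and its `line_ids` argument are hypothetical names, and it assumes every data line has exactly five whitespace-separated fields with the line id in the first one.

```python
import os
import shutil
import tempfile


def set_flags(path, line_ids):
    """Rewrite `path`, setting the last column to 1 for the given line ids.

    Hypothetical helper: assumes five whitespace-separated fields per data
    line, with the line id in the first field. Other lines are copied as-is.
    """
    wanted = set(str(i) for i in line_ids)
    dir_name = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory first, so the
    # original stays intact until the new version is complete.
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    with os.fdopen(fd, "w") as destination:
        with open(path, "r") as source:
            for line in source:
                fields = line.split()
                if len(fields) == 5 and fields[0] in wanted:
                    fields[4] = "1"
                    line = " ".join(fields) + "\n"
                destination.write(line)
    # Replace the original with the updated copy.
    shutil.move(tmp_path, path)
```

Calling `set_flags("original.dat", [3, 20, 250, 400])` once per criterion would leave `original.dat` ready to be re-read on the next pass. Note that modified lines are re-joined with single spaces, so their column alignment changes.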

    f = open("filename", 'r')
    data = f.readlines()
    f.close()
    # remove the old file with os.remove() if needed
    # (opening it with 'w' below also truncates it)
    i = 0
    while i < len(data):
        # do something
        # make changes to data[i]
        i += 1
    f = open("filename", 'w')
    f.writelines(data)
    f.close()

That is probably the only way: load the data, remove the old file, make the changes, and write to a new file.

Why do you need to write the file back? It's only 400 lines, so you can keep the lines in memory and do the processing one by one:

def is_criterion_1_fulfilled(row):
    return row[1] < 4  # only an example

def process_1(row):
    print(row)  # or do anything else with the line

def filter_and_process(iterator, criterion, process):
    # Rows matching the criterion are dropped; all others are processed
    # and passed on to the next stage.
    for row in iterator:
        if criterion(row):
            continue
        process(row)
        yield row

def main():
    # filename is assumed to be defined elsewhere (path to the data file)
    with open(filename, 'r') as inp:
        dataset = [map(float, line.split()) for line in inp]
    dataset = list(filter_and_process(dataset, is_criterion_1_fulfilled, process_1))
    # is_criterion_2_fulfilled and process_2 are defined analogously
    dataset = list(filter_and_process(dataset, is_criterion_2_fulfilled, process_2))
    # ....

if __name__ == '__main__':
    main()

# Imports
import re

# Functions
def check_data(record, records):
    # TODO Implement check operation
    return False

# Read input data
infile = "data.txt"
with open(infile, "r") as f:
    # Make a list of lists
    # (raw-string pattern; strip() avoids empty fields at the line ends)
    records = [re.split(r'\s+', record.strip()) for record in f.read().splitlines()]

# Process the data
for i, record in enumerate(records):
    # enumerate so as to refer to ith record if necessary,
    # but lineno anyway available in record[0]
    if check_data(record, records):
        record[4] = '1'


# Write modified data
outfile = "out%s" % infile
with open(outfile, "w") as f:
    for record in records:
        f.write('\t'.join(record)+'\n')
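The check itself is left as a TODO above. One possible shape for the two-line criterion described in the question is sketched below, working on the same in-memory `records` list (a list of lists of strings). The names `apply_pair_criterion` and `unflagged`, and the defaults `column=2` and `threshold=2.0`, are assumptions made for illustration and not part of the original answer.

```python
def apply_pair_criterion(records, ref_idx, target_idx, column=2, threshold=2.0):
    """Flag records[target_idx] if it is close to records[ref_idx].

    Hypothetical helper: compares the values in `column` of the two
    records and sets the target's flag (index 4) to '1' when they differ
    by less than `threshold`. Returns True if the flag was set.
    """
    ref = float(records[ref_idx][column])
    target = float(records[target_idx][column])
    if abs(ref - target) < threshold:
        records[target_idx][4] = '1'
        return True
    return False


def unflagged(records):
    """Return only the records whose flag is not '1'."""
    return [r for r in records if r[4] != '1']
```

With the sample values 17.898 and 18.084 the difference is 0.186, which is below 2.0, so the second record of the pair would be flagged. Because the flags live in memory, `unflagged(records)` can be called after each criterion without re-reading the file, and the result only needs to be written out once at the end.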


 