Editing a single line in a large text file

So I need to record a set of 4 values that are different for every second of the day, i.e.:

#Here the values are initialized to the same value, however they will change as samples are taken
data = [[.25 for numData in range(4)] for numSecs in range(86400)]

Now obviously a two-dimensional array (gah, it's Python: a LIST) whose first dimension has length 86400 is quite impractical. Instead I want to create a text file with 86400 lines formatted as such:

numSec data0 data1 data2 data3

0 .25 .25 .25 .25
1 .25 .25 .25 .25
2 .25 .25 .25 .25
...

And as samples are taken, I want to be able to edit this file, nay, I want to be able to edit the line of the file whose numSec equals the second the sample was taken. For example, a sample taken at numSec = 2 (2 seconds after midnight) would cause my program to edit the file so that:

0 .25 .25 .25 .25
1 .25 .25 .25 .25
2 .70 .10 .10 .10
...

Seems easy enough. I have even read a bunch of posts which demonstrated how to rewrite a single line in a text file. The problem is, they all require that you read in the whole file. I don't want my program to be reading 86,400 lines every second.

Thus we arrive at my question: Can I read a single line in a text file, edit it, and write it back to the file, without reading the entire file every single time a change needs to be made?

PS: I should note I am running Ubuntu 12.04 (Precise) and this is for use in a ROS node.

PPS: This program will be running for an arbitrary number of days, so each "second" of data could be read and rewritten many times. Another reason I would like to use a file is that if the system needs to be shut off, I would like to save the distributions for the next time it is run.

You may need to modify this a bit, and it assumes that all lines are of the same length. For this, I had to pad the first column to a fixed width. But if you don't want padding, you should be able to calculate the number of 1-, 2-, 3-, 4-, ... digit numbers before a particular row.

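data = [[.25 for numData in range(4)] for numSecs in range(86400)]
length_line = 0

def write_line(f, sec, data):
    # Pad the seconds column to 6 characters so every line has the same length.
    line = "{:6d}".format(sec) + " " + " ".join(
            ["{:.2f}".format(dd) for dd in data]) + "\n"
    # The file is opened in binary mode, so encode the text before writing.
    f.write(line.encode("ascii"))
    return len(line)

# Binary mode keeps "\n" as a single byte on every platform,
# so the byte offsets computed below stay exact.
with open('output', 'wb') as of:
    for i, d in enumerate(data):
        length_line = write_line(of, i, d)

with open('output', 'rb+') as of:
    # modify the line for second 2 (lines are numbered from 0):
    n = 2
    of.seek(n * length_line)
    write_line(of, n, [0.10, 0.10, 0.10, 0.10])
    # modify the line for second 10:
    n = 10
    of.seek(n * length_line)
    write_line(of, n, [0.10, 0.10, 0.10, 0.10])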
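
For the unpadded variant mentioned above, the offset of a row can be computed by counting how many digits the preceding row numbers occupy. The following is only a sketch under assumptions: the helper names `digits_before` and `line_offset` are made up for illustration, and it assumes every line looks like `sec v0 v1 v2 v3` with each value written as `{:.2f}` and smaller than 10, so that only the seconds column varies in width.

```python
# Everything on a line after the seconds number: a space, four 4-character
# values separated by spaces, and the newline. This is 21 characters.
TAIL_LEN = len(" 0.25 0.25 0.25 0.25\n")

def digits_before(n):
    """Total number of characters needed to write the integers 0 .. n-1."""
    total = 0
    digits = 1
    low = 0       # smallest number with `digits` digits (0 counts as 1 digit)
    high = 10     # one past the largest number with `digits` digits
    while low < n:
        total += digits * (min(n, high) - low)
        low = high
        high *= 10
        digits += 1
    return total

def line_offset(n):
    """Byte offset of the line for second n in an unpadded file."""
    return digits_before(n) + n * TAIL_LEN
```

Overwriting in place still only works here because the replacement line for a given second has exactly the same length as the line it replaces.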

If the lines are of different lengths, then replacing a line with one of a different length puts everything after it in the wrong position, and you have to rewrite all those lines. If the lines all have the same length, then you can seek() and write() the new data by calculating the line's offset in the file. See Python File Objects for more info.
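
As a rough illustration of the seek() and write() approach for the read, edit, write-back cycle the question asks about, here is a minimal sketch. The names `LINE_FMT`, `create_file`, `read_record` and `write_record` are hypothetical, and it assumes ASCII text, a seconds column padded to 6 characters, and values that always format to 4 characters (that is, below 10).

```python
import os
import tempfile

# One fixed-width record: 6-character seconds column plus four values.
LINE_FMT = "{:6d} {:.2f} {:.2f} {:.2f} {:.2f}\n"
LINE_LEN = len(LINE_FMT.format(0, 0, 0, 0, 0))  # bytes per line

def create_file(path, num_secs, default=0.25):
    """Write one default record per second."""
    with open(path, "wb") as f:
        for sec in range(num_secs):
            f.write(LINE_FMT.format(sec, *[default] * 4).encode("ascii"))

def read_record(f, sec):
    """Read only the line for `sec` and return its four values."""
    f.seek(sec * LINE_LEN)
    fields = f.read(LINE_LEN).decode("ascii").split()
    return [float(x) for x in fields[1:]]

def write_record(f, sec, values):
    """Overwrite only the line for `sec`."""
    line = LINE_FMT.format(sec, *values).encode("ascii")
    if len(line) != LINE_LEN:
        # A longer or shorter line would corrupt the following records.
        raise ValueError("record does not fit the fixed line length")
    f.seek(sec * LINE_LEN)
    f.write(line)

# Usage: update the record for second 2 without touching the rest.
path = os.path.join(tempfile.mkdtemp(), "samples.txt")
create_file(path, 100)
with open(path, "rb+") as f:
    values = read_record(f, 2)
    values = [0.70, 0.10, 0.10, 0.10]
    write_record(f, 2, values)
```

The length check is a deliberate design choice: it turns a silent file corruption into an immediate error if a value ever formats wider than expected.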

I am not sure it is useful to store 0.25 345600 (86400 * 4) times. Just store the default in the first line. Then append one line at a time. If the time stamps do not come in order, just put them in as they are, and after the day is over sort the file content once and fill the missing time stamps with the default. Example:

default: 0.25
2 .70 .10 .10 .10
3 .80 .20 .20 .20
1 .50 .30 .30 .30
5 .40 .30 .30 .30

Later, process this file to get this:

1 .50 .30 .30 .30
2 .70 .10 .10 .10
3 .80 .20 .20 .20
4 .25 .25 .25 .25
5 .40 .30 .30 .30
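
The post-processing step could look roughly like the sketch below. The function name `expand_log` is made up for illustration, and two things are assumptions rather than part of the answer: the first line always has the form `default: <value>`, and when the same second appears more than once, the later line wins.

```python
def expand_log(lines, first, last):
    """Return one (second, values) pair for every second in first..last.

    `lines` is the append-only log: a "default: <value>" header followed
    by "sec v0 v1 v2 v3" lines in arbitrary order.
    """
    lines = iter(lines)
    header = next(lines)                       # e.g. "default: 0.25"
    default = float(header.split(":", 1)[1])
    samples = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue                           # skip blank lines
        # A later line for the same second replaces an earlier one.
        samples[int(fields[0])] = [float(x) for x in fields[1:]]
    # Walking the range in order sorts the output and fills the gaps.
    return [(sec, samples.get(sec, [default] * 4))
            for sec in range(first, last + 1)]

log = [
    "default: 0.25",
    "2 .70 .10 .10 .10",
    "3 .80 .20 .20 .20",
    "1 .50 .30 .30 .30",
    "5 .40 .30 .30 .30",
]
rows = expand_log(log, 1, 5)
```

For a full day the range would be 0 to 86399, and appending during the day stays cheap because no existing line is ever rewritten.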

If I were you, I would use an SQLite database to store the records. The key would be the second of the observation, and each row would consist of 4 values. Updating and editing would be much more efficient.
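
A minimal sketch of that idea with Python's standard sqlite3 module follows. The table name `samples`, the column names `d0` to `d3`, and the helper functions are assumptions made for illustration; the in-memory database is used only so the example is self-contained, and a real node would pass a file path so the data survives a shutdown.

```python
import sqlite3

def open_db(path):
    """Open (or create) the database with one row per second of the day."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        "sec INTEGER PRIMARY KEY, d0 REAL, d1 REAL, d2 REAL, d3 REAL)")
    return conn

def store_sample(conn, sec, values):
    """Insert the row for `sec`, replacing any existing row."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT OR REPLACE INTO samples VALUES (?, ?, ?, ?, ?)",
            [sec] + list(values))

def load_sample(conn, sec, default=0.25):
    """Return the four values for `sec`, or the default if never sampled."""
    row = conn.execute(
        "SELECT d0, d1, d2, d3 FROM samples WHERE sec = ?",
        (sec,)).fetchone()
    return list(row) if row else [default] * 4

conn = open_db(":memory:")
store_sample(conn, 2, [0.70, 0.10, 0.10, 0.10])
```

Because `sec` is the primary key, each lookup and update touches a single row, and seconds that were never sampled do not need to be stored at all.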
