Editing a single line in a large text file

I need to record a set of 4 values that differ for every second of the day, i.e.:

#Here the values are initialized to the same value, however they will change as samples are taken
data = [[.25 for numData in range(4)] for numSecs in range(86400)]

Obviously, a two-dimensional array (well, this is Python, so a list) whose first dimension has length 86400 is quite impractical. Instead, I want to create a text file with 86400 lines formatted as follows:

numSec data0 data1 data2 data3

0 .25 .25 .25 .25
1 .25 .25 .25 .25
2 .25 .25 .25 .25
...

And as samples are taken, I want to be able to edit this file, nay, I want to be able to edit the line of the file whose numSec = the second the sample was taken. For example, a sample taken at numSec = 2 (2 seconds after midnight) would cause my program to edit the file so that:

0 .25 .25 .25 .25
1 .25 .25 .25 .25
2 .70 .10 .10 .10
...

Seems easy enough; I have even read a bunch of posts that demonstrate how to rewrite a single line in a text file. The problem is that they all require reading in the whole file. I don't want my program to be reading 86,400 lines every second.

Thus we arrive at my question: Can I read a single line in a text file, edit it, and write it back to the file, without reading the entire file every single time a change needs to be made?

PS I should note I am running Ubuntu 12.04 (Precise) and this is for use in a ROS node

PPS This program will be running for an arbitrary number of days, so each "second" of data could be read and rewritten many times. Another reason I would like to use a file is that, if the system needs to be shut off, I want to save the distributions for the next time it is run.

You may need to modify this a bit, and it assumes that all lines are of the same length. To guarantee that, I padded the first column to a fixed width. If you don't want padding, you should be able to calculate the offset by counting how many 1-, 2-, 3-, 4-, ... digit numbers come before a particular row.

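data = [[.25 for numData in range(4)] for numSecs in range(86400)]
length_line = 0

def write_line(f, sec, values):
    # Fixed-width line: 6-character second, then four values with two
    # decimals (same width as long as 0 <= value < 10).
    line = "{:6d}".format(sec) + " " + " ".join(
            ["{:.2f}".format(v) for v in values]) + "\n"
    # The file is opened in binary mode, so write bytes, not str.
    f.write(line.encode("ascii"))
    return len(line)

# Binary mode: offsets are plain byte counts, no newline translation.
with open('output', 'wb') as of:
    for i, d in enumerate(data):
        length_line = write_line(of, i, d)

with open('output', 'rb+') as of:
    # modify the line for numSec = 2 (the third line):
    n = 2
    of.seek(n * length_line)
    write_line(of, n, [0.10, 0.10, 0.10, 0.10])
    # modify the line for numSec = 10:
    n = 10
    of.seek(n * length_line)
    write_line(of, n, [0.10, 0.10, 0.10, 0.10])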
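If you skip the padding, the offset of a row can still be computed by counting digits, as mentioned above. The following is only a sketch, assuming every line has the form "numSec 0.25 0.25 0.25 0.25" with an unpadded second and four values of exactly four characters each; `digits_before` and `line_offset` are hypothetical helper names, not part of any library.

```python
# Assumption: every line looks like "<numSec> 0.25 0.25 0.25 0.25\n",
# i.e. an unpadded second followed by four values of exactly 4 characters.
TAIL_LEN = len(" 0.25 0.25 0.25 0.25\n")  # 21 characters after the numSec column

def digits_before(n):
    """Total number of digits needed to write 0, 1, ..., n - 1."""
    total, width, low, high = 0, 1, 0, 10
    while low < n:
        # Numbers in [low, high) all have `width` digits
        # (the first range, [0, 10), includes 0 as a one-digit number).
        total += (min(n, high) - low) * width
        low, high, width = high, high * 10, width + 1
    return total

def line_offset(n):
    """Byte offset of the line for numSec = n (hypothetical helper)."""
    return digits_before(n) + TAIL_LEN * n
```

You would then call `of.seek(line_offset(n))` instead of `of.seek(n*length_line)`. Overwriting in place still only works because the rewritten line has the same numSec, and therefore the same length, as the line it replaces.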

If the lines are of different lengths, then everything after the modified line will be in the wrong position and you have to rewrite all those lines. If the lines all have the same length, then you can seek() and write() the new data by calculating the line's offset in the file. See Python File Objects for more info.
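To cover the "read a single line, edit it, and write it back" part of the question, here is a minimal sketch of that round trip with fixed-length lines. It assumes a 27-byte record (6-character second plus four values with two decimals), writes to a temporary file, and uses hypothetical helper names (`format_record`, `read_record`, `write_record`).

```python
import os
import tempfile

# Assumed layout: 6-character second, four "d.dd" values, newline = 27 bytes.
RECORD_FMT = "{:6d} {:.2f} {:.2f} {:.2f} {:.2f}\n"
RECORD_LEN = len(RECORD_FMT.format(0, 0, 0, 0, 0))

def format_record(sec, values):
    """Encode one fixed-width line as ASCII bytes."""
    return RECORD_FMT.format(sec, *values).encode("ascii")

def read_record(f, sec):
    """Read only the line for `sec` and return its four values."""
    f.seek(sec * RECORD_LEN)
    fields = f.read(RECORD_LEN).decode("ascii").split()
    return [float(x) for x in fields[1:]]

def write_record(f, sec, values):
    """Overwrite only the line for `sec`; the rest of the file is untouched."""
    f.seek(sec * RECORD_LEN)
    f.write(format_record(sec, values))

# Create the initial file once (here in a temporary directory).
path = os.path.join(tempfile.mkdtemp(), "samples.txt")
with open(path, "wb") as f:
    for sec in range(86400):
        f.write(format_record(sec, [0.25] * 4))

# Read, edit and write back the line for numSec = 2.
with open(path, "rb+") as f:
    values = read_record(f, 2)
    values[0] = 0.70
    values[1:] = [0.10, 0.10, 0.10]
    write_record(f, 2, values)
    updated = read_record(f, 2)
    untouched = read_record(f, 3)
```

Each update touches 27 bytes no matter how large the file is. This only holds while every value formats to the same width, so values of 10 or more, or negative values, would need a wider field.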

I am not sure it is useful to store 0.25 a total of 345600 (86400 * 4) times. Just store the default in the first line, then append one line at a time. If the time stamps do not come in order, just write them as they arrive; after the day is over, sort the file content once and fill the missing time stamps with the default. Example:

default: 0.25
2 .70 .10 .10 .10
3 .80 .20 .20 .20
1 .50 .30 .30 .30
5 .40 .30 .30 .30

Later process this file to get this:

1 .50 .30 .30 .30
2 .70 .10 .10 .10
3 .80 .20 .20 .20
4 .25 .25 .25 .25
5 .40 .30 .30 .30
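The append-then-process idea could be sketched roughly as below. This is an illustration under a few assumptions: the default is passed as a parameter instead of being parsed from a "default:" header line, a later sample for the same second replaces an earlier one, and `append_sample` and `compact` are hypothetical names.

```python
import os
import tempfile

DEFAULT = [0.25, 0.25, 0.25, 0.25]

def append_sample(path, sec, values):
    """Append one sample; nothing already in the file is read or rewritten."""
    with open(path, "a") as f:
        f.write("{} {}\n".format(
            sec, " ".join("{:.2f}".format(v) for v in values)))

def compact(path, num_secs, default=DEFAULT):
    """Return one row per second, in order, filling gaps with the default."""
    latest = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue
            # A later line for the same second overrides an earlier one.
            latest[int(fields[0])] = [float(x) for x in fields[1:]]
    return [latest.get(sec, list(default)) for sec in range(num_secs)]

path = os.path.join(tempfile.mkdtemp(), "samples.log")
append_sample(path, 2, [0.70, 0.10, 0.10, 0.10])
append_sample(path, 3, [0.80, 0.20, 0.20, 0.20])
append_sample(path, 1, [0.50, 0.30, 0.30, 0.30])
append_sample(path, 5, [0.40, 0.30, 0.30, 0.30])
rows = compact(path, 6)  # rows[4] was never sampled, so it holds the default
```

The trade-off is that the file grows with every sample, so a long-running program would need to run the compaction step periodically.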

If I were you, I would use an SQLite database to store the records. The key would be the second of the observation, and each row would hold the 4 values. Updating a single record would then not require rewriting or rereading the rest of the data.
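A minimal sketch of that approach with Python's built-in sqlite3 module might look like this. The table name `samples`, the column names, and the helper functions are assumptions made for illustration; an in-memory database is used here, and a file path would be passed instead to keep the data across restarts.

```python
import sqlite3

def open_db(path):
    """Open (or create) the database with one row per second of the day."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        "sec INTEGER PRIMARY KEY, d0 REAL, d1 REAL, d2 REAL, d3 REAL)")
    return conn

def put_sample(conn, sec, values):
    """Insert the row for `sec`, or replace it if it already exists."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO samples VALUES (?, ?, ?, ?, ?)",
            [sec] + list(values))

def get_sample(conn, sec, default=(0.25, 0.25, 0.25, 0.25)):
    """Return the four values for `sec`, or the default if none is stored."""
    row = conn.execute(
        "SELECT d0, d1, d2, d3 FROM samples WHERE sec = ?", (sec,)).fetchone()
    return list(row) if row is not None else list(default)

conn = open_db(":memory:")  # use a file name here to persist the data
put_sample(conn, 2, [0.70, 0.10, 0.10, 0.10])
```

Because unsampled seconds fall back to the default in `get_sample`, the 86400 initial rows never have to be written.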
