Python: most efficient way to overwrite a specific row in a CSV file

Question

Given the following CSV file:

01;blue;brown;black
02;glass;rock;paper
03;pigeon;squirel;shark

My goal is to replace the (unique) line that has '02' in the first position.

I wrote this piece of code:

import csv

# Copy every row to a second file, swapping in the new row where the key matches
with open("csv", 'r', newline='', encoding='utf-8') as csvfile, open('csvout', 'w', newline='', encoding='utf-8') as out:
    reader = csv.reader(csvfile, delimiter=';')
    writer = csv.writer(out, delimiter=';')
    for row in reader:
        if row[0] != '02':
            writer.writerow(row)
        else:
            writer.writerow(['02', 'A', 'B', 'C'])

But rewriting the whole CSV into another file doesn't seem to be the most efficient way to proceed, especially for large files:

Once the match is found, we keep reading until the end of the file.
We have to rewrite every line one by one.
Writing a second file is neither practical nor storage-efficient.

I wrote a second piece of code that seems to address these problems:

with open("csv", 'r+', newline='', encoding='utf-8') as csvfile:
    content = csvfile.readlines()
    for index, line in enumerate(content):
        # Match on the first field, as in the first version
        if line.split(';')[0] == '02':
            content[index] = '02;A;B;C\n'   # Replace the matching line
            break                           # Stop scanning at the first match
    csvfile.seek(0)
    csvfile.truncate(0)     # Erase content
    csvfile.write("".join(content))

Do you agree that the second solution is more efficient? Do you have any improvements, or a better way to proceed?

EDIT: The number of characters in the line can vary.

EDIT 2: Apparently I have to read and rewrite everything if I don't want to use padding. A possible alternative would be a database-like solution; I will consider it for the future.

If I had to choose between those two solutions, which one would be the best performance-wise?

Answer 1

As the number of characters in the line may vary, I either have to read and write the whole file or, as @tobias_k said, use seek() to go back to the beginning of the line and:

If the new line is shorter, write just that line and pad it with spaces;
If it is the same length, write just that line;
If it is longer, rewrite that line and everything after it.
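
The three cases above could be sketched roughly as follows. This is only an illustration, not the code that was benchmarked: the function name replace_line_in_place is made up, the file is assumed to be UTF-8, and fields are split naively on the delimiter (no quoted fields). The file is opened in binary mode so that tell() and seek() work on exact byte offsets.

```python
def replace_line_in_place(path, key, new_fields, delimiter=";"):
    """Replace the first line whose first field equals `key`.

    Hypothetical helper: returns True if a line was replaced, False otherwise.
    Assumes UTF-8 text and no quoted fields containing the delimiter.
    """
    sep = delimiter.encode("utf-8")
    wanted = key.encode("utf-8")
    new_line = delimiter.join(new_fields).encode("utf-8")
    with open(path, "r+b") as f:
        while True:
            start = f.tell()                 # byte offset where this line begins
            line = f.readline()
            if not line:
                return False                 # end of file, no match
            body = line.rstrip(b"\r\n")
            if body.split(sep)[0] != wanted:
                continue
            eol = line[len(body):]           # original line ending (may be empty)
            if len(new_line) <= len(body):
                # Shorter or same length: overwrite in place, pad with spaces
                f.seek(start)
                f.write(new_line.ljust(len(body)) + eol)
            else:
                # Longer: rewrite this line and everything that follows it
                rest = f.read()
                f.seek(start)
                f.write(new_line + eol + rest)
                f.truncate()
            return True
```

Note that in the shorter case the padding ends up as trailing spaces in the last field, so a reader of the file would have to strip them.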

I want to avoid padding, so I used time.perf_counter() to measure the execution time of both versions. The second solution appears to be almost 2x faster (CSV of 10,000 lines, match at the 6,000th).
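
A timing comparison of this kind could be set up along these lines. This is a sketch under assumptions, not the original measurement script: the helper names (make_csv, via_second_file, in_memory, run_benchmark) and the synthetic data are invented for illustration, and the writers use lineterminator="\n" so that both approaches produce byte-identical output.

```python
import csv
import os
import shutil
import tempfile
import time

def make_csv(path, n_rows):
    """Write n_rows synthetic rows such as '06000;a;b;c' (made-up data)."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter=";", lineterminator="\n")
        for i in range(1, n_rows + 1):
            writer.writerow([f"{i:05d}", "a", "b", "c"])

def via_second_file(src, dst, key, new_row):
    """First approach: stream every row through csv into a second file."""
    with open(src, "r", newline="", encoding="utf-8") as fin, \
         open(dst, "w", newline="", encoding="utf-8") as fout:
        writer = csv.writer(fout, delimiter=";", lineterminator="\n")
        for row in csv.reader(fin, delimiter=";"):
            writer.writerow(new_row if row[0] == key else row)

def in_memory(path, key, new_row):
    """Second approach: read all lines, patch one, write the file back."""
    with open(path, "r+", newline="", encoding="utf-8") as f:
        content = f.readlines()
        for index, line in enumerate(content):
            if line.split(";")[0] == key:
                content[index] = ";".join(new_row) + "\n"
                break
        f.seek(0)
        f.truncate(0)
        f.write("".join(content))

def timed(func, *args):
    """Return the wall-clock duration of func(*args) in seconds."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start

def run_benchmark(n_rows=10_000, match_at=6_000):
    """Time both approaches on the same data; also check the outputs agree."""
    key = f"{match_at:05d}"
    new_row = [key, "A", "B", "C"]
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "data.csv")
        copy = os.path.join(tmp, "copy.csv")
        out = os.path.join(tmp, "out.csv")
        make_csv(src, n_rows)
        shutil.copyfile(src, copy)
        t_first = timed(via_second_file, src, out, key, new_row)
        t_second = timed(in_memory, copy, key, new_row)
        with open(out, encoding="utf-8") as a, open(copy, encoding="utf-8") as b:
            same = a.read() == b.read()
    return t_first, t_second, same
```

Calling run_benchmark() returns the two durations and a flag saying whether the outputs match. A single run is noisy, so repeating the measurement several times would give a more trustworthy ratio.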

One alternative would be to migrate to a relational database.
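
As a rough idea of what that could look like, here is a minimal sketch using the sqlite3 module from the standard library. The table name items and the column names are assumptions made for this example; with the first field as primary key, replacing one row becomes a single UPDATE and no other row is rewritten by hand.

```python
import sqlite3

# In-memory database for the example; pass a file path instead to persist it
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, a TEXT, b TEXT, c TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        ("01", "blue", "brown", "black"),
        ("02", "glass", "rock", "paper"),
        ("03", "pigeon", "squirel", "shark"),
    ],
)

# Replace the row whose key is '02'; the `with` block commits the transaction
with conn:
    conn.execute(
        "UPDATE items SET a = ?, b = ?, c = ? WHERE id = ?",
        ("A", "B", "C", "02"),
    )

rows = conn.execute("SELECT id, a, b, c FROM items ORDER BY id").fetchall()
```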

Answer 1: 1 accepted, 2019-02-28 14:32:28
