More efficient and general way to prepend text before lines of a file?

I am new to Python. In one of my tasks I have to add a character before specific lines. For example, in my text file, Name and Surname are the fixed lines to which I have to either add or remove a leading ; based on a flag:

hey : see
;Name:blah
;Surname:blah

This is the code I have written for this. Is it efficient enough? Can it be written more efficiently, and can Name and Surname be passed as arguments, that is, can the keywords be passed to the function that adds the ; ?

def changefile(filepath,flag):
   # flag = 1
    final = ""
    with open(filepath) as content:
        for line in content:
            if flag==1:
                if line.split(":")[0]==";Name" or line.split(":")[0]==";Surname":
                    final += line[1:]
                else:
                    final += line
            else:
                if line.split(":")[0]=="Name" or line.split(":")[0]=="Surname":
                    final += ";"
                final += line
    f = open(filepath, 'r+')
    f.truncate()
    f.write(final)
    f.close()


changefile("abc.txt",0)

I poked at it a lot, borrowed martineau's ideas, and ended up with this:

def change_file(filepath, add_comment, trigger_words):

    def process(line):
        line_word = line.lstrip(';').split(':')[0]

        if line_word in trigger_words:
            if add_comment:
                line = line if line.startswith(';') else ';' + line
            else:
                line = line.lstrip(';')

        return line


    with open(filepath) as f:
        content = [process(line) for line in f]


    with open(filepath, 'r+') as f:
        f.truncate()
        f.write(''.join(content))


change_file('abc.txt', add_comment=True, trigger_words=["sys", "netdev"])

The main "nice" bit (that I like) is using a list comprehension, [process(line) for line in f], because it does away with the whole final = ''; final += blah arrangement. It processes every line, and that's the output.

I've changed the flag so that instead of reading "flag is 0 or 1" (what does that mean?) it now reads "add_comment is True or False", to indicate more clearly what it does.

In terms of efficiency, it could be better (make trigger_words a set so that testing membership is faster, and change the way it normalizes every line for testing). But if you're processing a small file it won't make much difference, because Python is fast enough, and if you're processing an enormous file, it's more likely to be IO-limited than CPU-limited.
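As a rough sketch of those two tweaks, assuming Python 3: the trigger words are converted to a set once, and the key is extracted with str.partition, which stops at the first separator. The name make_processor, the sep parameter, and the sample lines are made up for illustration and are not part of the code above.

```python
def make_processor(trigger_words, add_comment, sep=':'):
    """Return a function that comments or uncomments matching lines.

    make_processor and sep are hypothetical names used for this sketch.
    """
    triggers = frozenset(trigger_words)  # hashed lookup instead of a list scan

    def process(line):
        # partition() splits only at the first separator
        key = line.lstrip(';').partition(sep)[0]
        if key not in triggers:
            return line
        if add_comment:
            return line if line.startswith(';') else ';' + line
        return line.lstrip(';')

    return process


# Sample lines taken from the question's example file
lines = ['hey : see\n', 'Name:blah\n', ';Surname:blah\n']
comment = make_processor(['Name', 'Surname'], add_comment=True)
uncomment = make_processor(['Name', 'Surname'], add_comment=False)
commented = [comment(line) for line in lines]
uncommented = [uncomment(line) for line in lines]
```

With only two trigger words the set makes no measurable difference; it would matter only if the list of keywords grew large.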

Try it online here: https://repl.it/CbTo/0 (it reads and prints the results; it doesn't try to save).

(NB: .lstrip(';') will remove all semicolons at the start of the line, not just one. I'm assuming there's only one.)
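If that assumption doesn't hold, one possible way to remove exactly one semicolon is to slice it off after checking for it. This is only a sketch; strip_one_semicolon is a made-up helper name, and whether a doubly commented line should still count as a trigger line is a separate design choice.

```python
def strip_one_semicolon(line):
    """Remove a single leading ';' if present, leaving any others alone."""
    return line[1:] if line.startswith(';') else line


samples = [';Name:blah\n', ';;Name:blah\n', 'Name:blah\n']
results = [strip_one_semicolon(s) for s in samples]
```

On Python 3.9 and later, line.removeprefix(';') behaves the same way.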


Edit from the comments: here's a version which will process the SQL Server UTF-16 install file without screwing it up. However, I don't know of a general fix for this that will work for all files. Note that this reads the file with a specific encoding and writes it back in binary mode with that same encoding. It also changes the split to = for the SQL ini format, and it doesn't call truncate because w mode does that already.

import codecs

def change_file(filepath, add_comment, trigger_words):

    def process(line):
        line_word = line.lstrip(';').split('=')[0]

        if line_word in trigger_words:
            if add_comment:
                line = line if line.startswith(';') else ';' + line
            else:
                line = line.lstrip(';')

        return line


    with codecs.open(filepath, encoding='utf-16') as f:
        content = [process(line) for line in f]

    with codecs.open(filepath, 'wb', encoding='utf-16') as f:
        f.write(''.join(content))


change_file('d:/t/ConfigurationFile - Copy.ini', add_comment=True, trigger_words=["ACTION", "ENU"])
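For comparison, here is a sketch of the same idea using only the built-in open, assuming Python 3, where open accepts encoding and newline arguments directly. The sep and encoding parameters and the temporary demo file are additions for illustration; newline='' is used on the assumption that the original line endings should be written back unchanged.

```python
import os
import tempfile


def change_file(filepath, add_comment, trigger_words, sep='=', encoding='utf-16'):
    triggers = set(trigger_words)

    def process(line):
        key = line.lstrip(';').split(sep)[0]
        if key in triggers:
            if add_comment:
                return line if line.startswith(';') else ';' + line
            return line.lstrip(';')
        return line

    # newline='' disables newline translation, so \r\n is read back as \r\n
    with open(filepath, encoding=encoding, newline='') as f:
        content = [process(line) for line in f]

    # 'w' mode truncates the file on open, so no explicit truncate() is needed
    with open(filepath, 'w', encoding=encoding, newline='') as f:
        f.writelines(content)


# Demo on a throwaway UTF-16 file (the contents are invented for this example)
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.ini')
with open(demo_path, 'w', encoding='utf-16', newline='') as f:
    f.write('ACTION="Install"\r\nENU="True"\r\nOTHER="x"\r\n')

change_file(demo_path, add_comment=True, trigger_words=['ACTION', 'ENU'])

with open(demo_path, encoding='utf-16', newline='') as f:
    demo_result = f.read()
```

The encoding still has to be known in advance, so this is no more general than the codecs version.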

