
Python search and replace in a file

I am trying to write a script that automates some cleanups in the Linux kernel. The first item on my agenda is to remove the braces ({}) from C-style if statements where they are not needed, that is, around single-statement blocks. With my limited knowledge of regex in Python, I got the code to a working state for a case such as:

if (!buf || !buf_len) {
        TRACE_RET(chip, STATUS_FAIL);
        }

and the script turns it into:

if (!buf || !buf_len) 
        TRACE_RET(chip, STATUS_FAIL);

That's what I want, but when I try it on real source files it seems to pick an if statement at random and delete its opening brace, even though the block contains multiple statements. It then removes a closing brace much further down the file, usually one belonging to an else statement or a long if statement.

So can someone please help me make the script touch an if statement only when its block contains a single statement, and correctly delete the corresponding opening and closing braces?

The script currently looks like:

import re
from sys import argv

get_filename = argv[1]
target = open(get_filename)
rename = get_filename + '.tmp'
temp = open(rename, 'w')

def if_statement():
    look = target.read()
    # Groups: 1 = condition, 2 = opening brace, 3 = newline, 4 = body, 5 = closing brace
    pattern = r'''if (\([^.)]*\)) (\{)(\n)([^>]+)(\})'''
    replacement = r'''if \1 \3\4'''
    pattern_obj = re.compile(pattern, re.MULTILINE)
    outtext = re.sub(pattern_obj, replacement, look)
    temp.write(outtext)
    temp.close()
    target.close()


if_statement()

Thanks in advance.

In theory, this would mostly work:

re.sub(r'(if\s*\([^{]+\)\s*){([^;]*;)\s*}', r'\1\2', yourstring)

Note that this will fail on nested single-statement blocks and on semicolons inside string or character literals.
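As a rough check of the behaviour described above, the sketch below wraps the same pattern in a helper and runs it on three small inputs. The helper name `strip_single_statement_braces` and the sample strings are made up for illustration. The single-statement block loses its braces, the multi-statement block is left alone, and the block with a semicolon inside a string literal is simply not matched.

```python
import re

# The pattern suggested above, compiled once for reuse.
SINGLE_STMT_IF = re.compile(r'(if\s*\([^{]+\)\s*){([^;]*;)\s*}')

def strip_single_statement_braces(source):
    """Drop the braces of if-bodies that contain exactly one ';' (hypothetical helper)."""
    return SINGLE_STMT_IF.sub(r'\1\2', source)

# One statement in the block: the braces are removed.
single = "if (!buf || !buf_len) {\n\tTRACE_RET(chip, STATUS_FAIL);\n}\n"
# Two statements in the block: no match, so the text is unchanged.
multi = "if (x) {\n\ta();\n\tb();\n}\n"
# A ';' inside a string literal stops the match early, so this one is missed.
literal = 'if (x) {\n\tputs("a;b");\n}\n'

for sample in (single, multi, literal):
    print(strip_single_statement_braces(sample))
```

Only the first sample should come back changed. Other inputs, for example a `}` inside a string literal, can still be rewritten incorrectly, so treat this as a demonstration and not a safe refactoring tool.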

In general, trying to parse C code with regex is a bad idea, and you really shouldn't get rid of those braces anyway. It's good practice to have them, and they're not hurting anything.
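To illustrate how fragile a regex can be here, the sketch below runs the pattern from the question on two consecutive if statements (the sample input is made up). The body group `[^>]+` is greedy and accepts newlines and braces, so it runs to the next `>` or the end of the text and then backs up to the last `}` in that stretch. In kernel code the next `>` is often part of a `->` access somewhere further down, which is probably why the removed closing brace looks random. The condition group `[^.)]*` also rejects any condition containing `.` or `)`, so only some if statements are picked up at all.

```python
import re

# Pattern and replacement copied from the question's script.
QUESTION_PATTERN = re.compile(r'''if (\([^.)]*\)) (\{)(\n)([^>]+)(\})''', re.MULTILINE)
QUESTION_REPLACEMENT = r'''if \1 \3\4'''

# Illustrative input: a single-statement if followed by a two-statement if.
source = (
    "if (a) {\n"
    "    x();\n"
    "}\n"
    "if (b) {\n"
    "    y();\n"
    "    z();\n"
    "}\n"
)

# The match starts at the first 'if' and ends at the LAST '}' in the text,
# so the first opening brace and the second closing brace are removed.
result = QUESTION_PATTERN.sub(QUESTION_REPLACEMENT, source)
print(result)
```

The output keeps the first block's closing brace and the second block's opening brace, so the braces are now mismatched even though the counts still agree.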


 