
Edit an XML file using sed (or another tool) by matching over multiple lines

I'd like to edit some XML files that may contain similar sections several times in one file. I need to add two possibly missing lines (I call them a pair) inside each section: that is, check whether the pair exists and, if it does not, add it.

For example, below is the possibly missing pair of lines I'd like to add:

<arg>--possibleMissedKey</arg>
<arg>possibleMissedValue</arg>

The file below already has the pair, so I do not need to add it; but if any section is missing the pair, I'd like to add it to that section. The number of lines in each section is also unpredictable.


    <some-tag-section-not-interesting>
        some contents not interesting to me
    </some-tag-section-not-interesting>
    <some-tag-to-look-for>
        <some stuff - a> ..... </some stuff - a>
        <arg>--possibleMissedKey</arg>
        <arg>possibleMissedValue</arg>
        <something-else-not-interesting>blahblah</something-else-not-interesting>
    </some-tag-to-look-for>
    <some-tag-to-look-for>
        <some stuff - b>....</some stuff - b>
        <arg>--possibleMissedKey</arg>
        <arg>possibleMissedValue</arg>
        <something-else-not-interesting>blahblah</something-else-not-interesting>
    </some-tag-to-look-for>

I've considered a few options, but I have a question about each one:

  • The first thing that came to mind is sed. I am hoping to replace the closing tag </some-tag-to-look-for> with

        <arg>--possibleMissedKey</arg>
        <arg>possibleMissedValue</arg>
     </some-tag-to-look-for> 

That is, essentially add the pair at the end of the section. But I don't know whether sed can pattern-match across multiple lines, and I have never used the so-called 'hold space'. My experience with sed has been limited to checking for a string in the current single line.

  • Another option I was hoping to check is to introduce an inside_a_section_flag with an initial value of 0. I start reading the file; the moment I find an opening <some-tag-to-look-for>, I set inside_a_section_flag to 1, and once I reach the closing </some-tag-to-look-for>, I make any needed changes and set it back to 0. So if inside_a_section_flag is 1, I am inside a section and need to look for the pair. If I find the pair, I set inside_a_section_flag back to 0, meaning I do not need to add the pair and can leave the current section. But I don't know whether sed can work with a variable flag, i.e. do a conditional replacement based on a variable's value.

  • Should this be done in the shell at all, or should it be done by a Python script instead?
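The flag idea in the second bullet maps fairly directly onto a short line-by-line script, which also bears on the third bullet. The following is a minimal sketch in Python, not a definitive implementation. The function name `add_missing_pair`, the second flag `pair_found`, and the choice to copy indentation from the most recent non-blank line are assumptions made for illustration. It treats the file as plain text lines (no XML parsing) and assumes the opening and closing tags each sit on their own line.

```python
OPEN_TAG = "<some-tag-to-look-for>"
CLOSE_TAG = "</some-tag-to-look-for>"
KEY_LINE = "<arg>--possibleMissedKey</arg>"
VALUE_LINE = "<arg>possibleMissedValue</arg>"


def leading_whitespace(line):
    """Return the run of whitespace at the start of a line."""
    return line[: len(line) - len(line.lstrip())]


def add_missing_pair(lines):
    """Return a new list of lines with the pair added to sections lacking it.

    `lines` are expected without trailing newlines (e.g. from str.splitlines()).
    """
    out = []
    inside_section = False  # the inside_a_section_flag from the question
    pair_found = False      # hypothetical extra flag: key seen in this section
    indent = ""             # indentation to reuse for the inserted lines

    for line in lines:
        if not inside_section:
            if OPEN_TAG in line:
                inside_section = True
                pair_found = False
                # Fallback: one level deeper than the opening tag (assumed 4 spaces).
                indent = leading_whitespace(line) + "    "
        elif CLOSE_TAG in line:
            if not pair_found:
                # Insert the pair just before the closing tag.
                out.append(indent + KEY_LINE)
                out.append(indent + VALUE_LINE)
            inside_section = False
        else:
            if KEY_LINE in line:
                pair_found = True
            if line.strip():
                indent = leading_whitespace(line)
        out.append(line)
    return out
```

A possible way to use it is to read the file with `open(path).read().splitlines()`, pass the list to `add_missing_pair`, and write the result back joined with newlines. Keeping two separate flags avoids overloading one variable with both "inside a section" and "pair already seen".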

This might work for you (GNU sed):

sed '/<some-tag-to-look-for>/{:a;n;/<arg>--possibleMissedKey<\/arg>/b;/<\/some-tag-to-look-for>/!{h;ba};x;s/\S.*/<arg>--possibleMissedKey<\/arg>/p;s//<arg>--possibleMissedValue<\/arg>/p;x}' file

Match on a line containing <some-tag-to-look-for>.

Loop through the following lines.

If a line containing <arg>--possibleMissedKey</arg> is encountered, bail out: the pair is already present in this section.

Otherwise, if the current line does not match </some-tag-to-look-for>, copy it to the hold space and repeat.

When the end tag is found, insert the required two lines using the copied line as a template (so as to retain indentation).
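For comparison, the same steps can be sketched in Python by matching each section as a whole with a multi-line regular expression. This is only an illustrative sketch under stated assumptions, not a translation of the sed command: the names `SECTION`, `insert_pair`, and `add_missing_pair_regex` are hypothetical, the closing tag is assumed to sit on its own line, and the indentation template is the last non-blank line of the section rather than simply the last line.

```python
import re

# Opening tag plus body (as short as possible), then the closing tag on its own line.
SECTION = re.compile(
    r"(<some-tag-to-look-for>.*?)"
    r"(^[ \t]*</some-tag-to-look-for>)",
    re.DOTALL | re.MULTILINE,
)
KEY = "<arg>--possibleMissedKey</arg>"
VALUE = "<arg>possibleMissedValue</arg>"


def insert_pair(match):
    """Replacement callback: add the pair before the closing tag if it is missing."""
    body, closing = match.group(1), match.group(2)
    if KEY in body:
        return match.group(0)  # pair already present: leave the section untouched
    # Reuse the indentation of the last non-blank line before the closing tag.
    last_line = [ln for ln in body.splitlines() if ln.strip()][-1]
    indent = last_line[: len(last_line) - len(last_line.lstrip())]
    return f"{body}{indent}{KEY}\n{indent}{VALUE}\n{closing}"


def add_missing_pair_regex(text):
    """Apply insert_pair to every section found in the whole file contents."""
    return SECTION.sub(insert_pair, text)
```

Here `re.DOTALL` lets `.` cross line boundaries and `re.MULTILINE` lets `^` anchor at the start of each line, which together give the multi-line matching that the sed loop achieves with `n` and the hold space.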
