
Remove XML tags with Regex tools

Here is a snippet of my XML file

<layoutItems>
            <behavior>Edit</behavior>
            <field>ID</field>
</layoutItems>
<layoutItems>
            <page>lastViewedAccount</page>
            <showLabel>false</showLabel>
            <showScrollbars>false</showScrollbars>
            <width>100%</width>
</layoutItems>
<layoutItems>
            <behavior>Required</behavior>
            <field>Name</field>
</layoutItems>

I want to remove the section in the middle, i.e.:

<layoutItems>
            <page>lastViewedAccount</page>
            <showLabel>false</showLabel>
            <showScrollbars>false</showScrollbars>
            <width>100%</width>
</layoutItems>

This section can appear anywhere inside the file along with other tags.

What is the best way to remove this with a string manipulation tool? I have been trying my luck with sed, but without success. Any help would be appreciated.

Please note: you should provide as much information as you can. Generally speaking, parsing XML, HTML, and similar markup with sed is not a good idea; always use a dedicated XML or HTML tool! The following code may help you in the meantime. Please also note: it may FAIL with other files and other structures! Do not use it in production! I assume NO warranty!

sed -r '/<layoutItems>/{:ka;N;s#(</layoutItems>)#\1#;Tka;s/lastViewedAccount//;T;d}' file 

Input file with two lastViewedAccount blocks:

    <?xml version="1.0" encoding="UTF-8"?>
    <Layout xmlns="http://test.com/2006/04/metadata">
        <emailDefault>false</emailDefault>
        <headers>PersonalTagging</headers>
        <headers>PublicTagging</headers>
        <layoutSections>
            <customLabel>false</customLabel>
            <detailHeading>false</detailHeading>
            <editHeading>true</editHeading>
            <label>Account Information</label>
            <layoutColumns>
                <layoutItems>
                    <page>lastViewedAccount</page>
                    <showLabel>false</showLabel>
                    <showScrollbars>false</showScrollbars>
                    <width>100%</width>
                </layoutItems>
                <layoutItems>
                    <behavior>Edit</behavior>
                    <field>OwnerId</field>
                </layoutItems>
                <layoutItems>
                    <behavior>Required</behavior>
                    <field>Name</field>
                </layoutItems>
                <layoutItems>
                    <behavior>Edit</behavior>
                    <field>ParentId</field>
                </layoutItems>
                <layoutItems>
                    <behavior>Edit</behavior>
                    <field>AccountNumber</field>
                </layoutItems>
                <layoutItems>
                    <page>lastViewedAccount</page>
                    <showLabel>false</showLabel>
                    <showScrollbars>false</showScrollbars>
                    <width>100%</width>
                </layoutItems>
                <layoutItems>
                    <behavior>Edit</behavior>
                    <field>Site</field>
                </layoutItems>
            </layoutColumns>
      </layoutSections>
    </Layout>

Output file, with the lastViewedAccount blocks removed:

    <?xml version="1.0" encoding="UTF-8"?>
    <Layout xmlns="http://test.com/2006/04/metadata">
        <emailDefault>false</emailDefault>
        <headers>PersonalTagging</headers>
        <headers>PublicTagging</headers>
        <layoutSections>
            <customLabel>false</customLabel>
            <detailHeading>false</detailHeading>
            <editHeading>true</editHeading>
            <label>Account Information</label>
            <layoutColumns>
                <layoutItems>
                    <behavior>Edit</behavior>
                    <field>OwnerId</field>
                </layoutItems>
                <layoutItems>
                    <behavior>Required</behavior>
                    <field>Name</field>
                </layoutItems>
                <layoutItems>
                    <behavior>Edit</behavior>
                    <field>ParentId</field>
                </layoutItems>
                <layoutItems>
                    <behavior>Edit</behavior>
                    <field>AccountNumber</field>
                </layoutItems>
                <layoutItems>
                    <behavior>Edit</behavior>
                    <field>Site</field>
                </layoutItems>
            </layoutColumns>
      </layoutSections>
    </Layout>
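
Since the answer recommends a real XML tool over sed, here is a minimal sketch of that route using Python's standard `xml.etree.ElementTree`. It is not from the original answer. The namespace URI is copied from the sample file, while `remove_layout_items`, `SAMPLE`, and the shortened test document are hypothetical names and data made up for illustration. It assumes the `<page>` element is a direct child of `<layoutItems>`, as in the sample.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the sample file's <Layout> root element.
NS = "http://test.com/2006/04/metadata"

# Shortened, made-up stand-in for the input file shown above.
SAMPLE = """<Layout xmlns="http://test.com/2006/04/metadata">
  <layoutColumns>
    <layoutItems>
      <page>lastViewedAccount</page>
      <width>100%</width>
    </layoutItems>
    <layoutItems>
      <behavior>Edit</behavior>
      <field>OwnerId</field>
    </layoutItems>
    <layoutItems>
      <behavior>Required</behavior>
      <field>Name</field>
    </layoutItems>
  </layoutColumns>
</Layout>"""


def remove_layout_items(xml_text, page_name="lastViewedAccount"):
    """Return xml_text without <layoutItems> whose <page> child equals page_name."""
    # Serialize the default namespace without an "ns0:" prefix.
    ET.register_namespace("", NS)
    root = ET.fromstring(xml_text)
    tag_items = f"{{{NS}}}layoutItems"
    tag_page = f"{{{NS}}}page"
    # Elements carry no parent pointer, so visit every element and check its
    # children. Both loops run over snapshots because the tree is modified.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag_items and child.findtext(tag_page) == page_name:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(remove_layout_items(SAMPLE))
```

Unlike the sed one-liner, this does not depend on how the tags are split across lines. The indentation around removed blocks is not tidied up, so the output may contain leftover whitespace.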

GNU sed (note that, as written, this command prints the matching block instead of deleting it):

$ sed -nr 'H; \#</layoutItems>#{x;s/(lastViewedAccount)/\1/;Tk;p;:k;x;s/.*//;x;s///;x;d}' file

    <layoutItems>
            <page>lastViewedAccount</page>
            <showLabel>false</showLabel>
            <showScrollbars>false</showScrollbars>
            <width>100%</width>
    </layoutItems>
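
For comparison, the same text-based idea as the sed commands (collect one `<layoutItems>` ... `</layoutItems>` block, then test it for `lastViewedAccount`) can be sketched with Python's `re` module. This is only an illustration and shares sed's weaknesses: it assumes the blocks never nest and that the tags are written exactly as shown, without attributes. `BLOCK`, `drop_blocks`, and `find_blocks` are hypothetical names, not part of the original answers.

```python
import re

# One <layoutItems>...</layoutItems> block, matched non-greedily, together
# with its leading indentation and the newline that ends it.
BLOCK = re.compile(r"[ \t]*<layoutItems>.*?</layoutItems>[ \t]*\n?", re.DOTALL)


def drop_blocks(text, marker="lastViewedAccount"):
    """Delete every layoutItems block whose text contains marker."""
    return BLOCK.sub(lambda m: "" if marker in m.group(0) else m.group(0), text)


def find_blocks(text, marker="lastViewedAccount"):
    """Return the layoutItems blocks that contain marker (what the second sed prints)."""
    return [b for b in BLOCK.findall(text) if marker in b]


if __name__ == "__main__":
    snippet = (
        "<layoutItems>\n"
        "    <behavior>Edit</behavior>\n"
        "    <field>ID</field>\n"
        "</layoutItems>\n"
        "<layoutItems>\n"
        "    <page>lastViewedAccount</page>\n"
        "    <width>100%</width>\n"
        "</layoutItems>\n"
    )
    print(drop_blocks(snippet), end="")
```

`drop_blocks` corresponds to the deleting sed command and `find_blocks` to the printing one.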
