
Working with the sed Linux command

In my shell script I found a line that handles telephone numbers using the sed command.

sed "s~<Telephone type[ ]*=[ ]*\"fax\"[ ]*><Number>none[ ]*</Number></Telephone>~~g" input.xml > output.xml

I don't understand what this regular expression actually does.

<Telephone type[ ]*=[ ]*\"fax\"[ ]*><Number>none[ ]*</Number></Telephone>

I am reverse engineering the script to get it working.

My XML is structured like this:

<ContactMethod>
    <InternetEmailAddress>donald.francis@lexisnexis.com</InternetEmailAddress>
    <Telephone type = "work">
        <Number>215-639-9000 x3281</Number>
    </Telephone>
    <Telephone type = "home">
        <Number>484-231-1141</Number>
    </Telephone>
    <Telephone type = "fax">
        <Number>N/A</Number>
    </Telephone>
    <Telephone type = "work">
        <Number>215-639-9000 x3281</Number>
    </Telephone>
    <Telephone type = "home">
        <Number>484-231-1141</Number>
    </Telephone>
    <Telephone type = "fax">
        <Number>none</Number>
    </Telephone>
    <Telephone type1 = "fax12234">
        <Number>484-231-1141sadsadasdasdaasd</Number>
    </Telephone>
</ContactMethod>

That regex recognises <Telephone type = "fax"> entries where the number is given as none, and deletes them.

Breakdown:

s is the sed command for "substitution".

~ is the pattern separator. You can choose any character for this. sed recognizes it because it comes right after the s.

<Telephone type This matches the literal text "<Telephone type".

[ ]* matches zero or more spaces.

= matches a literal "="

[ ]* matches zero or more spaces.

\"fax\" matches the literal text "fax", quotes included. The quotes are escaped because the whole sed script appears inside double quotes; the shell removes the escape characters ( \ ) before sed sees them.

[ ]* matches zero or more spaces.

><Number>none matches literal text.

[ ]* matches zero or more spaces.

</Number></Telephone> matches the literal text.

~~ the pattern separators end the search pattern and surround an empty replacement, so every match is replaced with nothing (deleted).

g is a flag that means the substitution is applied to every match on each line, not just the first one.
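To see the pieces above working together, here is a minimal sketch. The one-line sample is made up for illustration (it is not taken from the real input.xml), and the variable names are hypothetical. It prints the pattern as sed receives it after the shell has processed the escaped quotes, then compares the substitution with and without the g flag.

```shell
# The pattern from the question, kept in a variable so it is easy to inspect.
# Inside double quotes the shell turns \" into a plain ".
pattern="<Telephone type[ ]*=[ ]*\"fax\"[ ]*><Number>none[ ]*</Number></Telephone>"
printf '%s\n' "$pattern"   # the pattern exactly as sed receives it

# Hypothetical one-line sample with two fax/none entries around a work entry.
fax='<Telephone type = "fax"><Number>none</Number></Telephone>'
work='<Telephone type = "work"><Number>215-639-9000 x3281</Number></Telephone>'
sample="$fax$work$fax"

# With g: every match on the line is deleted.
with_g=$(printf '%s\n' "$sample" | sed "s~$pattern~~g")

# Without g: only the first match on the line is deleted.
without_g=$(printf '%s\n' "$sample" | sed "s~$pattern~~")

printf 'with g:    %s\n' "$with_g"
printf 'without g: %s\n' "$without_g"
```

With g only the work entry is left; without g the second fax entry survives.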

The only thing that confuses me is that sed processes its input one line at a time, so this pattern won't match anything that has line breaks in it. I presume your input.xml isn't actually formatted like your example data?
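The line-break issue can be checked with a small sketch. The sample below is a made-up, shortened version of the XML in the question. The second command is only one possible variant, and it assumes GNU sed: the -z option treats the whole input as a single record, and [[:space:]]* is used so that newlines and indentation between the tags also match. Regex edits on XML remain fragile, so treat this as an illustration rather than a robust fix.

```shell
# Hypothetical multi-line sample shaped like the XML in the question.
xml='<ContactMethod>
    <Telephone type = "fax">
        <Number>N/A</Number>
    </Telephone>
    <Telephone type = "fax">
        <Number>none</Number>
    </Telephone>
</ContactMethod>'

# Original command: each line is matched on its own, so nothing changes.
unchanged=$(printf '%s\n' "$xml" |
  sed "s~<Telephone type[ ]*=[ ]*\"fax\"[ ]*><Number>none[ ]*</Number></Telephone>~~g")

# Variant (assumes GNU sed): -z reads the whole input as one record, and
# [[:space:]]* also matches the newlines and indentation between tags.
# The leading $ws removes the line break and indent before the element.
ws='[[:space:]]*'
multiline=$(printf '%s\n' "$xml" |
  sed -z "s~$ws<Telephone type$ws=$ws\"fax\"$ws>$ws<Number>none$ws</Number>$ws</Telephone>~~g")

printf '%s\n' "$multiline"
```

The original command leaves the sample untouched, while the variant drops only the fax entry whose number is none; the N/A entry stays.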
