
Shell script to extract strings between XML tags

Can you please help me extract the strings between XML tags? XML input:

    <Name ns1:translate="yes">Overview</Name>
    <Title ns1:translate="yes">This is a book</Title>
    <Description ns1:translate="yes"/>
    <TextValue ns1:translate="yes">End</TextValue>

Expected output:

    Overview = Overview
    This is a book = This is a book
       =
    End = End

If you just want to remove the tags, you can do it this way:

$ sed 's/<[^>]*>//g'

If you want to repeat the text found between the tags, you need something like:

$ sed 's/.*>\([^<]*\)<.*/\1 = \1/g'
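Note that the substitution above only fires on lines with an opening tag, some text, and a closing tag. A self-closing element such as `<Description ns1:translate="yes"/>` does not match, so sed prints that line unchanged. A minimal sketch that handles this case by adding a second expression is shown below. The function name `extract_pairs` is a hypothetical helper, not part of the original answer, and the sketch assumes one element per line, as in the sample input.

```shell
# Hypothetical helper (not from the original answer): reads XML-like lines on
# stdin, one element per line, and prints "text = text" for each element.
extract_pairs() {
  # 1st expression: turn a self-closing tag (<Tag .../>) into an empty pair.
  # 2nd expression: the substitution from the answer, for <Tag>text</Tag>.
  sed -e 's/^[[:space:]]*<[^>]*\/>[[:space:]]*$/ = /' \
      -e 's/.*>\([^<]*\)<.*/\1 = \1/'
}

# Feed the sample input from the question through the helper.
extract_pairs <<'EOF'
    <Name ns1:translate="yes">Overview</Name>
    <Title ns1:translate="yes">This is a book</Title>
    <Description ns1:translate="yes"/>
    <TextValue ns1:translate="yes">End</TextValue>
EOF
```

This prints one `text = text` line per element and a bare ` = ` for the empty `Description` element. It is still a line-oriented regex, so elements that span several lines or contain nested tags would need a real XML parser.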

One suggestion: please use Perl for reading and extracting XML. Perl has many XML parsing modules (both SAX and DOM).

Python is also a perfect choice for XML parsing.
