Shell script to extract recursive XML tags
I have an XML file of the form:
...
<element1>
<element2>
<group1>
<tag1>value</tag1>
<tag2>value</tag2>
</group1>
<group1>
<tag1>value</tag1>
<tag2>value</tag2>
</group1>
<element2>
...
I used
sed -n '/<group1>/,/<\/group1>/p' filename
to extract all the content of the group1 tags, including all children. This is exactly what I want.
<group1>
<tag1>value</tag1>
<tag2>value</tag2>
</group1>
<group1>
<tag1>value</tag1>
<tag2>value</tag2>
</group1>
However, if the input XML is of the form
...
<element1>
<element2>
<group2>
<group2>value</group2>
<otherTag>value</otherTag>
</group2>
<element3>
<group2>
<group2>value</group2>
<otherTag>value</otherTag>
</group2>
...
and I try to extract the following content
<group2>
<group2>value</group2>
<otherTag>value</otherTag>
</group2>
<group2>
<group2>value</group2>
<otherTag>value</otherTag>
</group2>
the sed command above just returns:
<group2>
<group2>value</group2>
It sees the stop pattern </group2> and extracts nothing more. I'm quite confused here. Why doesn't it continue extracting the next <group2>, as in the <group1> case? Is there any way to make it work with sed? And are there any other alternatives?
You can change the sed command like this:
sed -n '/<group2>/,/^<\/group2>/p' filename | grep -v 'element3'
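To see why anchoring the closing tag matters, here is a minimal sketch that runs an unanchored and an anchored sed range over the second sample from the question. The variable names are illustrative, and the sample is assumed to be flush left exactly as shown in the question; the exact output the asker saw may differ depending on the sed implementation and the escaping used.

```shell
# Second sample from the question, flush left as shown there.
xml='<element1>
<element2>
<group2>
<group2>value</group2>
<otherTag>value</otherTag>
</group2>
<element3>
<group2>
<group2>value</group2>
<otherTag>value</otherTag>
</group2>'

# Unanchored end address: the nested line "<group2>value</group2>" also
# contains "</group2>", so each range closes right after it opens.
unanchored=$(printf '%s\n' "$xml" | sed -n '/<group2>/,/<\/group2>/p')

# Anchored addresses: only a line that is exactly the tag opens or closes a range.
anchored=$(printf '%s\n' "$xml" | sed -n '/^<group2>$/,/^<\/group2>$/p')

printf '%s\n' '--- unanchored ---' "$unanchored" '--- anchored ---' "$anchored"
```

A sed range ends at the first line after its start that matches the end address, so the unanchored version never reaches the otherTag lines. The anchored version only works while the outer tags sit alone on their lines with no indentation.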
It is far better to use XPath with a command-line XPath interpreter, such as xpath, xmlstarlet, my xidel, or xmllint.
All group1 elements on the third level:
/element1/*/group1
All group2 elements that do not contain a nested group2:
//group2[not(group2)]
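As a rough sketch of how such an expression might be run from the shell, the following uses xmllint with its --xpath option. Note that the predicate not(group2) selects the inner group2 elements; the inverted predicate [group2] used here selects the outer ones that the question asks for. It assumes xmllint (from libxml2) is installed, and the closing tags for element1, element2, and element3 are my own additions, since the question elides them and an XPath tool needs well-formed input.

```shell
# Well-formed variant of the question's second sample.
# The closing tags of element1..element3 are assumed, not taken from the question.
xml='<element1>
<element2>
<group2>
<group2>value</group2>
<otherTag>value</otherTag>
</group2>
</element2>
<element3>
<group2>
<group2>value</group2>
<otherTag>value</otherTag>
</group2>
</element3>
</element1>'

if command -v xmllint >/dev/null 2>&1; then
  # Select every group2 element that has a group2 child (the outer ones).
  # The trailing "-" tells xmllint to read the document from standard input.
  outer=$(printf '%s\n' "$xml" | xmllint --xpath '//group2[group2]' -)
  printf '%s\n' "$outer"
else
  echo "xmllint not found; this sketch needs the libxml2 command-line tools" >&2
fi
```

Unlike the line-based sed and awk approaches, this does not depend on indentation or on how the tags are split across lines.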
Something like this?
awk '/^<group2>/,/^<\/group2>/' file
<group2>
<group2>value</group2>
<otherTag>value</otherTag>
</group2>
<group2>
<group2>value</group2>
<otherTag>value</otherTag>
</group2>
This works when the outer tags start at the beginning of a line and the nested ones do not, for example because they are indented. If a nested </group2> also starts at the left margin, it will not work.
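If the tags cannot be told apart by their position on the line, one alternative is to count nesting depth instead of relying on anchors. The following is only a sketch: the function name extract_group2 is hypothetical, and it assumes the tags carry no attributes and are never self-closing.

```shell
# Print every line that belongs to a group2 element, tracking nesting depth
# so that a nested </group2> does not end the extraction early.
extract_group2() {
  awk '
    {
      line = $0
      # gsub returns the number of matches; replacing with "&" leaves the text unchanged.
      opens  = gsub(/<group2>/, "&", line)
      closes = gsub(/<\/group2>/, "&", line)
      # Print while inside a group2, or when this line opens one.
      if (depth > 0 || opens > 0) print
      depth += opens - closes
    }
  '
}

# Second sample from the question, flush left as shown there.
xml='<element1>
<element2>
<group2>
<group2>value</group2>
<otherTag>value</otherTag>
</group2>
<element3>
<group2>
<group2>value</group2>
<otherTag>value</otherTag>
</group2>'

out=$(printf '%s\n' "$xml" | extract_group2)
printf '%s\n' "$out"
```

Because the depth only returns to zero at the outermost closing tag, this keeps working when a nested </group2> sits on its own line at the left margin. It is still text matching rather than XML parsing, so an XPath tool remains the more robust choice.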