
Using sed or awk to extract text from an XML file

<?xml version="1.0" encoding="utf-8"?>
<resources>
<data id="V701">
    <string name="MSG_V701_ID">V701</string>
    <string name="MSG_V701_TITLE">abc</string>
    <string name="MSG_V701_BODY">This title is currently unable</string>
</data>
<data id="V702">
    <string name="MSG_V702_ID">V702</string>
    <string name="MSG_V702_TITLE">Play</string>
    <string name="MSG_V702_BODY">This title is currently unable to play</string>
</data>
</resources>

Given this XML, I want to find the values of all tags related to a particular id.

For example, for id="V701":

V701
abc
This title is currently unable

For id="V702":

V702
Play
This title is currently unable to play

I want to use this in a bash script, so please print the output one value per line.

You are generally better off parsing an XML file with a tool that understands XML than trying to parse it with sed or awk. For example, the xmllint command has an --xpath option that you can use to extract information from an XML file:

$ ID=V702
$ result=$(xmllint --xpath "//data[@id='$ID']" data.xml)
$ echo "$result"
<data id="V702">
    <string name="MSG_V702_ID">V702</string>
    <string name="MSG_V702_TITLE">Play</string>
    <string name="MSG_V702_BODY">This title is currently unable to play</string>
</data>

Or, to get only the text nodes:

$ result=$(xmllint --xpath "//data[@id='$ID']//text()" data.xml)
$ echo "$result"


V702


Play


This title is currently unable to play
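
The blank lines in that output come from the whitespace-only text nodes between the elements. Since the question asks for one value per line, here is a minimal sketch of a filter that tidies the output. The function name strip_blank_lines is my own invention, not part of xmllint, and the printf line only imitates the shape of the raw output so the snippet runs without xmllint or data.xml:

```shell
# Hypothetical helper (not part of xmllint): trim leading and trailing
# whitespace from every line, then drop the lines that end up empty.
strip_blank_lines() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Intended use (assumes xmllint is installed and data.xml is the file above):
#   xmllint --xpath "//data[@id='$ID']//text()" data.xml | strip_blank_lines

# Stand-alone demonstration on text shaped like the raw output above.
printf '\n    V702\n    \n    Play\n    \n    This title is currently unable to play\n' | strip_blank_lines
# prints:
# V702
# Play
# This title is currently unable to play
```

Note that this also trims whitespace inside a value's first and last line, which is fine for the sample data but may not be what you want for strings where leading spaces matter.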

If you want individual strings, you can do something like this:

title=$(xmllint --xpath "//data[@id='$ID']/string[@name='MSG_${ID}_TITLE']/text()" data.xml)
body=$(xmllint --xpath "//data[@id='$ID']/string[@name='MSG_${ID}_BODY']/text()" data.xml)
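
If you need all three values for an id, the same XPath can be wrapped in small functions. This is only a sketch: the names get_msg and print_msgs are made up for illustration, and it assumes every id has strings named MSG_<id>_ID, MSG_<id>_TITLE and MSG_<id>_BODY, as in the sample file.

```shell
# Hypothetical helper: print the text of one <string> element.
# $1 = XML file, $2 = data id (e.g. V702), $3 = name suffix (ID, TITLE or BODY)
get_msg() {
    xmllint --xpath "//data[@id='${2}']/string[@name='MSG_${2}_${3}']/text()" "$1"
}

# Hypothetical helper: print the ID, TITLE and BODY values, one per line.
# $1 = XML file, $2 = data id
print_msgs() {
    for field in ID TITLE BODY; do
        # Command substitution strips any trailing newline xmllint emits,
        # so printf adds exactly one per value.
        printf '%s\n' "$(get_msg "$1" "$2" "$field")"
    done
}

# Example call (assumes data.xml is the file from the question):
#   print_msgs data.xml V702
```

Running xmllint once per field is slower than a single //text() query, but each value arrives on its own line with no stray whitespace to clean up.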
