I have a large number of XML files to parse with xmllint. I just need to pull out the content of one or two nodes and dump them into some new files.
I have no control over their format before they get to me.
I am trying to find a graceful way to handle characters like "&" (ampersand). They are not always escaped in the source XML files.
Is there some way to handle this in a single xmllint command, or do I need to prepare the XML files first?
I don't know about xmllint, but I suggest using another tool to do it. A script like html2text may work too.
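If preparing the files first turns out to be necessary, one possible approach is a small sed filter that escapes bare ampersands before the text reaches xmllint. This is only a sketch: the function name escape_bare_ampersands is made up for illustration, and it assumes the files contain no CDATA sections or comments in which a literal "&" is legal and must stay untouched.

```shell
# Hypothetical helper (not part of xmllint): escape "&" characters that do not
# start an entity or character reference, leaving &amp;, &lt;, &#233;, &#xE9; alone.
escape_bare_ampersands() {
    # Pass 1: escape every ampersand.
    # Pass 2: undo the escaping where the ampersand was already the start
    #         of a well-formed reference (named, decimal, or hexadecimal).
    sed -E 's/&/\&amp;/g; s/&amp;(#[0-9]+;|#[xX][0-9A-Fa-f]+;|[A-Za-z][A-Za-z0-9]*;)/\&\1/g'
}
```

It could then be placed in front of the parser, for example: escape_bare_ampersands < input.xml | xmllint --xpath '/xpath/to/node/text()' - (input.xml and the XPath are placeholders). Text such as "AT&T;" would still be mistaken for an entity reference, so the result is worth spot-checking.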
In my case I solved it with:
echo -e "$(echo "$responseXml" | xmllint --xpath '/xpath/to/extract/message/text()' - 2>/dev/null | sed 's/\&#\(x..\);/\\\1/g')" | iconv --from=iso88591
The iconv step may be unnecessary if your XML is not in ISO-8859-1, or if you don't want to convert it to UTF-8.
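To make the sed stage of that pipeline easier to follow, here it is in isolation. This is a slightly stricter variant written for illustration (it only matches two hex digits rather than any two characters), and the function name hex_refs_to_escapes is hypothetical.

```shell
# Illustrative variant of the sed stage: rewrite two-digit hexadecimal
# character references such as &#xE9; into backslash escapes such as \xE9.
# A later "echo -e" can then turn each escape into the corresponding byte.
hex_refs_to_escapes() {
    sed 's/&#\(x[0-9A-Fa-f][0-9A-Fa-f]\);/\\\1/g'
}
```

Because each escape becomes a single byte, this only makes sense for single-byte encodings such as ISO-8859-1, which is why the iconv step follows in the original command.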