
What is the Point of using XML CDATA?

I was reading up on XML files and came across this <![CDATA[]]> .

In what sort of situation would this be useful?

As I understand it:

All text in an XML document will be parsed by the parser.

But text inside a CDATA section will be ignored by the parser.

That is from here. However, it doesn't go into any detail about when CDATA is useful or how it is relevant to XML files.

This SO question asks what it means, but again, as far as I can see, there is not much detail on what it does or when I should use it, which is why I am asking this question now.

(i'm not exactly a pro, nor an adept - ok, more of a complete idiot actually - even reading the docs didn't actually help, so any comprehensive answers would be great :P)

You can use it to avoid having to escape characters that are special in XML.

Imagine you have an element like

<data>...</data>

And want to place the following text in the data element :

 a < b

Like so:

<data>a < b</data> 

That doesn't work, since XML recognizes the < as a potential start of a new tag.

You can escape the < character:

<data>a &lt; b</data>

Or you can tell the XML parser to not parse your data by placing it in a CDATA section:

<data><![CDATA[a < b]]></data>

(Then again, with CDATA, your text cannot contain the sequence ]]> , since that is what ends the section.)
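The three variants above can be checked with a quick sketch. It uses Python's standard xml.etree.ElementTree parser purely as an illustration; the helper name text_of is made up for this example.

```python
import xml.etree.ElementTree as ET

def text_of(xml_source):
    """Return the root element's text, or None if the XML is not well-formed."""
    try:
        return ET.fromstring(xml_source).text
    except ET.ParseError:
        return None

# Unescaped "<" inside text: the parser rejects the document.
raw = text_of("<data>a < b</data>")

# Escaped with &lt; and wrapped in CDATA: both are accepted.
escaped = text_of("<data>a &lt; b</data>")
cdata = text_of("<data><![CDATA[a < b]]></data>")

print(raw)      # None, the document did not parse
print(escaped)  # a < b
print(cdata)    # a < b
```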

See also this question

<![CDATA[...]]> is a quick and dirty way to quote text in XML.

In XML, < , > and & have a special meaning. If you want to include them in XML text, you have to escape them as &lt; , &gt; and &amp; . But if, for example, you include code in XML, you might use these characters a lot and not want to write, for instance, c>='0' && c<='9' as c&gt;='0' &amp;&amp; c&lt;='9' . For these situations a more radical way to escape text has been introduced: whatever is between <![CDATA[ and ]]> is interpreted verbatim. Only the sequence ]]> marks the end of the verbatim text.

The use of CDATA is invisible to the reader of XML. <this><![CDATA[a test]]></this> represents the same element as <this>a test</this> .
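As a rough illustration of that equivalence, here is a sketch using Python's xml.etree.ElementTree (an assumption of this example, not something the answer depends on). The CDATA form and the escaped form of the code snippet above yield the same text, and when the element is written back out, this particular library emits plain escapes rather than a CDATA section.

```python
import xml.etree.ElementTree as ET

# The same content written two ways.
with_cdata = ET.fromstring("<this><![CDATA[c>='0' && c<='9']]></this>")
with_escapes = ET.fromstring("<this>c&gt;='0' &amp;&amp; c&lt;='9'</this>")

# After parsing, the application cannot tell which form was used.
same_text = with_cdata.text == with_escapes.text
print(same_text)        # True
print(with_cdata.text)  # c>='0' && c<='9'

# Serializing the parsed element escapes the text again; the CDATA wrapper is gone.
serialized = ET.tostring(with_cdata, encoding="unicode")
print(serialized)
```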

There is one big limitation. In a CDATA section you can only represent the characters available in your encoding (the encoding="..." in your <?xml ...?> header), because numeric character references such as &#8364; are not expanded there. If you are using an encoding like ISO-8859-1, you cannot represent characters like € or œ.
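A small sketch of why this matters, again assuming Python's xml.etree.ElementTree: a numeric character reference is expanded in ordinary text but stays a literal string inside CDATA, so CDATA offers no way around a narrow encoding. The element name price is only an example.

```python
import xml.etree.ElementTree as ET

# In ordinary text, &#8364; is expanded to the euro sign.
in_text = ET.fromstring("<price>&#8364;5</price>").text

# Inside CDATA the same characters are taken verbatim.
in_cdata = ET.fromstring("<price><![CDATA[&#8364;5]]></price>").text

print(in_text == "\u20ac5")  # True
print(in_cdata)              # &#8364;5

# Writing escaped text in ISO-8859-1 still works, because the serializer
# can fall back to a numeric character reference for the euro sign.
element = ET.Element("price")
element.text = "\u20ac5"
latin1 = ET.tostring(element, encoding="iso-8859-1")
print(latin1)
```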

So if you write XML by hand and it contains code, it is a good idea to put the whole code in a CDATA section to prevent problems. That way you can forget about escaping the characters that are meaningful to XML.

But it is not a good idea to quote text programmatically with CDATA just because it is easier. You might end up losing some special characters, and some day you might have the sequence ]]> in your data. It is better to escape using &lt; , &gt; , &amp; and numeric character references.
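To make the ]]> risk concrete, here is a minimal sketch in Python. The payload string is invented for this example; escape comes from the standard xml.sax.saxutils module, and ElementTree is used only to check the result.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical data that happens to contain the sequence ]]>
payload = "if (a[b[0]]> 1 && x < y) {}"

# Naive approach: wrap the data in CDATA. The section ends early at the
# first ]]> and the rest of the data is parsed as markup.
naive_doc = "<data><![CDATA[" + payload + "]]></data>"
try:
    ET.fromstring(naive_doc)
    naive_ok = True
except ET.ParseError:
    naive_ok = False

# Escaping approach: replace &, < and > with entity references.
escaped_doc = "<data>" + escape(payload) + "</data>"
round_trip = ET.fromstring(escaped_doc).text

print(naive_ok)               # False
print(round_trip == payload)  # True
```

In practice an XML library that builds the document for you (for example by setting an element's text) does this escaping on its own, which is usually the simpler choice.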
