I'm developing a class for a content management system. The input content is supplied in XHTML format and can contain valid escaped characters (character references) such as &#163; for £.
See the example below.
<html xml:lang="en" lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head xmlns="">
<meta name="Attr_DocumentTitle" content="Hello World Books" />
</head>
<body>
<div>British Pound &#163;</div>
<div>Registered sign &#174;</div>
<div>Copyright sign &#169; </div>
</body>
</html>
My objective is to write a method that loads this into a .NET XML object, does some processing, and saves the result to a database. I want to keep the escaped characters as they are. Here is my method:
public static XmlDocument LoadXmlFromString(string xhtmlContent)
{
    byte[] xhtmlByte = Encoding.ASCII.GetBytes(xhtmlContent);
    MemoryStream mStream = new MemoryStream(xhtmlByte);
    XmlReaderSettings settings = new XmlReaderSettings();
    //Upon loading XML, prevent DTD download, which would be blocked by our
    //firewall and generate "503 Server Unavailable" error.
    settings.XmlResolver = null;
    settings.ProhibitDtd = false;
    XmlReader reader = XmlReader.Create(mStream, settings);
    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.LoadXml(xhtmlContent);
    return xmlDoc; // xmlDoc.InnerXml now contains the literal characters £ ® ©
    // in place of the escaped forms &#163; &#174; and &#169;
}
However, this method converts the escaped characters to their character equivalents. How can I avoid this and keep the escaped characters?
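For reference, here is a minimal sketch of one possible direction, not a confirmed solution. It assumes the input uses numeric character references such as &#163;, and that it is acceptable to re-escape on output rather than preserve the original spelling. An XML parser resolves character references while loading, so the DOM does not record how each character was written. The class name `XhtmlEscapeSketch` and the helper `ToAsciiXml` are hypothetical names introduced only for illustration. The idea is to serialize through an `XmlWriter` whose target encoding is ASCII, so that characters the encoding cannot represent are written as character references again.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class XhtmlEscapeSketch
{
    // Hypothetical helper: serializes the document through an ASCII-encoded
    // XmlWriter so that non-ASCII characters are emitted as numeric
    // character references instead of literal characters.
    public static string ToAsciiXml(XmlDocument doc)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = Encoding.ASCII;   // takes effect because we write to a Stream
        settings.OmitXmlDeclaration = true;

        using (MemoryStream stream = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                doc.Save(writer);
            }
            // The writer has been flushed and disposed; the bytes are pure ASCII.
            return Encoding.ASCII.GetString(stream.ToArray());
        }
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<div>British Pound &#163;</div>");

        Console.WriteLine(doc.InnerXml);     // parsed form: literal character
        Console.WriteLine(ToAsciiXml(doc));  // re-escaped on serialization
    }
}
```

Note that the writer chooses the form of the reference (decimal or hexadecimal), so the output may not match the input byte for byte. The encoding setting only applies when the writer targets a stream; a writer created over a `StringBuilder` or `TextWriter` would not re-escape in this way.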