
How to handle the HTML entity nbsp in XSLT without changing the input file

I am trying to convert an HTML file into an XML file using XSLT (using Oxygen 9.0 for the transformation).

When I configure and run the XSLT transformation on the HTML file, Oxygen reports:

The entity 'nbsp' was referenced, but not declared.

My input html file is:

<div><span>&nbsp;some text</span></div>

Note: I want to know how to handle that entity using only XSLT; I don't want to make any changes to the input file.

As far as I know, you're going to need to make changes to the input file.

You can either change your &nbsp; to &#160; or declare a custom doctype that does the conversion for you:

<!DOCTYPE doctypeName [
   <!ENTITY nbsp "&#160;">
]> 

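This is because &nbsp; isn't one of XML's predefined entities. XML predefines only &lt;, &gt;, &amp;, &apos; and &quot;, so the parser rejects the file before the XSLT processor ever sees it.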
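For illustration, here is roughly what the question's input would look like with that declaration added. This is only a sketch: the doctype name div is an assumption, chosen to match the root element of the sample. A non-validating parser only needs the entity declaration itself.

```xml
<!-- Internal DTD subset that declares nbsp as the no-break space (U+00A0). -->
<!DOCTYPE div [
   <!ENTITY nbsp "&#160;">
]>
<div><span>&nbsp;some text</span></div>
```

With the declaration in place the parser can expand &nbsp;, and the stylesheet sees an ordinary U+00A0 character in the text node.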

You could use an external XML entity: create a separate XML file that declares the nbsp entity and pulls in the (broken) XML fragment, so the input file itself stays untouched.

For example, assume that your fragment is saved as a file called "invalid.xml":

<div><span>&nbsp;some text</span></div>

Create an XML file like this:

<!DOCTYPE wrapper [
   <!ENTITY nbsp "&#160;">
   <!ENTITY invalid-xml-document SYSTEM "./invalid.xml">
]><wrapper>
&invalid-xml-document;</wrapper>

When that file gets parsed, the parser will have the nbsp entity declared, will include the content of "invalid.xml", and will resolve the nbsp entity properly. The result is this:

<wrapper>
  <div>
    <span> some text</span> 
  </div>
</wrapper>

Then, just adjust your XSLT to accommodate the new document element (in this example, the element <wrapper>) and run the transformation against the wrapper file instead of "invalid.xml".
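A minimal sketch of such an adjustment, assuming XSLT 1.0 and assuming you simply want the original content copied through without the extra <wrapper> element. The identity template is a placeholder; your real templates for the HTML-to-XML conversion would go in its place.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- Skip the synthetic <wrapper> element and process only its child
       elements, so the output starts at the original root (<div>). -->
  <xsl:template match="/wrapper">
    <xsl:apply-templates select="*"/>
  </xsl:template>

  <!-- Placeholder identity template: copies everything else unchanged.
       Replace or override it with the actual conversion rules. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Matching on /wrapper keeps the workaround in one template, so the rest of the stylesheet can be written as if "invalid.xml" had been parsed directly.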


 