
How to add non-escaped ampersands to HTML with Nokogiri::XML::Builder

I would like to add things like bullet points ("•") to HTML using the XML Builder in Nokogiri, but everything is being escaped. How do I prevent it from being escaped?

I would like the result to be:

<span>&#8226;</span> 

rather than:

<span>&amp;#8226;</span> 

I'm just doing this:

xml.span { 
  xml.text "&#8226;\ " 
}

What am I missing?

If you define

  class Nokogiri::XML::Builder
    def entity(code)
      doc = Nokogiri::XML("<?xml version='1.0'?><root>&##{code};</root>")
      insert(doc.root.children.first)
    end
  end

then this

  builder = Nokogiri::XML::Builder.new do |xml|
    xml.span {
      xml.text "I can has "
      xml.entity 8226
      xml.text " entity?"
    }
  end
  puts builder.to_xml

yields

<?xml version="1.0"?>
<span>I can has &#x2022; entity?</span>

PS: this is only a workaround. For a clean solution, please refer to the libxml2 documentation (Nokogiri is built on libxml2). However, even these folks admit that handling entities can be quite, err, cumbersome sometimes.

When you're setting the text of an element, you really are setting text, not HTML source. < and & don't have any special meaning in plain text.

So just type a bullet: '•'. Of course, your source code and your XML file have to use the same encoding for that to come out right. If your XML file is UTF-8 but your source code isn't, you'd probably have to write "\xe2\x80\xa2" (in double quotes, so that Ruby interprets the escapes), which is the UTF-8 byte sequence for the bullet character as a string literal.

(In general, non-ASCII characters in Ruby 1.8 are tricky. The byte-based interfaces don't mesh too well with XML's world of all-text-is-Unicode.)
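For reference, here is a small sketch of several ways to spell the bullet character in plain Ruby, assuming Ruby 1.9 or later (where strings carry an encoding) and a UTF-8 source file. The variable names are hypothetical and only serve the illustration.

```ruby
# Four spellings of the same character, U+2022 BULLET.
literal    = "•"                  # typed directly; relies on a UTF-8 source file
escaped    = "\u2022"             # Unicode escape; needs double quotes
from_code  = [8226].pack("U")     # from the decimal code point used in &#8226;
from_bytes = [0xE2, 0x80, 0xA2].pack("C*").force_encoding("UTF-8") # raw UTF-8 bytes

# All four should compare equal once they are tagged as UTF-8.
all_equal = [escaped, from_code, from_bytes].all? { |s| s == literal }
puts all_equal

# The code point and the byte sequence mentioned in the answer.
puts literal.ord
puts literal.bytes.map { |b| format("%02x", b) }.join(" ")
```

Any of these strings can be handed to xml.text; the escape and pack forms have the advantage of not depending on how the source file itself is encoded.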

